elixir-nx / scholar

Traditional machine learning on top of Nx
Apache License 2.0

fit_partial API #260

Status: Closed (krstopro closed 5 months ago)

krstopro commented 5 months ago

For the past few days I have been trying to implement incremental PCA (#246, Task 1). One of the functions the module needs is fit_partial, which takes a model and a dataset and updates the model parameters. This is useful when the dataset does not fit in memory and the model must be updated batch by batch. However, fit_partial assumes that the model already exists and can be passed as an argument, so there has to be a way to create an initial model before feeding it the very first batch.

What would be the cleanest way to do this? I was thinking of adding a function new/1 to the module, but that slightly changes the fit/predict logic. Another way would be to require using fit on the first batch and then using fit_partial on the rest.
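
To make the two options concrete, here is a minimal sketch. It is not Scholar's API: RunningMean is a hypothetical stand-in for an incremental model, it tracks a running mean instead of PCA components, and plain lists stand in for Nx tensors. Only the names new/1, fit, and fit_partial come from the question above.

```elixir
defmodule RunningMean do
  # Hypothetical stand-in for an incremental model (NOT Scholar's API).
  # It tracks the mean of all numbers seen so far; lists stand in for tensors.
  defstruct count: 0, mean: 0.0

  # Option 1: an explicit constructor that returns an "empty" model.
  def new(_opts \\ []), do: %__MODULE__{}

  # Option 2: fit/1 creates the model from the first batch.
  def fit(batch), do: fit_partial(%__MODULE__{}, batch)

  # Updates an existing model with one more non-empty batch.
  def fit_partial(%__MODULE__{count: n, mean: mean}, batch)
      when is_list(batch) and batch != [] do
    m = length(batch)
    batch_mean = Enum.sum(batch) / m
    total = n + m
    %__MODULE__{count: total, mean: mean + (batch_mean - mean) * m / total}
  end
end

batches = [[1.0, 2.0], [3.0, 4.0], [5.0]]

# Option 1: new/1 first, then fit_partial on every batch.
model_a = Enum.reduce(batches, RunningMean.new(), &RunningMean.fit_partial(&2, &1))

# Option 2: fit on the first batch, then fit_partial on the rest.
[first | rest] = batches
model_b = Enum.reduce(rest, RunningMean.fit(first), &RunningMean.fit_partial(&2, &1))
```

In this sketch both options end with the same model; the difference is only whether the caller or fit/1 creates the initial state.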

Any thoughts on this?

josevalim commented 5 months ago

What about fit_stream that receives a stream and we implement the loop ourselves?
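
A rough sketch of what such a fit_stream could look like, under stated assumptions: RunningMeanStream is a hypothetical module, not Scholar's API; it computes a running mean rather than incremental PCA; and batches are plain lists rather than Nx tensors. The point is only that the initial state and the loop live inside the function, so the caller never has to create a model.

```elixir
defmodule RunningMeanStream do
  # Hypothetical sketch (NOT Scholar's API): a running mean stands in
  # for incremental PCA, and lists of numbers stand in for tensors.
  defstruct count: 0, mean: 0.0

  # Accepts any enumerable of batches, including a lazy Stream.
  # The initial model is created here, so callers never build one.
  def fit_stream(batches) do
    Enum.reduce(batches, %__MODULE__{}, &fit_batch/2)
  end

  # Folds one non-empty batch into the accumulated model.
  defp fit_batch(batch, %__MODULE__{count: n, mean: mean})
       when is_list(batch) and batch != [] do
    m = length(batch)
    batch_mean = Enum.sum(batch) / m
    total = n + m
    %__MODULE__{count: total, mean: mean + (batch_mean - mean) * m / total}
  end
end

# A lazy stream of batches; nothing is materialized until fit_stream reduces it.
stream = Stream.chunk_every(1..10, 4)

model = RunningMeanStream.fit_stream(stream)
```

Because the stream is consumed one batch at a time, only the current batch and the model state need to be in memory at once.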

krstopro commented 5 months ago

> What about fit_stream that receives a stream and we implement the loop ourselves?

Yes, that is another possibility. Does deftransform allow for stream arguments?

krstopro commented 5 months ago

Or does it not matter, since a stream is lazily evaluated and we can just use def? :)

josevalim commented 5 months ago

You can just use def, yeah. :)

krstopro commented 5 months ago

Thanks, closing the issue. ^_^