FZJ-JSC / JuML


Intelligent (Batched) Memory Usage #60

Open cbodenst opened 8 years ago

cbodenst commented 8 years ago

Currently, JuML simply loads all of the data onto the device and does not check whether enough memory is available. It would be smarter to enable batched processing, where the algorithm pulls only a batch of the dataset that fits into the device's memory, runs its computation on it, repeats this for all batches, and finally performs a reduction.
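
A rough sketch of what such a batched loop could look like, assuming ArrayFire and a hypothetical `Dataset` interface (`n_samples` and `load_batch` are illustrative names, not existing JuML API):

```cpp
#include <arrayfire.h>
#include <algorithm>
#include <cstddef>

// Hypothetical Dataset wrapper: hands out only a slice of rows at a time,
// so a single batch is guaranteed to fit into device memory.
struct Dataset {
    af::array data_;
    std::size_t n_samples() const { return static_cast<std::size_t>(data_.dims(0)); }
    af::array load_batch(std::size_t offset, std::size_t count) const {
        std::size_t end = std::min(offset + count, n_samples()) - 1;
        return data_.rows(static_cast<int>(offset), static_cast<int>(end));
    }
};

// Process the dataset batch by batch and fold the per-batch partial
// results into a final reduction (a plain sum is used as a stand-in).
double batched_sum(const Dataset& data, std::size_t batch_size) {
    double total = 0.0;
    for (std::size_t offset = 0; offset < data.n_samples(); offset += batch_size) {
        af::array batch = data.load_batch(offset, batch_size);
        total += af::sum<double>(batch);   // partial result per batch
    }
    return total;                          // reduction over all batches
}
```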

pglock commented 8 years ago

So an Algorithm would implement a partial_fit method and a Dataset would provide a load_batch method, for example?

cbodenst commented 8 years ago

Yes, the Dataset should provide a load_batch method. I think a partial_fit is not possible for all algorithms (for neural networks with SGD it should work, for k-means it does not), so the algorithm itself has to decide how to deal with the batches. Maybe the user could define the batch size as an algorithm parameter.
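
A minimal sketch of that idea, reusing the hypothetical `Dataset` from above; the class and method names (`SGDClassifier`, `partial_fit`, `batch_size`) are assumptions for illustration, not the existing JuML interface:

```cpp
#include <arrayfire.h>
#include <cstddef>

// An SGD-based algorithm naturally supports incremental updates, so it can
// expose a partial_fit-style method and decide internally how to consume
// batches; the user only chooses a batch size that fits device memory.
class SGDClassifier {
public:
    explicit SGDClassifier(std::size_t batch_size) : batch_size_(batch_size) {}

    void partial_fit(const af::array& X_batch, const af::array& y_batch) {
        // ... one gradient step on the current batch ...
    }

    void fit(const Dataset& X, const Dataset& y) {
        for (std::size_t offset = 0; offset < X.n_samples(); offset += batch_size_) {
            partial_fit(X.load_batch(offset, batch_size_),
                        y.load_batch(offset, batch_size_));
        }
    }

private:
    std::size_t batch_size_;   // user-defined algorithm parameter
};
```

An algorithm like k-means, by contrast, would keep its own per-batch accumulators (e.g. partial cluster sums and counts) inside fit and merge them in a final reduction instead of offering partial_fit.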