RubixML / ML

A high-level machine learning and deep learning library for the PHP language.
https://rubixml.com
MIT License
2.03k stars, 182 forks

Softmax Classifier & partial training #307

Open ElGigi opened 1 year ago

ElGigi commented 1 year ago

Hi,

The documentation states that partial training can be used to reduce memory consumption.

I tried to train a Softmax classifier on several datasets, calling train() on the first one and partial() on the others. However, the model only knows the labels that were present in the dataset passed to train(). If new labels appear in a dataset given to partial(), they are not taken into account.

Can a Dataset object retain the set of all labels after a call to Labeled::fold()?

Regards.

andrewdalpino commented 1 year ago

Yes, the first training set defines all the possible labels for the model. If you want to fold your dataset so that each fold contains samples from every class in the master dataset, you can use the stratifiedFold() method.

$folds = $dataset->stratifiedFold(5);

https://docs.rubixml.com/2.0/datasets/labeled.html#stratification
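
For illustration, here is a minimal sketch of how the stratified folds could be fed to train() and partial(). The toy samples, labels, and fold count are made up for the example, and it assumes Rubix ML is installed through Composer. It is not a tuned training setup.

```php
<?php

require 'vendor/autoload.php';

use Rubix\ML\Classifiers\SoftmaxClassifier;
use Rubix\ML\Datasets\Labeled;

// Toy data invented for this sketch: three classes with four samples each.
$samples = [
    [1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [1.0, 1.0],
    [5.0, 5.1], [5.2, 4.9], [4.8, 5.0], [5.1, 5.2],
    [9.0, 1.0], [9.1, 0.8], [8.9, 1.2], [9.2, 1.1],
];

$labels = [
    'a', 'a', 'a', 'a',
    'b', 'b', 'b', 'b',
    'c', 'c', 'c', 'c',
];

$dataset = new Labeled($samples, $labels);

// Stratified folds keep the class proportions of the full dataset,
// so with enough samples per class every fold contains every label.
$folds = $dataset->stratifiedFold(2);

// Sanity check: list the distinct labels found in each fold.
foreach ($folds as $i => $fold) {
    echo "Fold $i labels: " . implode(', ', array_unique($fold->labels())) . PHP_EOL;
}

$estimator = new SoftmaxClassifier();

// The first fold goes to train() and fixes the set of labels the model knows.
$first = array_shift($folds);

$estimator->train($first);

// The remaining folds are passed to partial() one at a time.
foreach ($folds as $fold) {
    $estimator->partial($fold);
}

$predictions = $estimator->predict($dataset);
```

Note that a class with fewer samples than the number of folds cannot appear in every fold, so the fold count should not exceed the size of the smallest class.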