Waikato / moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
http://moa.cms.waikato.ac.nz/
GNU General Public License v3.0

Adding support for CAND #258

Closed: nuwangunasekara closed this 1 year ago

nuwangunasekara commented 2 years ago

This is a preliminary PR for initial review.

  1. Commit 17b7b390ba0665c75cab9a6bc152375695394874 adds MLP support with DJL
  2. Commit a120daf10673e5787027ca78aee64002652ee3fa adds CAND support

    TODO

    • [x] reference to the paper (see AdaptiveRandomForest.java or StreamingRandomPatches.java for an example)
    • [x] add a test class (see StreamingRandomPatchesTest.java for an example)

      T E S T S

      Running moa.integration.SimpleClusterTest
      Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.17 sec - in moa.integration.SimpleClusterTest
      Running moa.streams.filters.SelectAttributesFilterTest
      Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.038 sec - in moa.streams.filters.SelectAttributesFilterTest
      Running moa.classifiers.trees.EFDTTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.279 sec - in moa.classifiers.trees.EFDTTest
      Running moa.classifiers.trees.DecisionStumpTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.419 sec - in moa.classifiers.trees.DecisionStumpTest
      Running moa.classifiers.trees.FIMTDDTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.066 sec - in moa.classifiers.trees.FIMTDDTest
      Running moa.classifiers.trees.HoeffdingAdaptiveTreeTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.967 sec - in moa.classifiers.trees.HoeffdingAdaptiveTreeTest
      Running moa.classifiers.trees.RandomHoeffdingTreeTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.404 sec - in moa.classifiers.trees.RandomHoeffdingTreeTest
      Running moa.classifiers.trees.HoeffdingOptionTreeTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.867 sec - in moa.classifiers.trees.HoeffdingOptionTreeTest
      Running moa.classifiers.trees.ORTOTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.052 sec - in moa.classifiers.trees.ORTOTest
      Running moa.classifiers.trees.ASHoeffdingTreeTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.673 sec - in moa.classifiers.trees.ASHoeffdingTreeTest
      Running moa.classifiers.trees.LimAttHoeffdingTreeTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.399 sec - in moa.classifiers.trees.LimAttHoeffdingTreeTest
      Running moa.classifiers.trees.AdaHoeffdingOptionTreeTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.787 sec - in moa.classifiers.trees.AdaHoeffdingOptionTreeTest
      Running moa.classifiers.trees.HoeffdingTreeTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.161 sec - in moa.classifiers.trees.HoeffdingTreeTest
      Running moa.classifiers.neuralNetworks.CANDTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 56.387 sec - in moa.classifiers.neuralNetworks.CANDTest
      Running moa.classifiers.neuralNetworks.MLPTest
      Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 34.205 sec - in moa.classifiers.neuralNetworks.MLPTest
      Running moa.classifiers.meta.imbalanced.OnlineAdaC2Test
      Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.343 sec <<< FAILURE! - in moa.classifiers.meta.imbalanced.OnlineAdaC2Test
      testRegression(moa.classifiers.meta.imbalanced.OnlineAdaC2Test) Time elapsed: 0.322 sec <<< ERROR!
      java.lang.NullPointerException: Cannot invoke "java.util.ArrayList.add(Object)" because "this.adwinEnsemble" is null

      The CAND and MLP tests pass, although they take longer than the existing tests. Some of the later tests fail, but I doubt that is due to the CAND and MLP tests.

Testing

  1. EvaluatePrequential -l (neuralNetworks.CAND -h -n) -s (ArffFileStream -f elecNormNew.arff)
  2. EvaluatePrequential -l (neuralNetworks.CAND -h -n) -s (ArffFileStream -f elecNormNew.arff) -f 1000 -q 1000

[Image: MOA_testing]

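
For readers unfamiliar with the task used in the commands above: prequential evaluation tests each instance on the current model first and only then trains on it. The following is a minimal, self-contained sketch of that loop, not MOA's implementation. `MajorityClassLearner` and `PrequentialSketch` are hypothetical names, the learner is a trivial stand-in for CAND, and the `sampleFrequency` parameter only mimics what the `-f 1000` option appears to control (how often intermediate results are reported).

```java
/** Hypothetical stand-in learner: always predicts the majority class seen so far. */
final class MajorityClassLearner {
    private final int[] counts;

    MajorityClassLearner(int numClasses) {
        this.counts = new int[numClasses];
    }

    /** Returns the most frequent label so far (ties resolve to the lowest index). */
    int predict() {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] > counts[best]) {
                best = i;
            }
        }
        return best;
    }

    void train(int label) {
        counts[label]++;
    }
}

/** Sketch of a test-then-train (prequential) evaluation loop. */
final class PrequentialSketch {

    /** Returns the overall accuracy over the given label stream. */
    static double evaluate(int[] labels, int numClasses, int sampleFrequency) {
        MajorityClassLearner learner = new MajorityClassLearner(numClasses);
        int correct = 0;
        for (int i = 0; i < labels.length; i++) {
            // Test first: the learner has not yet seen this instance.
            if (learner.predict() == labels[i]) {
                correct++;
            }
            // Then train on the same instance.
            learner.train(labels[i]);

            int seen = i + 1;
            if (seen % sampleFrequency == 0) {
                System.out.println(seen + " instances, accuracy so far: "
                        + (double) correct / seen);
            }
        }
        return labels.length == 0 ? 0.0 : (double) correct / labels.length;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 1, 0, 0, 0};
        System.out.println("final accuracy: " + evaluate(labels, 2, 3));
    }
}
```

Because every instance is used for testing before training, no separate hold-out set is needed, which is why this scheme suits streams such as elecNormNew.arff.
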
nuwangunasekara commented 1 year ago

Failing tests are due to moa.classifiers.meta.imbalanced tests failing (issue #260).
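
The reported `NullPointerException` ("this.adwinEnsemble" is null) is the kind of error that arises when a collection field is used before the method that initializes it has run. The actual cause in OnlineAdaC2 is not shown in this thread, so the sketch below is only a generic illustration of that failure mode. `EnsembleInitSketch` and its methods are hypothetical; only the field name is borrowed from the error message.

```java
import java.util.ArrayList;

/** Hypothetical illustration of an uninitialized-collection NullPointerException. */
final class EnsembleInitSketch {
    // Field name borrowed from the error message; deliberately not initialized here.
    private ArrayList<Double> adwinEnsemble;

    /** Plays the role of a reset method that is expected to run before any use. */
    void resetLearning() {
        adwinEnsemble = new ArrayList<>();
    }

    /** Throws NullPointerException if resetLearning() was never called. */
    void addMemberUnsafe(double weight) {
        adwinEnsemble.add(weight);
    }

    /** Defensive variant: initializes the list lazily before adding. */
    void addMemberSafe(double weight) {
        if (adwinEnsemble == null) {
            resetLearning();
        }
        adwinEnsemble.add(weight);
    }

    int size() {
        return adwinEnsemble == null ? 0 : adwinEnsemble.size();
    }

    public static void main(String[] args) {
        EnsembleInitSketch sketch = new EnsembleInitSketch();
        try {
            sketch.addMemberUnsafe(1.0);
        } catch (NullPointerException e) {
            System.out.println("caught: " + e.getMessage());
        }
        sketch.addMemberSafe(2.0);
        System.out.println("size after safe add: " + sketch.size());
    }
}
```

If the real failure follows this pattern, it would be independent of the CAND and MLP changes, which is consistent with tracking it separately in issue #260.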

nuwangunasekara commented 1 year ago

Can we have the two classifiers in classifiers/functions like in https://github.com/Waikato/wekaDeeplearning4j/tree/master/src/main/java/weka/classifiers/functions ?

Sure!

We already have a separate classifier for MLP (moa/src/main/java/moa/classifiers/neuralNetworks/MLP.java).

Were you thinking of another one for RNNs?

nuwangunasekara commented 1 year ago

Can we have the two classifiers in classifiers/functions like in https://github.com/Waikato/wekaDeeplearning4j/tree/master/src/main/java/weka/classifiers/functions ?

As per the offline conversation, I renamed the folder 'neuralNetworks' to 'deeplearning' in commit 4457873.