adaptive-machine-learning / CapyMOA

Enhanced machine learning library tailored for data streams, featuring a Python API integrated with MOA backend support. This unique combination empowers users to leverage a wide array of existing algorithms efficiently while fostering the development of new methodologies in both Python and Java.
BSD 3-Clause "New" or "Revised" License
60 stars 20 forks

CAND support? #159

Closed Kirstenml closed 1 month ago

Kirstenml commented 1 month ago

Continuously Adaptive Neural Networks (CAND) is not yet supported in CapyMOA, and my own implementation using the CapyMOA interface throws the following error:

ai.djl.engine.EngineException: No deep learning engine found.

Here is the code without type checking:

from __future__ import annotations
from capymoa.base import (
    MOAClassifier,
)
from capymoa.stream import Schema
from capymoa._utils import build_cli_str_from_mapping_and_locals
from moa.classifiers.deeplearning import CAND as _MOA_CAND

class ContinuouslyAdaptiveNeuralNetworks(MOAClassifier):
    def __init__(
            self,
            schema: Schema | None = None,
            training_method: str = 'CAND',
            larger_pool: str = 'P10',
            number_of_MLPs_to_train: int = 5,
            number_of_layers_in_each_MLP: int = 4,
            number_of_instances_to_train_all_MLPs_at_start: int = 1000,
            mini_batch_size: int = 32,
            use_one_hot_encode: bool = True,
            use_normalization: bool = True,
            back_prop_loss_threshold: float = 0.3,
            device_type: str = "CPU",
            do_not_train_each_MLP_using_a_separate_thread: bool = False,
            votes_dump_file_name: str = "",
            stats_dump_file_name: str = "",
    ):

        mapping = {
            "larger_pool": "-P",
            "number_of_MLPs_to_train": "-o",
            "number_of_layers_in_each_MLP": "-L",
            "number_of_instances_to_train_all_MLPs_at_start": "-s",
            "mini_batch_size": "-B",
            "use_one_hot_encode": "-h",
            "use_normalization": "-n",
            "back_prop_loss_threshold": "-b",
            "device_type": "-d",
            "do_not_train_each_MLP_using_a_separate_thread": "-t",
            "votes_dump_file_name": "-f",
            "stats_dump_file_name": "-F"
        }

        config_str = build_cli_str_from_mapping_and_locals(mapping, locals())
        super(ContinuouslyAdaptiveNeuralNetworks, self).__init__(
            moa_learner=_MOA_CAND,
            schema=schema,
            CLI=config_str,
        )
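For context, `build_cli_str_from_mapping_and_locals` turns the parameter mapping above into a MOA command-line option string. A simplified stand-in (not CapyMOA's actual helper, whose behavior may differ, e.g. for booleans or empty strings) might look like:

```python
def build_cli_str(mapping, values):
    """Hypothetical stand-in: combine {param_name: flag} with current
    parameter values into a MOA-style CLI option string. True booleans
    become bare flags; other non-empty values follow their flag."""
    parts = []
    for name, flag in mapping.items():
        value = values[name]
        if isinstance(value, bool):
            if value:
                parts.append(flag)
        elif value != "":
            parts.append(f"{flag} {value}")
    return " ".join(parts)

# A subset of the mapping from the class above
mapping = {"mini_batch_size": "-B", "use_normalization": "-n", "device_type": "-d"}
values = {"mini_batch_size": 32, "use_normalization": True, "device_type": "CPU"}
print(build_cli_str(mapping, values))
```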

Is there any solution for this?

hmgomes commented 1 month ago

Hi @Kirstenml, thanks for raising this problem and providing your code. I believe this problem is related to DJL. I am going to request assistance from the author of the code in MOA; we had a similar issue before. @nuwangunasekara, do you know how we could solve it?

nuwangunasekara commented 1 month ago

Hi @Kirstenml,

Thanks for sharing the issue!

Would you be able to share the code that you used to invoke the ContinuouslyAdaptiveNeuralNetworks Python class, please?

This would allow me to reproduce the issue.

Kirstenml commented 1 month ago

Thank you for your answer @nuwangunasekara. This is my code to invoke ContinuouslyAdaptiveNeuralNetworks:

from capymoa.datasets import Electricity
from capymoa.evaluation import prequential_evaluation

data_stream = Electricity()
ob_learner = ContinuouslyAdaptiveNeuralNetworks(schema=data_stream.get_schema())
results = prequential_evaluation(data_stream, ob_learner)
print("Continuously Adaptive Neural Networks: {}".format(results["cumulative"].accuracy()))

nuwangunasekara commented 1 month ago

Thanks @Kirstenml, let me try to reproduce it and get back to you :)

nuwangunasekara commented 1 month ago

Hi @Kirstenml, just updating you on the status of the issue.

It looks like the initial issue is caused by not having CUDA 10.2, which is required by the DJL 0.9 PyTorch engine (PyTorch 1.7.0).

conda install pytorch=1.7.1 torchvision  cudatoolkit=10.2 -c pytorch-lts

would take you a bit further on a Linux system. But you might still get errors :(

I am looking into a workaround for this.

Otherwise, the best option would be to create a native CapyMOA implementation of CAND using PyTorch. The PyTorchClassifier in the 03_pytorch.ipynb notebook could be a good starting point. I am happy to help you with that as well.

Hopefully I can come up with a workaround for your initial issue :)
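To illustrate the core CAND idea behind such a native implementation (this is a sketch, not the MOA code): CAND maintains a pool of candidate networks with different hyperparameters, trains all of them on each instance, and predicts with the member whose estimated loss is currently lowest. A minimal pure-Python sketch, with trivial online perceptrons standing in for the MLPs:

```python
import random

class TinyLearner:
    """Stand-in for one MLP in the pool: an online perceptron with its
    own learning rate (the hyperparameter varied across pool members)."""
    def __init__(self, n_features, lr):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr
        self.loss_estimate = 1.0  # exponentially smoothed 0/1 loss

    def predict(self, x):
        s = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if s > 0 else 0

    def train(self, x, y, decay=0.99):
        err = y - self.predict(x)
        self.loss_estimate = decay * self.loss_estimate + (1 - decay) * abs(err)
        for i, xi in enumerate(x):
            self.w[i] += self.lr * err * xi
        self.b += self.lr * err

class CandidatePool:
    """CAND-style pool: train every candidate on each instance,
    predict with the candidate that currently has the lowest loss."""
    def __init__(self, n_features, lrs=(0.01, 0.1, 1.0)):
        self.pool = [TinyLearner(n_features, lr) for lr in lrs]

    def predict(self, x):
        return min(self.pool, key=lambda m: m.loss_estimate).predict(x)

    def train(self, x, y):
        for m in self.pool:
            m.train(x, y)

# Prequential-style (test-then-train) loop on a toy separable stream
random.seed(0)
pool = CandidatePool(n_features=2)
correct, n = 0, 1000
for _ in range(n):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    y = 1 if x[0] + x[1] > 0 else 0
    correct += pool.predict(x) == y  # test first...
    pool.train(x, y)                 # ...then train
print(f"accuracy: {correct / n:.2f}")
```

A real implementation would replace TinyLearner with PyTorch MLPs (varying learning rate and width, as the `larger_pool` option suggests) and plug into CapyMOA's classifier interface.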

nuwangunasekara commented 1 month ago

Hi @Kirstenml,

Temporary workaround on macOS (Intel); similar steps might work on Linux:

class ContinuouslyAdaptiveNeuralNetworks(MOAClassifier):
    def __init__(
            self,
            schema: Schema | None = None,
            training_method: str = 'CAND',
            larger_pool: str = 'P10',
            number_of_MLPs_to_train: int = 5,
            number_of_layers_in_each_MLP: int = 4,
            number_of_instances_to_train_all_MLPs_at_start: int = 1000,
            mini_batch_size: int = 32,
            use_one_hot_encode: bool = True,
            use_normalization: bool = True,
            back_prop_loss_threshold: float = 0.3,
            device_type: str = "CPU",
            do_not_train_each_MLP_using_a_separate_thread: bool = False,
            votes_dump_file_name: str = "",
            stats_dump_file_name: str = "",
    ):

        mapping = {
            "larger_pool": "-P",
            "number_of_MLPs_to_train": "-o",
            "number_of_layers_in_each_MLP": "-L",
            "number_of_instances_to_train_all_MLPs_at_start": "-s",
            "mini_batch_size": "-B",
            "use_one_hot_encode": "-h",
            "use_normalization": "-n",
            "back_prop_loss_threshold": "-b",
            "device_type": "-d",
            "do_not_train_each_MLP_using_a_separate_thread": "-t",
            "votes_dump_file_name": "-f",
            "stats_dump_file_name": "-F"
        }

        config_str = build_cli_str_from_mapping_and_locals(mapping, locals())
        super(ContinuouslyAdaptiveNeuralNetworks, self).__init__(
            moa_learner=_MOA_CAND,
            schema=schema,
            # CLI=config_str,
        )

from capymoa.datasets import Electricity
from capymoa.evaluation import prequential_evaluation

data_stream = Electricity()
ob_learner = ContinuouslyAdaptiveNeuralNetworks(schema=data_stream.get_schema())
results = prequential_evaluation(data_stream, ob_learner)
print("Continuously Adaptive Neural Networks: {}".format(results["cumulative"].accuracy()))

Kirstenml commented 1 month ago

This works, thank you @nuwangunasekara! For me, some more steps were necessary after creating the conda environment:

nuwangunasekara commented 1 month ago

Thanks @Kirstenml for sharing those additional steps.

Please feel free to raise a PR if you managed to get everything working without the extra steps in the workaround.