Open yuritpinheiro opened 3 years ago
I was also unable to identify where you reset the cls list for each stream. I ran the script study_perfomance_detectors.py after moving the file cd_naive_bayes.py into the study folder.
I was thinking that you "don't need" to reset each classifier because scikit-multiflow uses a dict format for data and allows changes in the data dimension. My code does not support this and was crashing because I use numpy arrays.
Hey @yuritpinheiro, thank you for reaching out to me. Your first issue could also be solved by setting the Python path with sys.path.append(your_path) or in your IDE. However, I made a commit that makes your life easier by adding some of the helper files to the repo.
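A minimal sketch of the sys.path approach mentioned above. The helper name add_to_python_path and the directory layout (the module sitting one level above the working directory) are assumptions made for illustration, not the actual layout of this repo.

```python
import sys
from pathlib import Path


def add_to_python_path(directory):
    """Append a directory to sys.path unless it is already listed.

    Returns the resolved absolute path as a string.
    """
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


# Hypothetical layout: the module to import lives one level above the
# current working directory. Inside a script, Path(__file__).resolve().parent
# is usually a more robust starting point than the working directory.
module_dir = add_to_python_path(Path.cwd().parent)

# After this call, "import cd_naive_bayes" would succeed if the file
# really is located in module_dir.
```

Calling the helper at the top of the study script, before the import that fails, avoids having to move files between folders.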
Comment 2: The second issue is in fact a copy-paste mistake; I made a new commit with the bug fix. However, I am not sure whether this is a problem, because the concept drift detector will discard the naive Bayes model immediately after a new stream starts due to concept drift :)
Comment 3: I am not aware of any classifier resets if you handle the stream yourself. I don't think there are any, but see #2.
I am not sure why your code crashes with numpy arrays. In study_performance_detectors.py the detectors receive numpy arrays, too. Can you provide more details?
Best, Christoph
Firstly, thank you for your attention. I do appreciate the modifications for improving usage.
As for importing cdnb, I thought you did something like that.
In my second comment, I missed line 30 in study/cd_naive_bayes.py. Now I see how the detectors are reset when changing the stream. However, the zip() call, also in study/cd_naive_bayes.py, may be behaving in an unexpected fashion and slightly affecting the startup at each new stream. A minor detail.
When I mentioned resetting a classifier, I really meant the detector, so that is already cleared up.
Finally, I intend to create an n-dimensional concept drift detector. The detector stores n-dimensional arrays based on the current stream, so if a new stream has a different dimension, the operations will fail.
An example would be a sum of vectors. After processing some samples of a 3-dimensional data stream, my detector has a 3-dimensional vector stored. When a new stream starts and has, e.g., 5 dimensions, numpy will raise an error because I can't add a 3d vector (data stored in the detector) to a 5d vector (a sample of the new stream).
It is not an issue in your code. It is something that my code must handle.
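To make the dimension problem concrete, here is a rough sketch under stated assumptions. RunningSumDetector is a hypothetical toy class, not the detector discussed in this thread and not part of any library; it only shows the numpy shape error and one possible way to cope with it, by discarding the stored state when the sample shape changes.

```python
import numpy as np


class RunningSumDetector:
    """Hypothetical toy detector that keeps a running sum of the samples."""

    def __init__(self):
        self.total = None  # allocated lazily from the first sample
        self.n_samples = 0

    def reset(self):
        """Drop all stored state."""
        self.total = None
        self.n_samples = 0

    def update(self, x):
        x = np.asarray(x, dtype=float)
        # A sample with a different shape means a new stream has started,
        # so the stored vector can no longer be combined with it.
        if self.total is not None and self.total.shape != x.shape:
            self.reset()
        if self.total is None:
            self.total = np.zeros_like(x)
        self.total += x
        self.n_samples += 1
        return self.total


# Without such a check, numpy rejects the addition of mismatched vectors.
try:
    np.zeros(3) + np.ones(5)
except ValueError as err:
    print("numpy refuses to add mismatched shapes:", err)

detector = RunningSumDetector()
for sample in np.ones((4, 3)):   # four samples from a 3-dimensional stream
    detector.update(sample)
detector.update(np.ones(5))      # first sample of a 5-dimensional stream
```

Resetting on a shape change is only one design choice; explicitly calling reset() whenever the stream changes would work as well and makes the intent more visible.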
Again, thank you for your attention!
I'm trying to run study_perfomance_detectors.py, but a ModuleNotFoundError is raised because the file cd_naive_bayes.py is not in the study folder. I am wondering how you run this code. I know that I can solve the error by moving the files from one folder to another, but if there is a simpler and/or cleaner way to solve it, I would like to learn it.