sgoldenlab / simba

SimBA (Simple Behavioral Analysis), a pipeline and GUI for developing supervised behavioral classifiers
https://simba-uw-tf-dev.readthedocs.io/
GNU General Public License v3.0

Unsupervised Learning Missing In Anaconda Installation #278

Open nsusic opened 1 year ago

nsusic commented 1 year ago

Describe the bug I am having trouble accessing the unsupervised learning SimBA expansion. It appears that when SimBA is installed using the Anaconda method, the files/code for unsupervised learning are not installed. How can I install the unsupervised learning code so that I can run unsupervised learning on my current project? Thank you for your help. I look forward to your response.

To Reproduce Steps to reproduce the behavior:

  1. Install SIMBA using anaconda method.
  2. Access existing project.
  3. Locate SIMBA Expansions.
  4. See error: There is no existing Unsupervised learning button as pictured in tutorial.

Expected behavior I expected a GUI button for the unsupervised learning add-on.

Screenshots [three screenshots attached, dated 2023-08-02]


sronilsson commented 1 year ago

Hi @nsusic,

You're right. Although I have typed up the tutorial and written the code, I have been a little hesitant to start supporting it widely because of (i) the likely extra time commitment and (ii) the additional dependencies, which would fall on all SimBA users even if they are not interested in unsupervised learning. So I have commented it out for now. I can give you some hints to get it running:

To find the unsupervised code, run pip show simba-uw-tf-dev in your conda environment. The Location field in its output shows where the simba package is installed.

You should find a sub-directory called unsupervised in that folder that has most of the code. Some unsupervised functions are within the mixins.unsupervised.py file.
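The same lookup can be done from Python inside the conda environment. This is only a minimal sketch: the helper names `find_package_dir` and `find_subpackage_dir` are hypothetical, and it assumes the package is importable under the name `simba` (as the paths in the traceback later in this thread suggest).

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_dir(package_name: str) -> Optional[Path]:
    """Return the install directory of an importable package, or None if absent."""
    # find_spec on a top-level name locates the package without importing it.
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None  # not installed, or a single-file module with no directory
    return Path(next(iter(spec.submodule_search_locations)))


def find_subpackage_dir(package_name: str, subdir: str) -> Optional[Path]:
    """Return <package dir>/<subdir> if that directory exists, else None."""
    package_dir = find_package_dir(package_name)
    if package_dir is None:
        return None
    candidate = package_dir / subdir
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    # Assumption: the import name is "simba" and the sub-directory is "unsupervised".
    print(find_subpackage_dir("simba", "unsupervised"))
```

If this prints None, either SimBA is not installed in the active environment or the installed version does not ship the sub-directory.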

To activate the buttons and get it running in the GUI, uncomment the import on line 132 of SimBA.py (from simba.unsupervised.unsupervised_ui import UnsupervisedGUI).

Then uncomment the unsupervised button definition on line 443 of SimBA.py.

Lastly, uncomment the line that inserts the unsupervised button into the GUI, on line 558 of SimBA.py.
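If editing by hand is inconvenient, the three uncommenting steps could be scripted roughly as below. This is a sketch under assumptions, not part of SimBA: `uncomment_lines` is a hypothetical helper, the line numbers 132, 443 and 558 may differ between SimBA versions, and the helper assumes each target line has the form `<indent># code`. It writes a `.bak` copy first, so check the result before launching.

```python
import shutil
from pathlib import Path
from typing import Iterable


def uncomment_lines(path: Path, line_numbers: Iterable[int]) -> int:
    """Strip one leading '#' from the given 1-indexed lines; return how many changed."""
    # Keep a backup next to the file so the edit is easy to revert.
    shutil.copy2(path, path.with_name(path.name + ".bak"))

    # newline="" keeps the file's original line endings untouched.
    with path.open("r", encoding="utf-8", newline="") as handle:
        lines = handle.readlines()

    changed = 0
    for number in sorted(set(line_numbers)):
        if not 1 <= number <= len(lines):
            continue  # line number outside the file, skip it
        line = lines[number - 1]
        stripped = line.lstrip()
        if not stripped.startswith("#"):
            continue  # already active, leave untouched
        indent = line[: len(line) - len(stripped)]
        body = stripped[1:]
        if body.startswith(" "):
            body = body[1:]  # drop the single space that usually follows '#'
        lines[number - 1] = indent + body
        changed += 1

    with path.open("w", encoding="utf-8", newline="") as handle:
        handle.writelines(lines)
    return changed


# Hypothetical usage; replace the path with the location reported by pip show:
# uncomment_lines(Path("<site-packages>/simba/SimBA.py"), [132, 443, 558])
```

A return value below 3 means at least one of the lines was not a comment, which is a hint that the line numbers have shifted in your version.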

Finally, you must have HDBSCAN and UMAP installed in your conda environment, ideally from the RAPIDS library, but pip install umap-learn hdbscan will also work.
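Before launching, a quick check like the following can show which of these dependencies Python can locate. It is a minimal sketch: `missing_modules` is a hypothetical helper, and the import names `umap` (provided by the umap-learn package), `hdbscan` and `cuml` (the RAPIDS library) are assumed. It only checks that a module can be found, not that it imports cleanly, so a broken compiled extension would still pass.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    cpu_missing = missing_modules(["umap", "hdbscan"])
    gpu_missing = missing_modules(["cuml"])
    print("Missing CPU dependencies:", cpu_missing or "none")
    print("Missing GPU dependencies:", gpu_missing or "none")
```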

You should be good to go! Let me know how it goes!

Simon

nsusic commented 1 year ago

Hi Simon,

Thank you for looking into the matter for me and for providing the directions. I had a fellow lab member try the fix, but they were unable to access SIMBA after utilizing the fix. Attached below is the error message they received when running the program:

  from cuml.cluster.hdbscan import HDBSCAN
ModuleNotFoundError: No module named 'cuml'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\aranedalab\anaconda3\envs\simba\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\aranedalab\anaconda3\envs\simba\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\aranedalab\anaconda3\envs\simba\Scripts\simba.exe\__main__.py", line 4, in <module>
  File "C:\Users\aranedalab\anaconda3\envs\simba\lib\site-packages\simba\SimBA.py", line 132, in <module>
    from simba.unsupervised.unsupervised_ui import UnsupervisedGUI
  File "C:\Users\aranedalab\anaconda3\envs\simba\lib\site-packages\simba\unsupervised\unsupervised_ui.py", line 19, in <module>
    from simba.unsupervised.pop_up_classes import (GridSearchClusterVisualizerPopUp,
  File "C:\Users\aranedalab\anaconda3\envs\simba\lib\site-packages\simba\unsupervised\pop_up_classes.py", line 28, in <module>
    from simba.unsupervised.hdbscan_clusterer import HDBSCANClusterer
  File "C:\Users\aranedalab\anaconda3\envs\simba\lib\site-packages\simba\unsupervised\hdbscan_clusterer.py", line 8, in <module>
    from hdbscan import HDBSCAN
  File "C:\Users\aranedalab\anaconda3\envs\simba\lib\site-packages\hdbscan\__init__.py", line 1, in <module>
    from .hdbscan_ import HDBSCAN, hdbscan
  File "C:\Users\aranedalab\anaconda3\envs\simba\lib\site-packages\hdbscan\hdbscan_.py", line 21, in <module>
    from ._hdbscan_linkage import (single_linkage,
  File "hdbscan\\_hdbscan_linkage.pyx", line 1, in init hdbscan._hdbscan_linkage
ModuleNotFoundError: No module named 'dist_metrics'

They also tried installing the missing modules using pip, but then received an error telling them to use the rapidsai library, which they found is only compatible with Python 3.9.

What do you suggest we do? Was it an error on our end in following your instructions?

Thank you for your help, I genuinely appreciate it and look forward to hearing from you.
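The traceback above shows two imports failing in sequence: first the GPU module `cuml.cluster.hdbscan`, then the CPU module `hdbscan`. To test that fallback outside of SimBA, something like the sketch below could be run in the same environment. `import_first_available` is a hypothetical helper and is not how SimBA itself is written; the two module names are taken from the traceback.

```python
import importlib
from types import ModuleType
from typing import List, Sequence, Tuple


def import_first_available(candidates: Sequence[str]) -> Tuple[str, ModuleType]:
    """Import and return the first candidate module that loads, with its name."""
    errors: List[str] = []
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError(
        "None of the candidate modules could be imported:\n  " + "\n  ".join(errors)
    )


if __name__ == "__main__":
    try:
        # Prefer the GPU implementation, fall back to the CPU one.
        name, module = import_first_available(["cuml.cluster.hdbscan", "hdbscan"])
        print(f"Using HDBSCAN from {name}: {module.HDBSCAN}")
    except ImportError as exc:
        print(exc)
```

The collected error messages make it easier to tell a package that is simply absent (as with `cuml` here) from one that is installed but broken (as with `hdbscan` and its missing `dist_metrics`).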

sronilsson commented 1 year ago

Hi @nsusic - thanks for testing this. The best solution is probably for me to write install instructions (with and without the RAPIDS library), accompanied by conda environment YAML files, for getting it to run on the GPU or the CPU.

It is a little crazy to be developing this code without ready access to a GPU. I can normally get access on weekends, so hold on a few days, and remind me late next week if it is not done. There are some runtime comparisons HERE or HERE; running on the CPU is not practical for large datasets or when grid-searching many models.

PS. Regardless of what the docs say, later versions of SimBA should run on Python 3.9.

nsusic commented 1 year ago

Hi Simon, I hope everything has been going well. Have you had a chance to make installation instructions (for with and without rapids library)? If not, no problem but I just thought I would check in as requested. Thank you again for your help!

sronilsson commented 1 year ago

Hi @nsusic - I tried getting access to a GPU but had no luck, sorry! I will keep trying next week.

For the CPU, I couldn't recreate your error, but I created the conda environment YAML file below, which you can use and which launches on my end.

Unzip it and, in the terminal, navigate to the directory where the YAML file is located. Then run:

conda env create --name MyUnsupervisedEnvironmentName --file environment_simba_unsupervised.yml

environment_simba_unsupervised.yml.zip

Once inside the MyUnsupervisedEnvironmentName environment, remember to uncomment the lines in SimBA.py as discussed previously before launching SimBA with the simba command.

Let me know if there are problems!

PS. RAPIDS has a conda or pip install-command creator HERE, where you select the CUDA and Python versions etc. I typically use it for fresh installs, but it won't work on Microsoft Windows.