paninski-lab / yass

YASS: Yet Another Spike Sorter
https://github.com/paninski-lab/yass/wiki
Apache License 2.0

trying to build diptest and cuda extension with a single setup.py call #338

Closed edublancas closed 3 years ago

edublancas commented 3 years ago

Hi Cat,

I think we can simplify installation and build diptest + cuda extensions with a single command. I modified the original setup.py to build diptest and the rowshift extension. Unfortunately, I cannot test it because I don't have access to a GPU.

Can you see if the code in the setup branch works?

catubc commented 3 years ago

Hi Edu, I merged this branch and am trying to test it live, since you didn't make any code changes. But this was obviously premature.

First, the install script won't work if pytorch is not installed.

So we need to add pytorch to the setup file; otherwise the CUDA files can't be compiled.

pip install torch==1.5

That seems to do better.
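The ordering problem described here (the build needs torch to be importable before the CUDA files can be compiled) could be surfaced with an explicit check at the top of setup.py. The following is only a minimal sketch, not the project's actual setup.py; the helper name `require_build_dependency` and its error message are hypothetical.

```python
import importlib.util


def require_build_dependency(module_name, install_hint):
    """Raise a readable error if a build-time dependency cannot be imported.

    Hypothetical helper: meant to be called at the top of setup.py,
    before anything tries to import the dependency for real.
    Pass a top-level module name such as "torch".
    """
    if importlib.util.find_spec(module_name) is None:
        raise RuntimeError(
            f"{module_name!r} must be installed before building: {install_hint}"
        )


# In setup.py this would run before importing torch's extension helpers:
# require_build_dependency("torch", "pip install torch")
```

Failing early with a clear message is one option; declaring torch as a build requirement is another, and the thread does not say which the project prefers.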

Then we get this weird yaml install error:

Installed /home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/python_dateutil-2.8.1-py3.7.egg
Searching for pyyaml
Reading https://pypi.org/simple/pyyaml/
Downloading https://files.pythonhosted.org/packages/64/c2/b80047c7ac2478f9501676c988a5411ed5572f35d1beff9cae07d321512c/PyYAML-5.3.1.tar.gz#sha256=b8eac752c5e14d3eca0e6dd9199cd627518cb5ec06add0de9d32baeee6fe645d
Best match: PyYAML 5.3.1
Processing PyYAML-5.3.1.tar.gz
Writing /tmp/easy_install-1n26winp/PyYAML-5.3.1/setup.cfg
Running PyYAML-5.3.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-1n26winp/PyYAML-5.3.1/egg-dist-tmp-8qxor7lu
In file included from ext/_yaml.c:596:0:
ext/_yaml.h:2:10: fatal error: yaml.h: No such file or directory
 #include <yaml.h>
          ^~~~~~~~
compilation terminated.
Error compiling module, falling back to pure Python
zip_safe flag not set; analyzing archive contents...
Moving PyYAML-5.3.1-py3.7-linux-x86_64.egg to /home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages
Adding PyYAML 5.3.1 to easy-install.pth file

But yass does seem to install in spite of this.
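The log line "Error compiling module, falling back to pure Python" is consistent with the install still succeeding: PyYAML can run without its C extension. A quick way to see which variant ended up installed is sketched below. It relies on the `__with_libyaml__` flag that PyYAML sets at import time; the function name `yaml_backend` is just an illustration.

```python
def yaml_backend():
    """Report which PyYAML implementation is importable, if any."""
    try:
        import yaml
    except ImportError:
        return "not installed"
    # PyYAML sets __with_libyaml__ to True only when its C extension loaded.
    if getattr(yaml, "__with_libyaml__", False):
        return "libyaml (C extension)"
    return "pure Python"


print("PyYAML backend:", yaml_backend())
```

If this reports "pure Python", the missing yaml.h header only costs parsing speed and is probably unrelated to the CUDA failure discussed next.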

catubc commented 3 years ago

However, the final run does not seem to work. Something in CUDA / pytorch isn't compatible, but I can't find a clear answer:

(yass_test4) cat@cat-GF63-Thin-9SCX:~/Downloads/yass_setup/yass/samples/10chan$ yass sort config.yaml
yass.pipeline@run 18/12/2020 09:21:33 INFO YASS version: 2.0
yass.preprocess.run@run 18/12/2020 09:21:33 INFO # of chunks: 6
yass.preprocess.run@run 18/12/2020 09:21:33 INFO Creating temporary folder: tmp/preprocess
yass.preprocess.run@run 18/12/2020 09:21:33 INFO Output dtype for transformed data will be float32
100%|██████████| 6/6 [00:00<00:00, 8.27it/s]
yass.preprocess.util@merge_filtered_files 18/12/2020 09:21:34 INFO ...saving standardized file: tmp/preprocess/standardized.bin
yass.preprocess.run@run 18/12/2020 09:21:34 INFO Saving params...
yass.pipeline@initial_block 18/12/2020 09:21:34 INFO INITIAL DETECTION
batch length to (sec): 20 (longer increase speed a bit)
length of each seg (sec): 0.5
batch : 0
batch : 1
batch : 2
yass.detect.output@gather_result 18/12/2020 09:21:38 INFO gather detected spikes
yass.detect.output@gather_result 18/12/2020 09:21:38 INFO Total 839094 spikes detected
yass.detect.output@gather_result 18/12/2020 09:21:38 INFO Total 122633 spikes survived after deduplication
yass.pipeline@initial_block 18/12/2020 09:21:38 INFO INITIAL CLUSTERING
yass.cluster.run@run 18/12/2020 09:21:38 INFO Split on PTP
yass.cluster.ptp_split@run_split_on_ptp 18/12/2020 09:21:38 INFO Get Spike PTP
100%|██████████| 12/12 [00:00<00:00, 62.47it/s]
yass.cluster.ptp_split@run_split_on_ptp 18/12/2020 09:21:39 INFO Run Split
yass.cluster.run@run 18/12/2020 09:21:41 INFO load waveforms on local channels
yass.cluster.run@run 18/12/2020 09:21:43 INFO NN denoise
yass.cluster.run@run 18/12/2020 09:21:44 INFO align waveforms on local channels
yass.cluster.run@run 18/12/2020 09:21:47 INFO starting clustering
100%|██████████| 14/14 [00:23<00:00, 1.70s/it]
yass.cluster.util@gather_clustering_result 18/12/2020 09:22:11 INFO gathering clustering results
yass.cluster.util@gather_clustering_result 18/12/2020 09:22:11 INFO units loaded: 95
yass.cluster.util@gather_clustering_result 18/12/2020 09:22:11 INFO reindexing spikes
yass.postprocess.run@run 18/12/2020 09:22:11 INFO 95 units are in
yass.postprocess.run@post_process 18/12/2020 09:22:11 INFO 95 units after removing off centered units
yass.postprocess.run@post_process 18/12/2020 09:22:12 INFO 37 units after removing duplicate units
inpput to block2: tmp/block_1/cluster_post_process/templates.npy
yass.pipeline@iterative_block 18/12/2020 09:22:12 INFO DECONV
ICD TUREND ON .....
....aligning templates and computing SVD.: 100%|██████████| 37/37 [00:00<00:00, 309.67it/s]
.... computing temp_temp ...
100%|██████████| 2/2 [00:00<00:00, 2.33it/s]
making template bsplines
running deconv from 0 to 60 seconds
...
moving coefficients to cuda objects
Traceback (most recent call last):
  File "/home/cat/anaconda3/envs/yass_test4/bin/yass", line 33, in <module>
    sys.exit(load_entry_point('yass-algorithm==2.0', 'console_scripts', 'yass')())
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/click-8.0.0a1-py3.7.egg/click/core.py", line 1025, in __call__
    return self.main(*args, **kwargs)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/click-8.0.0a1-py3.7.egg/click/core.py", line 955, in main
    rv = self.invoke(ctx)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/click-8.0.0a1-py3.7.egg/click/core.py", line 1517, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/click-8.0.0a1-py3.7.egg/click/core.py", line 1279, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/click-8.0.0a1-py3.7.egg/click/core.py", line 710, in invoke
    return callback(*args, **kwargs)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/command_line.py", line 80, in sort
    calculate_rf=calculate_rf, visualize=visualize)#,
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/pipeline.py", line 163, in run
    run_chunk_sec = CONFIG.clustering_chunk)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/pipeline.py", line 338, in iterative_block
    run_chunk_sec=run_chunk_sec)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/deconvolve/run.py", line 111, in run
    run_chunk_sec)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/deconvolve/run.py", line 174, in deconv_ONgpu
    d_gpu = run_core_deconv(d_gpu, CONFIG)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/deconvolve/run.py", line 275, in run_core_deconv
    run_core_deconv_parallel(d_gpu, chunk_ids, CONFIG.torch_devices[0].index)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/deconvolve/run.py", line 304, in run_core_deconv_parallel
    d_gpu.run(chunk_id)
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/deconvolve/match_pursuit_gpu_new.py", line 328, in run
    self.make_objective_shifted_svd()
  File "/home/cat/anaconda3/envs/yass_test4/lib/python3.7/site-packages/yass_algorithm-2.0-py3.7-linux-x86_64.egg/yass/deconvolve/match_pursuit_gpu_new.py", line 419, in make_objective_shifted_svd
    rowshift.forward(self.data, shifts_gpu)
RuntimeError: matrix.is_contiguous() INTERNAL ASSERT FAILED at src/gpu_rowshift/rowshift.cpp:50, please report a bug to PyTorch. matrix must be contiguous
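The final line of the traceback comes from a check inside the extension itself (rowshift.cpp:50), which rejects a tensor whose memory layout is not contiguous. The thread does not establish whether that is the root cause or a symptom of a version mismatch, so the sketch below only illustrates what "contiguous" means in PyTorch and how a caller could guard against it. The helper `ensure_contiguous` is hypothetical and not part of yass.

```python
def ensure_contiguous(tensor):
    """Return `tensor` unchanged if already contiguous, else a contiguous copy."""
    if tensor.is_contiguous():
        return tensor
    return tensor.contiguous()


try:
    import torch
except ImportError:  # the demo below is skipped when torch is unavailable
    torch = None

if torch is not None:
    # A transpose is a view with swapped strides, so it is not contiguous.
    view = torch.arange(6).reshape(2, 3).t()
    fixed = ensure_contiguous(view)
    print(view.is_contiguous(), fixed.is_contiguous())
```

Under this assumption, a call such as `rowshift.forward(ensure_contiguous(self.data), shifts_gpu)` would be one experiment to try; it costs a copy only when the input is actually non-contiguous.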

catubc commented 3 years ago

So for now I manually replaced the setup.py file with the previous version. I should probably have just tested your branch directly, but I was too tired.

So for now I'm hoping we can fix those yaml and pytorch install steps.

My best guess is that pytorch doesn't like something about the gcc version. Next I will test installing pytorch 1.4, which was the default version recommended in the instructions.

catubc commented 3 years ago

Update: I actually tested with pytorch 1.4 and 1.6 and I get the same error. In fact, even the master installation steps give me the same error, which means something else is going wrong. I think the GCC compiler being linked is not the right one.
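When the suspicion is a compiler or toolkit mismatch, it helps to record the exact versions in play on the failing machine. The script below is a rough sketch for doing that; the function names are illustrative, it assumes `gcc` and `nvcc` would be found on PATH if installed, and it leaves fields as None when something is missing.

```python
import shutil
import subprocess
import sys


def command_output(command):
    """Return the stripped output of `command`, or None if it is not on PATH."""
    if shutil.which(command[0]) is None:
        return None
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError:
        return None
    return (result.stdout + result.stderr).strip() or None


def toolchain_report():
    """Collect the versions most likely to matter when building a CUDA extension."""
    report = {
        "python": sys.version.split()[0],
        "gcc": command_output(["gcc", "--version"]),
        "nvcc": command_output(["nvcc", "--version"]),
        "torch": None,
        "torch_cuda": None,
    }
    try:
        import torch
    except ImportError:
        return report
    report["torch"] = torch.__version__
    # CUDA version PyTorch itself was built against (None for CPU-only builds).
    report["torch_cuda"] = torch.version.cuda
    return report


for name, value in toolchain_report().items():
    print(f"{name}: {value}")
```

PyTorch also ships its own, more thorough report via `python -m torch.utils.collect_env`, which may be the better thing to paste into an issue.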

edublancas commented 3 years ago

Hi Cat,

Thanks for trying this out. I'll spin up a small GPU server in the cloud to test it; that will make this easier. Will get back to you.

edublancas commented 3 years ago

Hi,

I created a GPU machine in Google Cloud but had trouble installing the CUDA drivers. Is there any GPU machine I can ssh into to test this? That would make things easier.

catubc commented 3 years ago

Hi Edu, yes, I do have access to some GPU machines: one at NYU and two at home. For now I use TeamViewer for most purposes. I will try to set up an ssh account for you and get back to you shortly.

catubc commented 3 years ago

Hi Edu, really sorry about the delay, but I'm a bit swamped until today (Jan 15th). I will try to find a GPU machine for you shortly.

edublancas commented 3 years ago

No worries, I understand. Just ping me when it's ready so I can get to work.

edublancas commented 3 years ago

Hey @catubc, I just ran the code on this branch without issues. Can you try to reproduce it?

# run this in the setup branch
git checkout setup

# create conda env
conda create --name test python=3.8 --yes
conda activate test

# (did not test with other versions)
export CUDA_HOME=/usr/local/cuda-10.0

pip install torch
pip install .

Then I opened a Python session and did import yass.
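Since these steps depend on CUDA_HOME pointing at a real toolkit, a small pre-flight check before `pip install .` can save a failed build. This is a sketch under the assumption of a Linux layout where the compiler lives at `$CUDA_HOME/bin/nvcc`; the function name `check_cuda_home` is hypothetical.

```python
import os
from pathlib import Path


def check_cuda_home(environ=None):
    """Return (ok, message) describing whether CUDA_HOME looks usable."""
    environ = os.environ if environ is None else environ
    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        return False, "CUDA_HOME is not set"
    # Assumed Linux toolkit layout: the compiler lives in <CUDA_HOME>/bin.
    nvcc = Path(cuda_home) / "bin" / "nvcc"
    if not nvcc.is_file():
        return False, f"no nvcc found at {nvcc}"
    return True, f"found {nvcc}"


ok, message = check_cuda_home()
print(message)
```

The check only confirms that a compiler exists at that location. It does not verify that the toolkit version matches the CUDA version the installed torch was built against.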

catubc commented 3 years ago

Hi Edu, I've attached the error log. I think it's related to the CUDA compilation, but it's not clear.

Also, it would be good not to depend on specific CUDA toolkit versions and to just let pytorch grab its own toolkit. But perhaps we can deal with the first issue to start.

Thanks!

yass_error_log.txt