velten-group / crispat

Multiprocessing freeze_support error #6

Open Saranya-Balachandran opened 2 weeks ago

Saranya-Balachandran commented 2 weeks ago

Dear team, thank you for the tool. When using the negative binomial approach, I get an error related to multiprocessing freeze_support:

(screenshot of the error message attached)

JanaBraunger commented 2 weeks ago

Hi Saranya, nice to hear that you are using crispat. Since I haven't seen this error before, I'm not sure what's going wrong here and would need some more details to help you troubleshoot. Have you tried running the negative binomial assignment function in our tutorial Jupyter notebook (https://github.com/velten-group/crispat/blob/main/tutorials/guide_assignment.ipynb), and does it work there? Also, do you get the same error when running the binomial and poisson assignments?

Additionally, when using the parallelisation option on a cluster, I would recommend manually specifying the "n_processes" parameter to make sure it matches the number of available/requested cores. An example Python script that we used for running the negative binomial assignment can be found in our crispat_analysis repository (https://github.com/velten-group/crispat_analysis/blob/main/python/guide_assignment/negative_binomial.py).
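
For illustration, a minimal sketch of such a call. The exact argument list of ga_negative_binomial should be taken from the tutorial notebook or negative_binomial.py linked above; the output path here is only a placeholder, and the __main__ guard is the standard Python precaution for any script that spawns worker processes:

import crispat

if __name__ == "__main__":
    # run the negative binomial guide assignment with an explicit number of
    # worker processes that matches the cores requested on the cluster;
    # "assignment_output/" is a placeholder for the real output location
    crispat.ga_negative_binomial(
        "gRNA_counts.h5ad",       # gRNA count matrix (.h5ad file)
        "assignment_output/",     # placeholder output location
        n_processes=4,            # set to the number of available cores
    )

The __main__ guard matters because "freeze_support" messages typically appear when worker processes are started from a script whose top-level code is re-executed in each child process.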

Saranya-Balachandran commented 2 weeks ago

Hello Jana, I do not get the error with UMI, gauss and poisson_gauss; the execution stops at this point but then continues. With negative binomial and poisson, however, it has been stuck at this point for more than 20 hours now. Note: I am using these methods as my data is high-MOI.

JanaBraunger commented 1 week ago

Hi Saranya,

that makes sense, since UMI, gauss and poisson_gauss do not use parallelisation (via a dask cluster). How large is your data set (how many cells and gRNAs)? And how exactly are you calling the 'ga_negative_binomial' function? Have you tried running our tutorial Jupyter notebook (https://github.com/velten-group/crispat/blob/main/tutorials/guide_assignment.ipynb) or calling the function as shown in our crispat_analysis repository (https://github.com/velten-group/crispat_analysis/blob/main/python/guide_assignment/negative_binomial.py), and does the error persist there?

If your data set is not too big, you can set the parameter 'parallelize = False' to disable the parallelisation that seems to be causing the issue for you. If it is larger and would take too long without parallelisation, you can instead parallelise manually (still with 'parallelize = False'): when working on a high-performance cluster, you can split the assignment into multiple jobs using the 'start_gRNA' and 'gRNA_step' parameters of the assignment function. These two parameters restrict the run to a subset of gRNAs (it takes 'gRNA_step' many gRNAs starting from index 'start_gRNA'). By submitting multiple jobs with different 'start_gRNA' settings and combining the resulting data frames at the end (e.g. with our 'combine_assignments' function), you can obtain the full assignment faster.
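
As a rough sketch of that job-splitting idea (the exact argument list of ga_negative_binomial should be checked against the linked negative_binomial.py; the output path, the step size of 20 and reading the start index from the command line are made-up placeholders):

import sys
import crispat

if __name__ == "__main__":
    # each cluster job passes a different start index, e.g. 0, 20, 40, ...
    start_gRNA = int(sys.argv[1])

    crispat.ga_negative_binomial(
        "gRNA_counts.h5ad",      # gRNA count matrix (.h5ad file)
        "assignment_output/",    # placeholder output location
        start_gRNA=start_gRNA,   # first gRNA index handled by this job
        gRNA_step=20,            # number of gRNAs processed per job
        parallelize=False,       # no dask parallelisation within a job
    )

After all jobs have finished, the per-job outputs would then be merged, e.g. with the combine_assignments function mentioned above.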

Saranya-Balachandran commented 5 days ago

Hello Jana, I used the guide_assignment.ipynb tutorial previously and landed on this error. I have now used negative_binomial.py and I get the error below:

Traceback (most recent call last):
  File "/data/humangen_mouse/test_area/Saranya/Crispr_scripts/assign_guide_with_crispat.py", line 47, in <module>
    crispat.ga_negative_binomial("gRNA_counts.h5ad",
  File "/work/balachandran/.omics/anaconda3/envs/crispat/lib/python3.10/site-packages/crispat/neg_binomial.py", line 274, in ga_negative_binomial
    adata_crispr = sc.read_h5ad(input_file)
  File "/work/balachandran/.omics/anaconda3/envs/crispat/lib/python3.10/site-packages/anndata/_io/h5ad.py", line 237, in read_h5ad
    with h5py.File(filename, "r") as f:
  File "/work/balachandran/.omics/anaconda3/envs/crispat/lib/python3.10/site-packages/h5py/_hl/files.py", line 562, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "/work/balachandran/.omics/anaconda3/envs/crispat/lib/python3.10/site-packages/h5py/_hl/files.py", line 235, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 102, in h5py.h5f.open
BlockingIOError: [Errno 11] Unable to synchronously open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')