ctboughter / countASAP

An easy to use python-based package for generating ASAPseq Count Matrices from FASTQ files
MIT License

Memory error #1

Open AbhinavSoni20191995 opened 6 days ago

AbhinavSoni20191995 commented 6 days ago

Hello,

I tried using the package for the ASAP-seq protocol, specifically for ADT and HTO tags. However, every run ends with an error that suggests insufficient memory:

Example of the error:

finished cell chunk 670/677.0
finished cell chunk 671/677.0
finished cell chunk 672/677.0
finished cell chunk 673/677.0
finished cell chunk 674/677.0
finished cell chunk 675/677.0
finished cell chunk 676/677.0
Traceback (most recent call last):
  File "/home/crtd_sieweke/abso493b/miniforge3/envs/countASAP/bin/countASAP", line 8, in <module>
    sys.exit(run())
  File "/home/crtd_sieweke/abso493b/miniforge3/envs/countASAP/lib/python3.10/site-packages/countASAP/asap_process.py", line 223, in run
    seq3 = [str(a.seq) for a in r3]
  File "/home/crtd_sieweke/abso493b/miniforge3/envs/countASAP/lib/python3.10/site-packages/countASAP/asap_process.py", line 223, in <listcomp>
    seq3 = [str(a.seq) for a in r3]
MemoryError

Code used:

bash-4.2$ countASAP -cr L162127_S1_L001_R2_001.fastq -br L162127_S1_L001_R3_001.fastq -wl barcodes.csv -ref HTO.csv -out HTO -awl False

barcodes.csv contains 4220 sequences corresponding to the good-quality cells identified with the Cell Ranger pipeline. There are only two cell hashtag sequences, because I used two antibodies: cell hashtags 1 and 2 from TotalSeq-A (BioLegend).

I ran this in an interactive session on a cluster, started with the following command:

srun --pty --nodes=1 --ntasks=1 --cpus-per-task=12 --mem=128G --time=8:00:00 bash

I assume this should be sufficient resources?

ctboughter commented 4 days ago

Hello,

Sorry you're running into these issues. Could you just add a bit more information here? Specifically, could you provide the size of each input file (in GB or MB)? I suspect this may be due to the code being a bit sloppy with memory handling. I would bet there are some fairly reasonable fixes I can implement and quickly update to solve this issue.
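For illustration only, here is a minimal sketch of what lower-memory handling could look like. It is not the countASAP implementation: `read_fastq_seqs`, `count_pairs`, `barcode_len`, and `tag_len` are hypothetical names, and it assumes uncompressed four-line FASTQ records, paired files in the same read order, and exact prefix matching against the whitelist. The point is that the two files are streamed record by record, so memory use depends on the number of distinct barcode/tag pairs rather than on file size.

```python
import itertools
from collections import Counter


def read_fastq_seqs(path):
    """Yield the sequence of each 4-line FASTQ record, one at a time."""
    with open(path) as handle:
        # The sequence is the second line of every 4-line record.
        for seq_line in itertools.islice(handle, 1, None, 4):
            yield seq_line.strip()


def count_pairs(cell_fastq, tag_fastq, whitelist, barcode_len=16, tag_len=15):
    """Count (cell barcode, tag) pairs without loading either file fully.

    barcode_len and tag_len are illustrative assumptions, not values
    taken from countASAP or from the ASAP-seq protocol.
    """
    counts = Counter()
    paired = zip(read_fastq_seqs(cell_fastq), read_fastq_seqs(tag_fastq))
    for cell_seq, tag_seq in paired:
        barcode = cell_seq[:barcode_len]
        if barcode in whitelist:  # whitelist should be a set for O(1) lookup
            counts[(barcode, tag_seq[:tag_len])] += 1
    return counts
```

A call such as `count_pairs("R2.fastq", "R3.fastq", set_of_barcodes)` would keep only one record from each file in memory at a time, in contrast to building a full list of sequences first.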

AbhinavSoni20191995 commented 4 days ago

Thank you for your response!

HTO files:
R2 is 7.27 GB (L162127_S1_L001_R2_001.fastq)
R3 is 17.68 GB (L162127_S1_L001_R3_001.fastq)

ADT files (I have not tried running these, since I assume I will hit the same issue given their size):
R2 is 52.3 GB
R3 is 127.26 GB
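As a rough way to relate sizes like these to memory, the sketch below estimates how much RAM a plain Python list of sequence strings would need. It is a back-of-the-envelope helper, not part of countASAP: `estimate_seq_list_bytes` is a hypothetical name, and the read count and read length have to be supplied by the user, since neither is stated in this thread. It counts only the string objects and the list's pointers, so parsed record objects holding quality scores would need more.

```python
import struct
import sys


def estimate_seq_list_bytes(n_reads, read_len):
    """Estimate the bytes needed for a list of n_reads ASCII sequence strings.

    Each string carries a fixed per-object overhead on top of its
    characters, and the list stores one pointer per element.
    """
    per_string = sys.getsizeof("A" * read_len)  # characters plus object overhead
    per_pointer = struct.calcsize("P")          # size of one list slot
    return n_reads * (per_string + per_pointer)


if __name__ == "__main__":
    # Hypothetical inputs: substitute the real read count and read length.
    total = estimate_seq_list_bytes(n_reads=100_000_000, read_len=50)
    print(f"{total / 1e9:.1f} GB")
```

For short reads, the per-object overhead can be comparable to the sequence itself, which is one reason an in-memory list can take more space than the sequence content of the file suggests.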