haowenz / chromap

Fast alignment and preprocessing of chromatin profiles
https://haowenz.github.io/chromap/
MIT License

Bad_alloc for deep Hi-C dataset #23

Open jimwry opened 2 years ago

jimwry commented 2 years ago

Hi,

I've been trying to use chromap for Hi-C data. It is very convenient and super fast!

I have no problems with most Hi-C datasets; however, I encountered a "bad_alloc" error on an ultra-deep Hi-C dataset (more than 1 billion reads). My command line and log are below. Do you have any idea how to solve this issue?

Many thanks,
Jim

CPU: 80 threads, memory: 125GB.

Command:

chromap --preset hic -x chromap-mm10 -r mm10.fa -1 SRR9906313_GSM4010832_HiC_Retina_Adult-Rep1_Mus_musculus_Hi-C_1.fastq.gz -2 SRR9906313_GSM4010832_HiC_Retina_Adult-Rep1_Mus_musculus_Hi-C_2.fastq.gz -o SRR9906313_HiC_Retina_Adult-Rep1_Hi-C.chromap.pairs -t 64

Log:

Mapped all reads in 6054.02s.
Number of reads: 2866604954.
Number of mapped reads: 2582847256.
Number of uniquely mapped reads: 2315287482.
Number of reads have multi-mappings: 267559774.
Number of candidates: 109967150601.
Number of mappings: 2582847256.
Number of uni-mappings: 2315287482.
Number of multi-mappings: 267559774.
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

haowenz commented 2 years ago

Sorry for the late reply. You are right. It seems that there are just too many reads. We will try to fix this soon. Thanks for trying the tool.

jimwry commented 2 years ago

Hi Haowen,

No worries. I split the reads into smaller input files and ran chromap on each one separately, and it worked well.

Thanks for your attention.

yuzhenpeng commented 2 years ago

How did you split the paired reads?

yuzhenpeng commented 2 years ago

Has anyone solved this problem?

jimwry commented 2 years ago

You could use fastqsplitter (https://fastqsplitter.readthedocs.io/en/stable/).
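For anyone who wants to see what the splitting involves, here is a minimal Python sketch of one way to do it by hand. It is not fastqsplitter's algorithm and not part of chromap. It assumes standard 4-line FASTQ records (no wrapped sequence lines) and that both mate files list reads in the same order. The function name `split_paired_fastq` and the `chunkNNNN_1.fastq.gz` / `chunkNNNN_2.fastq.gz` naming are made up for illustration. Both mates are cut at the same record boundaries, so read N of each `_1` chunk still pairs with read N of the matching `_2` chunk.

```python
import gzip
from itertools import islice
from pathlib import Path


def _copy_lines(src, dst, n_lines):
    """Stream up to n_lines lines from src to dst; return how many were copied."""
    copied = 0
    for line in islice(src, n_lines):
        dst.write(line)
        copied += 1
    return copied


def split_paired_fastq(r1_path, r2_path, out_dir, reads_per_chunk):
    """Split two gzipped mate files into numbered chunk pairs.

    Hypothetical helper: assumes 4-line FASTQ records and identical read
    order in both files. Returns a list of (mate1_chunk, mate2_chunk) paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    lines_per_chunk = 4 * reads_per_chunk  # one FASTQ record is 4 lines
    chunks = []
    with gzip.open(r1_path, "rt") as r1, gzip.open(r2_path, "rt") as r2:
        while True:
            # Peek at the next line so no empty trailing chunk is created.
            first1, first2 = r1.readline(), r2.readline()
            if not first1 and not first2:
                break
            if not first1 or not first2:
                raise ValueError("mate files contain different numbers of reads")
            index = len(chunks)
            out1 = out_dir / f"chunk{index:04d}_1.fastq.gz"
            out2 = out_dir / f"chunk{index:04d}_2.fastq.gz"
            with gzip.open(out1, "wt") as w1, gzip.open(out2, "wt") as w2:
                w1.write(first1)
                w2.write(first2)
                n1 = 1 + _copy_lines(r1, w1, lines_per_chunk - 1)
                n2 = 1 + _copy_lines(r2, w2, lines_per_chunk - 1)
            if n1 != n2:
                raise ValueError("mate files contain different numbers of reads")
            chunks.append((out1, out2))
    return chunks
```

Each returned pair can then be passed to chromap through `-1` and `-2`, as in the command at the top of this thread, with a separate `-o` file per chunk. A dedicated tool such as fastqsplitter will likely be much faster on files of this size; the sketch is mainly meant to show why the two mates have to be split in lockstep.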

yuzhenpeng commented 2 years ago

Thank you.