Magdoll / cDNA_Cupcake

Miscellaneous collection of Python and R scripts for processing Iso-Seq data
BSD 3-Clause Clear License

collapse_isoforms_by_sam.py out of memory #250

Open SY348 opened 6 months ago

SY348 commented 6 months ago

Hello, when I run collapse_isoforms_by_sam.py on a large .sam file (~90 GB), it runs very slowly and always ends with the error 'OUT_OF_ME+ 0:125'. Is there any way I can optimize the run? The command is as follows:

python collapse_isoforms_by_sam.py --input isoseq_flnc.Transcript.fastq --fq -s isoseq_flnc.Transcript.sorted.sam -o TEST

Also, I've looked at the example data, and I don't understand why PB4.2 contains two isoforms but does not include PB4.3, given that all three belong to the same genomic region. The same question applies to 1.1, 1.2 and 1.3. How does the project decide whether two reads belong to the same loci.index?

Thanks!