ablab / IsoQuant

Transcript discovery and quantification with long RNA reads (Nanopores and PacBio)
https://ablab.github.io/IsoQuant/

Quantification process not progressing #262

Open AhmedSAHassan opened 5 days ago

AhmedSAHassan commented 5 days ago

Hello,

I am using IsoQuant to quantify my BAM files from 6 samples (3 treatment vs. 3 control). The process seems to freeze after some time. In the aux folder, I have OUT.save_chr files (_bamstat, _collected, _groups) for all the chromosomes, plus over 700 similar files. Are these temporary files? After that, the command keeps running but makes no progress at all.

The command I am using is:

isoquant.py --reference reference/GRCh38.primary_assembly.genome.fa.gz \
  --genedb reference/gencode.v47.annotation.gtf \
  --bam minimap_align/PAU84062_pass_FAST_barcode01/PAU84062_pass_FAST_barcode01.bam \
        minimap_align/PAU84062_pass_FAST_barcode02/PAU84062_pass_FAST_barcode02.bam \
        minimap_align/PAU84062_pass_FAST_barcode03/PAU84062_pass_FAST_barcode03.bam \
        minimap_align/PAU84062_pass_FAST_barcode04/PAU84062_pass_FAST_barcode04.bam \
        minimap_align/PAU84062_pass_FAST_barcode05/PAU84062_pass_FAST_barcode05.bam \
        minimap_align/PAU84062_pass_FAST_barcode06/PAU84062_pass_FAST_barcode06.bam \
  --data_type nanopore -o Isoquant/ --complete_genedb \
  --model_construction_strategy default_ont --report_novel_unspliced true

This is the output on the terminal:

2024-11-27 21:00:12,247 - INFO - Running IsoQuant version 3.6.2
2024-11-27 21:00:12,286 - INFO - === IsoQuant pipeline started ===
2024-11-27 21:00:12,286 - INFO - gffutils version: 0.13
2024-11-27 21:00:12,286 - INFO - pysam version: 0.22.1
2024-11-27 21:00:12,286 - INFO - pyfaidx version: 0.8.1.3
2024-11-27 21:00:12,286 - INFO - Checking input gene annotation
2024-11-27 21:00:41,332 - INFO - Gene annotation seems to be correct
2024-11-27 21:00:41,352 - INFO - Converting gene annotation file to .db format (takes a while)...
2024-11-27 21:04:28,878 - INFO - Gene database written to /Users/ahmedsalah/Phybion_Pilot_LR/Isoquant/gencode.v47.annotation.db
2024-11-27 21:04:28,878 - INFO - Provide this database next time to avoid excessive conversion
2024-11-27 21:04:28,879 - INFO - Loading gene database from /Users/ahmedsalah/Phybion_Pilot_LR/Isoquant/gencode.v47.annotation.db
2024-11-27 21:04:28,879 - INFO - Loading reference genome from /Users/ahmedsalah/Phybion_Pilot_LR/reference/GRCh38.primary_assembly.genome.fa.gz
2024-11-27 21:04:36,907 - INFO - Loading uncompressed reference from Isoquant/GRCh38.primary_assembly.genome.fa
2024-11-27 21:04:50,026 - INFO - Processing 1 experiment
2024-11-27 21:04:50,026 - INFO - Processing experiment OUT
2024-11-27 21:04:50,026 - INFO - Experiment has 6 BAM files: minimap_align/PAU84062_pass_FAST_barcode01/PAU84062_pass_FAST_barcode01.bam, minimap_align/PAU84062_pass_FAST_barcode02/PAU84062_pass_FAST_barcode02.bam, minimap_align/PAU84062_pass_FAST_barcode03/PAU84062_pass_FAST_barcode03.bam, minimap_align/PAU84062_pass_FAST_barcode04/PAU84062_pass_FAST_barcode04.bam, minimap_align/PAU84062_pass_FAST_barcode05/PAU84062_pass_FAST_barcode05.bam, minimap_align/PAU84062_pass_FAST_barcode06/PAU84062_pass_FAST_barcode06.bam
2024-11-27 21:04:50,027 - INFO - Collecting read alignments

Thanks in advance

andrewprzh commented 1 day ago

Dear @AhmedSAHassan

Yes, the files in aux are temporary. What is their last modified date?
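
One way to check this from a POSIX shell is to list the aux folder sorted by modification time. This is only a sketch: the helper name `newest_entries` is hypothetical, and the path `Isoquant/OUT/aux` is an assumption based on `-o Isoquant/` and the experiment name `OUT` in the log.

```shell
# Show the most recently modified entries in a directory, newest first,
# together with their modification times.
newest_entries() {
    dir="$1"
    count="${2:-5}"
    # "ls -lt" sorts by modification time; its first line is a "total"
    # summary, so keep one extra line.
    ls -lt "$dir" | head -n "$((count + 1))"
}

# Example (the path is an assumption, see above):
# newest_entries Isoquant/OUT/aux 10
```

If the newest timestamp keeps advancing between checks, the run is probably still writing intermediate files; a timestamp that stopped changing long ago would suggest a stall.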

Some users have reported that using a gzipped reference genome can be very time-consuming, so you could try running the same command with an unpacked reference genome FASTA.
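
A minimal sketch of unpacking the reference before rerunning, assuming `gzip` is installed. The helper name `unpack_reference` is hypothetical; it keeps the original `.gz` file and writes the unpacked FASTA next to it.

```shell
# Decompress a .gz file next to the original (the .gz is kept) and print
# the path of the unpacked file.
unpack_reference() {
    gz="$1"
    out="${gz%.gz}"
    # Only decompress when the unpacked file is missing or empty.
    if [ ! -s "$out" ]; then
        # Write to a temporary name first so a failed run leaves no partial file.
        if ! gzip -dc "$gz" > "$out.tmp"; then
            rm -f "$out.tmp"
            return 1
        fi
        mv "$out.tmp" "$out"
    fi
    printf '%s\n' "$out"
}

# Example:
# ref=$(unpack_reference reference/GRCh38.primary_assembly.genome.fa.gz)
# then pass "$ref" to isoquant.py via --reference
```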

Another possible issue is RAM consumption; you could also try reducing the number of threads via the -t option.
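
As a sketch, the command from the original post with an explicit `-t` value could look like the following. The value 4 is an arbitrary placeholder rather than a recommendation, and the helper name `isoquant_cmd` is hypothetical. The function only prints the command so it can be reviewed before running.

```shell
# Hypothetical thread cap; the reply above does not suggest a specific number.
THREADS=4

# Print (do not run) the command from the original post with -t added.
# The BAM files are passed as arguments to keep the example short.
isoquant_cmd() {
    echo isoquant.py \
        --reference reference/GRCh38.primary_assembly.genome.fa.gz \
        --genedb reference/gencode.v47.annotation.gtf \
        --bam "$@" \
        --data_type nanopore -o Isoquant/ --complete_genedb \
        --model_construction_strategy default_ont --report_novel_unspliced true \
        -t "$THREADS"
}

# Example:
# isoquant_cmd minimap_align/*/*.bam
```

Once the printed command looks right, it can be copied into the terminal or the leading `echo` can be removed to run it directly.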

Hope that helps!

Best, Andrey