bcgsc / NanoSim

Nanopore sequence read simulator

Failure in using Nanosim for transcriptome (ValueError: file does not contain alignment data) #208

Open aerusakovich opened 6 months ago

aerusakovich commented 6 months ago

Hello!

I tried to use the NanoSim Docker container to characterize transcriptome reads before simulating transcripts. Here is the command I used (mouse genome):

read_analysis.py transcriptome -i "/groups/dog/anastasia/data/data_ciri-long/CRR194209.fq.gz" -rg "/groups/dog/anastasia/nanosim/GRCm39.genome.fa.gz" -rt "/groups/dog/anastasia/nanosim/gencode.vM34.transcripts.fa.gz" -annot "/groups/dog/anastasia/nanosim/gencode.vM34.annotation.gtf.gz" -t 8

When I run it, I get this error with minimap2:

2024-04-29 12:12:03: Processing genome alignment file: bam
Traceback (most recent call last):
  File "/usr/local/bin/read_analysis.py", line 751, in <module>
    main()
  File "/usr/local/bin/read_analysis.py", line 697, in main
    align_transcriptome(in_fasta, prefix, aligner, num_threads, t_alnm, ref_t, g_alnm, ref_g, chimeric,
  File "/usr/local/bin/read_analysis.py", line 144, in align_transcriptome
    get_primary_sam.primary_and_unaligned(g_alnm, prefix + "_genome")
  File "/usr/local/bin/get_primary_sam.py", line 146, in primary_and_unaligned
    in_sam_file = pysam.AlignmentFile(sam_alnm_file)
  File "pysam/libcalignmentfile.pyx", line 748, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 953, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file does not contain alignment data
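
As far as I understand, pysam tends to report this error when the file it opens is empty or cannot be recognized as SAM/BAM, which may mean the alignment step itself produced no usable output. Below is a minimal sketch of a check on the intermediate alignment file, using only the Python standard library. The function name `describe_alignment_file` is hypothetical, and the path to pass in is whatever genome alignment file NanoSim wrote for your prefix (not shown in the log above, so it is left as an argument).

```python
import gzip
from pathlib import Path


def describe_alignment_file(path):
    """Roughly classify a file by its first bytes (hypothetical helper, not part of NanoSim)."""
    p = Path(path)
    if not p.exists():
        return "missing"
    if p.stat().st_size == 0:
        # An empty file is one plausible trigger for the pysam ValueError.
        return "empty"
    with open(p, "rb") as fh:
        head = fh.read(2)
    if head == b"\x1f\x8b":
        # BAM is stored in BGZF, a gzip-compatible container;
        # the decompressed stream starts with the magic bytes b"BAM\x01".
        try:
            with gzip.open(p, "rb") as fh:
                inner = fh.read(4)
        except (OSError, EOFError):
            return "gzip (truncated or corrupt)"
        return "bam" if inner == b"BAM\x01" else "gzip (not BAM)"
    if head[:1] == b"@":
        # SAM headers are text lines starting with "@".
        return "sam (header present)"
    return "unknown"
```

If this reports "empty" or "missing" for the genome alignment file, the problem is likely upstream of pysam (for example, the aligner failing on its inputs) rather than in the parsing step.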

And a similar error with LAST:

(base) [aerusakovich@cl1n016:last] $ singularity shell -B /groups/dog/anastasia/ docker://quay.io/biocontainers/nanosim:3.1.0--hdfd78af_0
INFO:    Using cached SIF image
Apptainer> read_analysis.py transcriptome  -i "/groups/dog/anastasia/data/data_ciri-long/CRR194209.fq.gz" -rg "/groups/dog/anastasia/nanosim/GRCm39.genome.fa.gz" -rt "/groups/dog/anastasia/nanosim/gencode.vM34.transcripts.fa.gz" -annot "/groups/dog/anastasia/nanosim/gencode.vM34.annotation.gtf.gz" -t 8 -a LAST          
running the code with following parameters:

infile /groups/dog/anastasia/data/data_ciri-long/CRR194209.fq.gz
ref_g /groups/dog/anastasia/nanosim/GRCm39.genome.fa.gz
ref_t /groups/dog/anastasia/nanosim/gencode.vM34.transcripts.fa.gz
annot /groups/dog/anastasia/nanosim/gencode.vM34.annotation.gtf.gz
aligner LAST
g_alnm 
t_alnm 
prefix training
num_threads 8
model_fit True
intron_retention True
chimeric False
quantification False
normalize by transcript length False
2024-05-06 10:31:30: Read pre-process and unaligned reads analysis
2024-05-06 10:32:19: Alignment with LAST to reference transcriptome
2024-05-06 11:38:55: Processing transcriptome alignment file: maf
2024-05-06 12:07:48: Alignment with LAST to reference genome
2024-05-06 15:44:30: Processing genome alignment file: maf
2024-05-06 17:34:08: Parse the annotation file (GTF/GFF3)
warning: line 1 in file "training.gff3" does not begin with "##gff-version" or "##gvf-version", create "##gff-version 3" line automatically
gt gff3: error: line 1 in file "training.gff3" does not contain 9 tab (\t) separated fields
free(): double free detected in tcache 2
/bin/sh: line 1:  6535 Aborted                 gt /usr/local/bin/bequeath.lua transcript_id < training_added_intron_temp.gff3 > training_added_intron_final.gff3
2024-05-06 17:34:10: Modeling Intron Retention
2024-05-06 17:34:10: Reading intron coordinates from GFF file
2024-05-06 17:34:10: Read primary genome alignment for each read
Traceback (most recent call last):
  File "/usr/local/bin/read_analysis.py", line 751, in <module>
    main()
  File "/usr/local/bin/read_analysis.py", line 706, in main
    model_ir.intron_retention(prefix, prefix + "_added_intron_final.gff3", g_alnm, t_alnm)
  File "/usr/local/bin/model_intron_retention.py", line 74, in intron_retention
    g_alignments = pysam.AlignmentFile(g_alnm)
  File "pysam/libcalignmentfile.pyx", line 748, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 953, in pysam.libcalignmentfile.AlignmentFile._open
ValueError: file does not contain alignment data
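
In the LAST run, the `gt gff3` message says that line 1 of `training.gff3` does not contain 9 tab-separated fields. One possible explanation, which is only an assumption and not confirmed anywhere in this log, is that a gzip-compressed input (here the `.gtf.gz` annotation) was read as if it were plain text. A minimal sketch for checking this with the standard library follows; the helper names `is_gzipped` and `first_bad_annotation_line` are hypothetical and not part of NanoSim.

```python
import gzip


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"\x1f\x8b"


def first_bad_annotation_line(path):
    """Return (line_number, line) for the first non-comment line that does not
    have exactly 9 tab-separated fields, or None if every line looks fine."""
    opener = gzip.open if is_gzipped(path) else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, 1):
            line = line.rstrip("\n")
            if not line or line.startswith("#"):
                continue  # skip blank lines and GTF/GFF3 comment or pragma lines
            if len(line.split("\t")) != 9:
                return number, line
    return None
```

Running `is_gzipped` on the `-rg`, `-rt`, and `-annot` inputs, and `first_bad_annotation_line` on both the original annotation and the generated `training.gff3`, might show whether the malformed first line comes from the input itself or from how it was read. If the inputs are gzipped, decompressing them before running `read_analysis.py` would be an inexpensive thing to try.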

Could you please suggest how to fix it? Thank you in advance!