Closed sme229 closed 1 month ago
Hi @sme229 Is it possible that a coordinate in your reference gtf file refers to a location that does not exist in your reference sequences?
The size of the mitochondrial genome NC_001913.1 is 17,245 bp.
Are there any end coordinates for NC_001913.1 in your GTF file that extend further than this?
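One way to answer this question for every sequence at once is to compare each feature's coordinates against the length of its sequence in the reference FASTA. The following is a minimal sketch, not part of wf-single-cell: the function names `fasta_lengths` and `out_of_range_features` and the toy data are made up for illustration, and it assumes a plain uncompressed FASTA and a tab-separated GTF/GFF with 1-based start and end in columns 4 and 5.

```python
import io
from typing import Dict, Iterable, List, Tuple


def fasta_lengths(fasta_lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length}; the ID is the first word of each header."""
    lengths: Dict[str, int] = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else None
            if current is not None:
                lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def out_of_range_features(
    gtf_lines: Iterable[str], lengths: Dict[str, int]
) -> List[Tuple[str, int, int, str]]:
    """List features on unknown sequences or with coordinates outside 1..length."""
    problems = []
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a feature line
        seqname, start, end = fields[0], int(fields[3]), int(fields[4])
        if seqname not in lengths:
            problems.append((seqname, start, end, "sequence not in FASTA"))
        elif start < 1 or start > end or end > lengths[seqname]:
            problems.append((seqname, start, end, f"outside 1..{lengths[seqname]}"))
    return problems


# Toy data (hypothetical): one 10 bp sequence, one valid and one overhanging feature.
toy_fasta = io.StringIO(">toy1 made-up description\nACGTACGTAC\n")
toy_gtf = io.StringIO(
    'toy1\tdemo\texon\t1\t10\t.\t+\t.\tgene_id "g1";\n'
    'toy1\tdemo\texon\t8\t12\t.\t+\t.\tgene_id "g2";\n'
)
for problem in out_of_range_features(toy_gtf, fasta_lengths(toy_fasta)):
    print(problem)
```

To run it on real files, pass open file handles instead of the toy data, for example the reference FASTA and the GTF from the failing work directory. An empty result would suggest the problem lies elsewhere than in out-of-range coordinates.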
Hi @nrhorner Thanks for your response. No, there are no coordinates that extend beyond 17,245 in my gtf file:
@nrhorner I removed the NC_001913.1 entry from the GTF file to see if that helps.
Update: same type of error.
gffread -g ref_genome.fa -w "transcriptome.fa" "stringtie.gff"
Command exit status: 1
Command output: (empty)
Command error:
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 3 reads
GffObj::getSpliced() error: improper genomic coordinate 61536 on NW_026260071.1 for NW_026260071.1.stringtie.1.1
Hi @nrhorner Let me provide more information please. When I get this error:
GffObj::getSpliced() error: improper genomic coordinate 37565 on NW_026260069.1 for NW_026260069.1.stringtie.1.1
I can see that there is a duplication in the resulting stringtie.gff file:
I'm not sure if that's how it should be. Update: I can see that they are not exactly the same, sorry.
Hi @sme229
Sorry for the late reply. Would it be possible to share the work directory with me?
/datasets/work/hb-rabbit-gbc/work/Maria_Jenkel_ONT_Data_04-07-2024/work/f2/e1eeb8d9fe3e7f883fe255830c3c27
I can send a link for you to drop that into.
Hi @nrhorner
Sure, happy to do that. I have since re-run the pipeline but got the same error:
Command error:
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 1448 reads
GffObj::getSpliced() error: improper genomic coordinate 37565 on NW_026260069.1 for NW_026260069.1.stringtie.1.1
Work dir: /datasets/work/hb-rabbit-gbc/work/Maria_Jenkel_ONT_Data_04-07-2024/work/ed/563d7d63a056ec563b717e48060582
So the directory has a different name.
Hi @sme229
I need an email address in order to grant access to a shared folder. Do you have a LinkedIn profile or somewhere else I can message you to get your email address?
Hi @nrhorner Sure, here is my email address Elena.Smertina@csiro.au
Operating System
Other Linux (please specify below)
Other Linux
Linux 5.14.21-150400.24.28-default
Workflow Version
wf-single-cell v2.1.0-ga25ec6c
Workflow Execution
Command line (Cluster)
Other workflow execution
No response
EPI2ME Version
No response
CLI command run
nextflow run epi2me-labs/wf-single-cell --expected_cells 3000 -profile singularity --fastq 'Elena-cDNA-2/20240624_1312_1D_PAW58377_76308a2c/fastq_pass' --kit_name '3prime' --kit_version 'v3' --ref_genome_dir 'OryCun_genome'
Workflow Execution - CLI Execution Profile
singularity
What happened?
I get this error after ~5 h of execution:
Caused by:
Process pipeline:process_bams:stringtie (1) terminated with an error exit status (1)
Command executed:
# Add chromosome label (-l) to generated transcripts so we don't get name collisions during file merge later
samtools view -h align.bam NC_001913.1 | tee >( stringtie -L -c 2 -p 8 -G chr.gtf -l "NC_001913.1.stringtie" -o "stringtie.gff" - ) | samtools fastq | bgzip --threads 2 -c > reads.fastq.gz
# Get transcriptome sequence
gffread -g ref_genome.fa -w "transcriptome.fa" "stringtie.gff"
Command exit status: 1
Command output: (empty)
Command error:
[M::bam2fq_mainloop] discarded 0 singletons
[M::bam2fq_mainloop] processed 333717 reads
GffObj::getSpliced() error: improper genomic coordinate 1026 on NC_001913.1 for NC_001913.1.stringtie.1.1
I generated the reference and GTF file with 10x as suggested. I then renamed the headers in the reference file to keep only the IDs, with no description. I'm not sure why it says 'improper genomic coordinate 1026 on NC_001913.1 for NC_001913.1.stringtie.1.1', or how I could fix this.
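For reference, the header renaming described here can be sketched as below, together with a check that every sequence name used in the GTF still has a matching FASTA ID afterwards. This is only an illustration under stated assumptions: the helper names (`strip_header_descriptions`, `fasta_ids`, `gtf_seqnames`) and the example header description are invented, and it is not necessarily how the headers were actually renamed.

```python
from typing import Iterable, Iterator, Set


def strip_header_descriptions(fasta_lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines with each header cut down to its first word (the ID)."""
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split()
            yield ">" + (parts[0] if parts else "") + "\n"
        else:
            yield line + "\n"


def fasta_ids(fasta_lines: Iterable[str]) -> Set[str]:
    """Collect the IDs (first word after '>') of all FASTA records."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                ids.add(parts[0])
    return ids


def gtf_seqnames(gtf_lines: Iterable[str]) -> Set[str]:
    """Collect the first column of every non-comment, non-empty GTF line."""
    return {
        line.split("\t", 1)[0]
        for line in gtf_lines
        if line.strip() and not line.startswith("#")
    }


# Toy example: the description after the ID is made up for illustration.
original = [">NC_001913.1 example description\n", "ACGT\n"]
renamed = list(strip_header_descriptions(original))
gtf = ['NC_001913.1\tdemo\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n']

# Sequence names present in the GTF but absent from the renamed FASTA.
missing = gtf_seqnames(gtf) - fasta_ids(renamed)
print(renamed[0].strip(), sorted(missing))
```

If `missing` is non-empty on the real files, the GTF refers to sequences that the renamed reference does not contain, which would be worth ruling out before looking at the coordinates themselves.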
Relevant log output
Application activity log entry
No response
Were you able to successfully run the latest version of the workflow with the demo data?
yes
Other demo data information
No response