gpertea / stringtie

Transcript assembly and quantification for RNA-Seq
MIT License

input file format #408

Open dvirdi01 opened 9 months ago

dvirdi01 commented 9 months ago
  1. I have a bunch of Nanopore long-read FASTQ files that I need to assemble using StringTie. I aligned them and converted them to BAM files using minimap2 and samtools. Here is what I ran:

         minimap2 -t {threads} -a -x map-ont {input.genome} {input.fastq} | \
         samtools sort -o {output} -@ {threads} -

So the input files I have now are in sample.sorted.bam format. The StringTie documentation says that for long-read data, "the -L option must be used when the input alignment file contains (sorted) spliced alignments of long read RNA-seq or cDNA reads. Such alignments can be produced by minimap2 with the **-ax splice** option, which also generates the necessary ts tag to indicate the transcription strand."

However, I did not use the -ax splice option when creating the BAM files. Can I still use the -L option, or do I have to realign with -ax splice? My input files do not contain spliced alignments yet, so how would I use StringTie with plain long-read FASTQ files?
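One way to see what an existing BAM actually contains is to count the records that carry an `N` operation in the CIGAR string (a skipped region, i.e. an intron) and the records that carry the `ts` tag mentioned in the documentation. The helper below is only a sketch: `count_spliced` is a made-up name, and it assumes standard tab-separated SAM text on stdin.

```shell
# Count SAM records with a spliced CIGAR (contains N) and with a ts:A tag.
# Reads SAM text from stdin; header lines starting with @ are ignored.
count_spliced() {
    awk -F'\t' '
        /^@/            { next }        # skip SAM header lines
                        { total++ }     # every alignment record
        $6 ~ /N/        { spliced++ }   # column 6 is the CIGAR string
        /\tts:A:[+-]/   { tagged++ }    # transcript-strand tag
        END { printf "total=%d spliced=%d ts_tag=%d\n", total, spliced, tagged }
    '
}
```

It could be used as `samtools view sample.sorted.bam | count_spliced`. If both the `spliced` and `ts_tag` counts come back as zero, the file most likely does not hold the spliced alignments that the -L description refers to.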

Also, I have many sample files that I need to run StringTie on. If I use Snakemake and give it all the input files, would it create one final assembled GTF file, or one GTF file for each input file (the output I want)?
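Outside of Snakemake, the one-GTF-per-sample layout can be sketched as a plain loop, since each StringTie invocation on a single BAM writes a single GTF to the path given with -o. The `bams/` directory and the `stringtie_cmd` helper are hypothetical names for illustration, and the loop only prints the commands rather than running them.

```shell
# Print the StringTie command for one sorted BAM.
# The output GTF is named after the sample; -L selects long-read mode.
stringtie_cmd() {
    local bam="$1"
    local sample
    sample="$(basename "$bam" .sorted.bam)"
    printf 'stringtie -L -o %s.gtf %s\n' "$sample" "$bam"
}

# Dry run over every sorted BAM in a (hypothetical) bams/ directory.
for bam in bams/*.sorted.bam; do
    [ -e "$bam" ] || continue   # skip when the glob matched nothing
    stringtie_cmd "$bam"
done
```

Piping the printed lines to `bash` would execute them one sample at a time. A Snakemake rule that uses a `{sample}` wildcard in both its input and output should behave the same way, running once per sample.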

  2. From my understanding (I am not sure), running this command to convert my raw Nanopore FASTQ files to BAM files produces raw BAM files:

         minimap2 -t {threads} -a -x map-ont {input.genome} {input.fastq} | \
         samtools sort -o {output} -@ {threads} -

     I then ran:

         stringtie -L -o data/.../sample/sample.gtf data/.../.../sample.sorted.bam

     Correct me if I am wrong, but will this do a de novo raw long-read assembly (i.e. not an alternatively spliced assembly, even though I used -L without the minimap2 splice option)?

And if I had originally run minimap2 using this command:

         minimap2 -t {threads} -a -x splice {input.genome} {input.fastq} | \
         samtools sort -o {output} -@ {threads} -

it would create alternatively spliced alignment BAM files. So when I run StringTie on these files using the command:

         stringtie -L -o data/.../sample/sample.gtf data/.../.../sample.sorted.bam

it will do an alternatively spliced long-read assembly.

Is my understanding correct? I am just trying to figure out how the minimap2 command will change my end result. Any help would be appreciated.
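To make the comparison concrete, the two alignment variants differ only in the preset passed to minimap2. The snippet below is a dry-run sketch that prints both pipelines side by side. `align_cmd`, `genome.fa`, `reads.fastq` and the default thread count are placeholders, not names from the actual workflow.

```shell
# Print the alignment pipeline for a given minimap2 preset (dry run only).
# genome.fa, reads.fastq and the output name are placeholder values.
align_cmd() {
    local preset="$1"
    local threads="${2:-4}"
    printf 'minimap2 -t %s -ax %s genome.fa reads.fastq | samtools sort -@ %s -o sample.%s.sorted.bam -\n' \
        "$threads" "$preset" "$threads" "$preset"
}

align_cmd map-ont   # genomic preset used in the commands above
align_cmd splice    # spliced preset named in the StringTie documentation
```

Writing each preset to its own output file keeps both BAMs available, so the resulting StringTie assemblies can be compared directly.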