Mirror1211 closed 1 year ago
Yes, StringTie works the same way on single-end data, with no change in its parameters. Other programs in your analysis pipeline may of course need the usual parameter changes to specify single-end reads, as you did with -U for hisat2.
Paired-end reads are recommended for several reasons. For example, they improve the accuracy of the alignments, and they give StringTie additional structural hints during transcript assembly, which improves the quality of the assembled transfrags.
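For a large batch that mixes both layouts, a minimal sketch along these lines might help. It only illustrates the point above: the hisat2 read arguments differ, while the stringtie call stays the same. The function names `hisat2_read_args` and `quantify_sample` are hypothetical, and the file naming (`<sample>_1.fq.gz`/`<sample>_2.fq.gz` for paired-end, `<sample>.fq.gz` for single-end) is assumed from the commands in this thread.

```shell
# Hypothetical helper: print the hisat2 read arguments for one sample,
# based on which FASTQ files exist in the given directory.
# Assumed naming: <sample>_1.fq.gz + <sample>_2.fq.gz (paired-end)
#                 <sample>.fq.gz                      (single-end)
hisat2_read_args() {
    if [ -f "$1/${2}_1.fq.gz" ] && [ -f "$1/${2}_2.fq.gz" ]; then
        printf '%s %s %s %s\n' -1 "$1/${2}_1.fq.gz" -2 "$1/${2}_2.fq.gz"
    elif [ -f "$1/${2}.fq.gz" ]; then
        printf '%s %s\n' -U "$1/${2}.fq.gz"
    else
        echo "no reads found for sample $2 in $1" >&2
        return 1
    fi
}

# Hypothetical wrapper: align and quantify one sample.
# The body runs in a subshell so its variables do not leak to the caller.
quantify_sample() (
    dir=$1
    sample=$2
    read_args=$(hisat2_read_args "$dir" "$sample") || exit 1

    # $read_args is left unquoted on purpose so it splits into separate
    # arguments; this assumes file paths without spaces.
    hisat2 --dta -p 20 -x reference_hisat2 $read_args \
        | samtools sort -o "${sample}.sorted.bam" -

    # Same stringtie options for paired-end and single-end input.
    stringtie -p 20 -e -A "${sample}_gene_abund.tab" \
        -C "${sample}_gene_abund.gtf" -G reference.gtf \
        -o "${sample}.gtf" "${sample}.sorted.bam"
)
```

Usage would then be something like `quantify_sample /path/to/fastq sample1` for each sample, regardless of its layout.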
Hi,
Recently, I tried to use StringTie for gene-level quantification of 2000+ transcriptome datasets, including both paired-end and single-end reads. I used HISAT2 to map the reads onto the reference genomes and generate the BAM files. The command lines are as follows:
For paired-end reads:
hisat2 --dta -p 20 -x reference_hisat2 -1 sample1_1.fq.gz -2 sample1_2.fq.gz|samtools sort - > sample1_pair_end.sorted.bam
stringtie -p 20 -e -A sample1_gene_abund.tab -C sample1_gene_abund.gtf -G reference.gtf -o sample1.gtf sample1_pair_end.sorted.bam
For single-end reads:
hisat2 --dta -p 20 -x reference_hisat2 -U sample2.fq.gz|samtools sort - > sample2_single_end.sorted.bam
stringtie -p 20 -e -A sample2_gene_abund.tab -C sample2_gene_abund.gtf -G reference.gtf -o sample2.gtf sample2_single_end.sorted.bam
Because the tutorials on GitHub and other websites mainly focus on BAM files or paired-end data, I am not sure whether this pipeline is appropriate for single-end data. If not, how should I quantify single-end reads using HISAT2 + StringTie?
Sincerely