DaehwanKimLab / hisat2

Graph-based alignment (Hierarchical Graph FM index)
GNU General Public License v3.0

HISAT2 -p: producing bam files of different size #196

Open laurabuggiotti opened 5 years ago

laurabuggiotti commented 5 years ago

Hi,

I have been happily using hisat2 to map several RNA-seq datasets as follows:

/hisat2 -x ${reference}/ARS-UCD1.2 -1 ${input}/${sample}_1.fq.gz -2 ${input}/${sample}_2.fq.gz -p 10 -t --known-splicesite-infile ${reference}/ARS-UCD1.2_splicesites.txt --met 1 --dta | samtools view -@10 -bS | samtools sort -@ 10 -o ${out}/$sample\.bam

However, a few samples produced very small BAM files, so I decided to rerun them without multithreading (-p 1). The fastq.gz files were around 3.5 GB, and I got a BAM file of 2.8 GB with -p 1, 2.6 GB with -p 2 and 1.7 GB with -p 10. I got worried and tried additional samples, and they all behaved the same way, producing bigger BAM files with -p 1. I checked the counts and, unsurprisingly, they differ as well. I don't know what to do now: should I rerun all samples with -p 1? Is there something wrong in the script? Is there an explanation for this kind of behaviour?

I really hope you can shed some light on this, as at the moment I don't see any! Thanks for your time and support; I hope to hear from you soon. Cheers, Laura
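File size alone also depends on compression, so one way to confirm that the runs really differ is to compare record counts directly. The following is a minimal sketch, assuming samtools is on the PATH; the helper name bam_counts and the BAM file names in the usage comments are hypothetical and not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print total and mapped record counts for one BAM file.
bam_counts() {
    local bam="$1"
    local total mapped
    total=$(samtools view -c "$bam")        # count all records in the file
    mapped=$(samtools view -c -F 4 "$bam")  # count records without the unmapped flag (0x4)
    printf '%s\ttotal=%s\tmapped=%s\n' "$bam" "$total" "$mapped"
}

# Usage (file names are assumptions, one BAM per thread setting):
# bam_counts "${out}/${sample}_p1.bam"
# bam_counts "${out}/${sample}_p10.bam"
```

If the totals differ between the -p 1 and -p 10 runs, the difference is in the alignments themselves and not only in how the files were compressed.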

parkchanhee commented 5 years ago

Hi, Laura

Could you run hisat2 with the --no-temp-splicesite option? This option may affect the result when running hisat2 in multithreaded mode.

Thank you, Chanhee
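A rerun along these lines could look like the sketch below. It reuses the command from the original post with --no-temp-splicesite added and the thread count turned into a parameter. The wrapper name run_hisat2 and the output file naming are hypothetical, and it assumes hisat2 and samtools are on the PATH and that reference, input, out and sample are set as in the original script.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: same options as the original command, plus
# --no-temp-splicesite, with the thread count passed as the first argument.
run_hisat2() {
    local threads="$1"
    hisat2 -x "${reference}/ARS-UCD1.2" \
        -1 "${input}/${sample}_1.fq.gz" -2 "${input}/${sample}_2.fq.gz" \
        -p "$threads" -t --met 1 --dta --no-temp-splicesite \
        --known-splicesite-infile "${reference}/ARS-UCD1.2_splicesites.txt" \
        | samtools view -@ "$threads" -b - \
        | samtools sort -@ "$threads" -o "${out}/${sample}_p${threads}_notemp.bam" -
}

# Usage: run the same sample single-threaded and with 10 threads, then
# compare the two resulting BAM files.
# run_hisat2 1
# run_hisat2 10
```

Writing each run to its own output file keeps the results for different thread counts side by side for comparison.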

laurabuggiotti commented 5 years ago

Thanks for your answer! Our HPC is not working at the moment. I will try as soon as it is restored and will let you know. Thanks