shunliubio / eTAM-seq_workflow

A workflow for eTAM-seq data processing.
GNU General Public License v3.0

May I ask what parameter we should use to build index for hisat-3n #2

Closed: LoneKnightz closed this issue 1 year ago

LoneKnightz commented 1 year ago

Hi Shun,

I have a small question about the hisat-3n index.

Script 2 asks for the path to the hisat-3n index, but building the index is not mentioned in the previous script.

Which parameters should we use for hisat-3n-build? Should we just use hisat-3n-build -p $ncpus $genome_fa $hisat3nindex, or are additional parameters such as --base-change needed?

Thanks

shunliubio commented 1 year ago

Hi,

I always build the hisat-3n index according to its manual. Here is one example:

hisat-3n-build -p 24 --base-change A,G --repeat-index --ss gencode.v27.ss --exon gencode.v27.exon genome.fa GRCh38_tran
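The example above takes gencode.v27.ss and gencode.v27.exon as inputs without showing where they come from. A minimal sketch of one way to produce them follows, assuming the hisat2_extract_splice_sites.py and hisat2_extract_exons.py helper scripts shipped with HISAT2/HISAT-3N are on the PATH. The GTF file name and the `extract`, `run`, and `build_index` helpers are illustrative assumptions, not part of the workflow. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
set -eu

# Placeholder file names; adjust to your own data.
gtf="gencode.v27.annotation.gtf"   # assumed annotation file name
genome_fa="genome.fa"
prefix="gencode.v27"
index="GRCh38_tran"
ncpus=24

# Run an extraction script on the GTF and write its stdout to a file.
# $1: extraction script, $2: output file
extract() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$1 $gtf > $2"
    else
        "$1" "$gtf" > "$2"
    fi
}

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Same options as the example command above.
build_index() {
    run hisat-3n-build -p "$ncpus" --base-change A,G --repeat-index \
        --ss "$prefix.ss" --exon "$prefix.exon" "$genome_fa" "$index"
}

extract hisat2_extract_splice_sites.py "$prefix.ss"
extract hisat2_extract_exons.py "$prefix.exon"
build_index
```

Keeping the dry run as the default makes it easy to check paths and options before starting a long index build.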

LoneKnightz commented 1 year ago

Thanks!

llecompte commented 1 year ago

Thank you, Shun. In your paper, which rRNA reference sequences did you use, and how did you generate the .ss and .exon files for the rRNAs? For example, for a command like this:

hisat-3n-build -p 24 --base-change A,G --repeat-index --ss rRNA.ss --exon rRNA.exon rRNA.fa rRNA_tran

Best, Lolita

shunliubio commented 1 year ago

Hi,

The splice site information is not needed for the rRNA case. So the options --ss and --exon are not used in the rRNA index building.
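A sketch of what the rRNA index build might then look like: the command from the question with --ss and --exon dropped. Whether --repeat-index was kept for the rRNA index is not stated in this thread, so that option, the file names, and the `build_rrna_index` helper are assumptions. The script prints the command by default; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
set -eu

# Placeholder names taken from the question above.
rrna_fa="rRNA.fa"
rrna_index="rRNA_tran"
ncpus=24

build_rrna_index() {
    # No --ss/--exon: rRNA references need no splice site information.
    # --repeat-index is carried over from the question (assumption).
    set -- hisat-3n-build -p "$ncpus" --base-change A,G --repeat-index \
        "$rrna_fa" "$rrna_index"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_rrna_index
```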

llecompte commented 1 year ago

Hi, thank you. Which mouse rRNA sequences did you use in the paper? Did you take each rRNA sequence annotated in the GTF, or did you use a single reference such as GenBank BK000964.3?

Lolita

shunliubio commented 1 year ago

Hi, I used mouse rRNA sequences from NCBI. Please see the accessions below:

28S: NR_003279.1
18S: NR_003278.3
5.8S: NR_003280.2
5S: NR_030686.1
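One possible way to collect these four accessions into a single FASTA file is the NCBI E-utilities EFetch endpoint. This is a sketch only: the thread does not say how the sequences were downloaded, and the output file name rRNA.fa and the `efetch_url` helper are assumptions. The script prints the curl command by default; set DRY_RUN=0 to download.

```shell
#!/bin/sh
set -eu

# Accessions listed above: 28S, 18S, 5.8S, 5S.
accessions="NR_003279.1,NR_003278.3,NR_003280.2,NR_030686.1"
out_fa="rRNA.fa"   # assumed output file name

# Build the EFetch URL for a comma-separated list of nuccore accessions.
efetch_url() {
    echo "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=$1&rettype=fasta&retmode=text"
}

url=$(efetch_url "$accessions")

if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "curl -fsS '$url' -o $out_fa"
else
    # -f: fail on HTTP errors, -sS: quiet but still report errors
    curl -fsS "$url" -o "$out_fa"
fi
```

The resulting FASTA file can then serve as the input to an rRNA index build.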