DaehwanKimLab / hisat2

Graph-based alignment (Hierarchical Graph FM index)
GNU General Public License v3.0

building standard hisat-3n indices with --base-change but not --repeat-index #403

Closed: connertraugot closed this issue 1 year ago

connertraugot commented 1 year ago

I am using hisat2/2.2.1-3n. When I try to build a standard (non-repeat) hisat-3n index, hisat-3n-build does not accept the --base-change argument. It accepts --base-change only if I also include --repeat-index. The documentation indicates that --base-change is required for building hisat-3n indices, so I am a bit confused about why this is the case.

Here is my line of code to build the index: hisat-3n-build --base-change C,T /PATH/reference.fa genome

And here is the error I am receiving: hisat2-build: unrecognized option '--base-change'

Any help here would be great.

Thanks! Conner

imzhangyun commented 1 year ago

Hello Conner,

I just tested hisat-3n-build --base-change C,T /PATH/reference.fa genome on my side and it works well.

Could you try it again? Make sure you use hisat-3n-build rather than hisat2-build.

Best, Leo

connertraugot commented 1 year ago

Thanks for the quick response Leo! I am running this on a remote computing cluster where I did not install hisat-3n, but it does appear to be version 2.2.1-3n. I ran again and had the same error. I have posted the whole error message below.

Here is the exact code I ran: hisat-3n-build --base-change C,T /PATH/Homo_sapiens.GRCh38.dna.primary_assembly.fa human.GRCh38.106

Here is the error message:

hisat2-build: unrecognized option '--base-change'
HISAT2 version 2.2.1-3n by Daehwan Kim (infphilo@gmail.com, http://www.ccb.jhu.edu/people/infphilo)
Usage: hisat2-build [options]*
    reference_in            comma-separated list of files with ref sequences
    hisat2_index_base       write ht2 data to files with this dir/basename
Options:
    -c                      reference sequences given on cmd line (as )
    --large-index           force generated index to be 'large', even if ref has fewer than 4 billion nucleotides
    -a/--noauto             disable automatic -p/--bmax/--dcv memory-fitting
    -p                      number of threads
    --bmax                  max bucket sz for blockwise suffix-array builder
    --bmaxdivn              max bucket sz as divisor of ref len (default: 4)
    --dcv                   diff-cover period for blockwise (default: 1024)
    --nodc                  disable diff-cover (algorithm becomes quadratic)
    -r/--noref              don't build .3/.4.ht2 (packed reference) portion
    -3/--justref            just build .3/.4.ht2 (packed reference) portion
    -o/--offrate            SA is sampled every 2^offRate BWT chars (default: 5)
    -t/--ftabchars          # of chars consumed in initial lookup (default: 10)
    --localoffrate          SA (local) is sampled every 2^offRate BWT chars (default: 3)
    --localftabchars        # of chars consumed in initial lookup in a local index (default: 6)
    --snp                   SNP file name
    --haplotype             haplotype file name
    --ss                    Splice site file name
    --exon                  Exon file name
    --repeat-ref            Repeat reference file name
    --repeat-info           Repeat information file name
    --repeat-snp            Repeat snp file name
    --repeat-haplotype      Repeat haplotype file name
    --seed                  seed for random number generator
    --3N                    build 3N index rather than standard hisat2 index
    --repeat-index-[,-]     automatically build repeat database and repeat index, enter the minimum-maximum repeat length pairs (default: 100-300)
    -q/--quiet              disable verbose output (for debugging)
    -h/--help               print detailed description of tool and its options
    --usage                 print this usage message
    --version               print version information and quit
Error: Encountered internal HISAT2 exception (#1)
Command: hisat2-build --wrapper basic-0 --base-change T,C /PATH/reference.fa genome --3N

imzhangyun commented 1 year ago

I guess this is a bug in an old hisat-3n version. Could you pull the newest hisat-3n code and try again?

connertraugot commented 1 year ago

After pulling the newest version I had no issues. Thanks for the help Leo!
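
For anyone landing here with the same error: the usage text in the error message lists --3N but not --base-change, which is consistent with an older build that predates the option. Below is a minimal sketch of a quick check before starting a long index build. It assumes the installed hisat-3n-build prints its usage on --help (the usage text above lists -h/--help) and that a newer build mentions --base-change there. advertises_option is a hypothetical helper name, not part of HISAT2.

```shell
#!/bin/sh
# Hypothetical helper (not part of HISAT2): succeeds if the usage text
# printed by command $1 mentions option $2 as a literal string.
advertises_option() {
    "$1" --help 2>&1 | grep -qF -e "$2"
}

# Only run the check when hisat-3n-build is actually on PATH.
if command -v hisat-3n-build >/dev/null 2>&1; then
    # Show which executable the shell resolves, useful on shared clusters.
    echo "using: $(command -v hisat-3n-build)"
    if advertises_option hisat-3n-build --base-change; then
        echo "--base-change is listed in the usage text"
    else
        echo "--base-change is not listed; this build may be too old"
    fi
fi
```

If the option is not listed, rebuilding from the current hisat-3n code, as suggested above, is the fix that worked in this thread.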