WGLab / RepeatHMM

a hidden Markov model to infer simple repeats from genome sequences

Input of UserDefinedRepeat #23

Closed by zthornton96 5 years ago

zthornton96 commented 5 years ago

I am trying to use RepeatHMM to look at repeats in c9orf72. This is the command I am using, but I receive the error "None gene/repeat information is given". I have looked at previously closed issues, but I can't seem to resolve it with the advice given there.

python repeatHMM.py BAMinput --Onebamfile ~/90499/pass/hg38.bam --hgfile ~/hg38/seq/hg38.fa --UserDefinedRepeat c9orf72,chr9,27544546,27575866,GGGGCC,+,,

liuqianhn commented 5 years ago

Hi @zthornton96, the value of "--UserDefinedRepeat" should be slash-separated rather than comma-separated: "c9orf72/chr9/27544546/27575866/GGGGCC/+//".

zthornton96 commented 5 years ago

Even with these amendments, I am still getting the same error.

python repeatHMM.py BAMinput --Onebamfile ~/90499/pass/hg38.bam --hgfile ~/hg38/seq/hg38.fa --UserDefinedRepeat c9orf72/chr9/27544546/27575866/GGGGCC/+//

liuqianhn commented 5 years ago

Hi @zthornton96 , could you please share the log file? Thank you.

zthornton96 commented 5 years ago

Hi @liuqianhn, I hope this is what you need (I'm very new to bioinformatics).

repeatHMM.txt

liuqianhn commented 5 years ago

Hi @zthornton96, please use the command below:

python repeatHMM.py BAMinput --Onebamfile /shared/bioinformatics_core1/Shared/MSC_STUDENT/zathornton1/NanoSatellite/90499/pass/hg38.bam --hgfile /usr/local/community/bcbio-nextgen/2017-08/data/genomes/Hsapiens/hg38/seq/hg38.fa --repeatName c9orf72 --UserDefinedRepeat chr9/27573486/27573542/GGGGCC/-10//

The repeat name should be given separately with --repeatName. Also, --UserDefinedRepeat needs the precise location of the repeat rather than the genomic location of the gene (your original commands used the gene's coordinates, which is not correct, so you would get nothing). Finally, the GGGGCC repeat is on the reverse strand, so '-' should be used rather than '+'. If you want to use '+', the pattern is the reverse complement, 'GGCCCC'.
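For anyone checking these two points on their own locus, here is a minimal Python sketch. It is not part of RepeatHMM; the helper name `reverse_complement` is only illustrative. It derives the forward-strand form of a reverse-strand repeat unit, and compares the size of the gene interval from the original command with the repeat interval from the corrected one, using the coordinates quoted in this thread.

```python
# Illustrative helper, not a RepeatHMM function.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Repeat unit as given for the '-' strand in this thread.
minus_unit = "GGGGCC"
plus_unit = reverse_complement(minus_unit)
print(f"'-' strand unit: {minus_unit}  ->  '+' strand unit: {plus_unit}")

# Interval sizes (end - start) from the two commands in this thread.
gene_span = 27575866 - 27544546    # interval in the original command
repeat_span = 27573542 - 27573486  # interval in the corrected command
print(f"gene interval:   {gene_span} bp")
print(f"repeat interval: {repeat_span} bp")
```

The first interval covers tens of kilobases while the second covers only a few dozen bases, which is a quick way to tell whether gene coordinates have been passed where repeat coordinates were expected.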

zthornton96 commented 5 years ago

That's worked great. Thanks very much for your help @liuqianhn