@woodoo46 You can download a TRF bed file from the UCSC Genome Browser and use it as input to RepeatHMM. An example command is below:
nohup python RepeatHMM/bin/repeatHMM.py Scan --SplitAndReAlign 1 --MinSup 3 --UserDefinedUniqID WGSscan --SeqTech "Nanopore" "--Patternfile" trf.bed --cluster 1 --envset repeathmmenv --Onebamfile hx1_bam/hx1_nanopore_all_data_0926.minimap2.sorted.bam --hgfile GRCh38/GRCh38.fa --thread 50 > log/hx1.scan.test.log &
If you are not running on a cluster, please replace --cluster 1 with --cluster 0.
It would also help to remove the repeats that fall in failed regions from your TRF bed file, to avoid some complicated regions (I will upload the file later).
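Until that file is available, the filtering step could be sketched roughly as follows. This is only an illustration, not part of RepeatHMM: it assumes both files are tab-separated with chrom, start, end in the first three columns (standard BED, half-open coordinates), and the function names `read_bed` and `filter_bed` and the file names in the usage comment are hypothetical.

```python
import sys


def read_bed(lines):
    """Parse BED-like lines into (chrom, start, end, original_line) tuples."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and common BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        records.append((fields[0], int(fields[1]), int(fields[2]), line))
    return records


def filter_bed(trf_lines, failed_lines):
    """Return the TRF lines that do not overlap any failed region."""
    failed = {}
    for chrom, start, end, _ in read_bed(failed_lines):
        failed.setdefault(chrom, []).append((start, end))

    kept = []
    for chrom, start, end, line in read_bed(trf_lines):
        # Half-open intervals overlap when each starts before the other ends.
        overlaps = any(
            start < f_end and f_start < end
            for f_start, f_end in failed.get(chrom, ())
        )
        if not overlaps:
            kept.append(line)  # keep the original line, extra columns included
    return kept


if __name__ == "__main__":
    # Hypothetical usage:
    #   python filter_trf.py trf.bed failed_regions.bed > trf.filtered.bed
    with open(sys.argv[1]) as trf, open(sys.argv[2]) as failed:
        for kept_line in filter_bed(trf, failed):
            print(kept_line)
```

The linear scan over failed regions is fine for a short list; for a large one, a dedicated interval tool such as bedtools would be a better fit.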
One more question, does the aligner matter? Can I use ngmlr alignment for the input?
@woodoo46 Your input BAM file can be generated by ngmlr. I have not evaluated how the aligner affects the results, but the effect should not be significant. You are welcome to share your findings if you try different aligners.
Hi there,
I would like to run RepeatHMM Scan across the whole genome. I suppose I need to create a file like "hg38.predefined.pa"? If so, can you share the one you used in your 2020 paper?
Thanks.
George