liutiming closed this issue 5 years ago.
Just FYI, I am not sure if the error was due to the edits I have made to the MAKEFILE (shown in the pull request)...
I think it might be a version issue. I will first try to create a virtual environment for python2.7 to re-install all the dependencies and update here.
Thanks!
Updates:
I have created the py27 virtual environment, installed the dependencies, run make with the MAKEFILE, and run the software with the following command:
python /mnt/c/np/software/RepeatHMM/bin/repeatHMM.py BAMinput --Onebamfile amp_combined_long.sorted.bam --repeatName FMR1 --hgfile /mnt/c/np/reference/grch38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --hg hg38
The same error still appeared: ('Error no information for 0\nError no information for 1\nError no information for 2\nError no information for 3\nError no information for 4\n', ['', '', '', '', '', '', ''], 'fmr1')
Hi @6timings, thank you for your interest in our tool.
It seems that you did not run the command from the RepeatHMM bin folder. In that case, you need to provide the full path of the predefined repeat patterns: "--Patternfile $RepeatHMM-bin-folder$/reference_sts/hg38/hg38.predefined.pa". Please replace $RepeatHMM-bin-folder$ with "/mnt/c/np/software/RepeatHMM/bin/" or with whichever directory contains "repeatHMM.py".
Feel free to let me know if you still have any issues.
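For anyone scripting this, here is a minimal sketch of how the full pattern-file path could be built so the command works from any working directory. The helper names `default_pattern_file` and `build_command` are hypothetical and not part of RepeatHMM; only the flags and the hg38 path layout come from this thread, and applying the same layout to other genome builds is an assumption.

```python
import os


def default_pattern_file(repeathmm_bin_dir, hg="hg38"):
    """Return the absolute path of the predefined pattern file.

    Assumes the layout <bin>/reference_sts/<hg>/<hg>.predefined.pa,
    which this thread confirms only for hg38.
    """
    bin_dir = os.path.abspath(repeathmm_bin_dir)
    return os.path.join(bin_dir, "reference_sts", hg, hg + ".predefined.pa")


def build_command(repeathmm_bin_dir, bam, hgfile, repeat_name, hg="hg38"):
    """Build an argument list that does not depend on the current directory."""
    script = os.path.join(os.path.abspath(repeathmm_bin_dir), "repeatHMM.py")
    return [
        "python", script, "BAMinput",
        "--Onebamfile", bam,
        "--repeatName", repeat_name,
        "--hgfile", hgfile,
        "--hg", hg,
        "--Patternfile", default_pattern_file(repeathmm_bin_dir, hg),
    ]


if __name__ == "__main__":
    cmd = build_command(
        "/mnt/c/np/software/RepeatHMM/bin",
        "amp_combined_long.sorted.bam",
        "/mnt/c/np/reference/grch38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna",
        "FMR1",
    )
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.call` or printed and pasted into a shell.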
Great, thanks! I tried it and the programme is running now. However, many "Warning unknow CIGAR element" messages are printed to the terminal. Is this normal, or is something else wrong? I am running the programme in the RepeatHMM-bin folder this time.
Hi @6timings, great to know the program is running.
However, that warning would significantly affect your results. I have updated the scripts to fix the issue. Please download the updated version of RepeatHMM. Thank you.
Thanks for the updates! The tool is running. However, when I run with --repeatName all, no FMR1 repeats are identified, even though I can identify them with --repeatName FMR1.
Also, there are many errors of the following type when I run with --repeatName all, even though I have indexed the bam and the genome file (with both samtools and bwa): [main_samview] region "X:148499606-148501692" specifies an unknown reference name. Continue anyway.
FYI, my command is:
python repeatHMM.py BAMinput --Onebamfile /mnt/c/np/amp/results/last/amp_combined_long.sorted.bam --repeatName all --hg hg38 --hgfile /mnt/c/np/reference/grch38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
Please let me know if any more info is needed.
Hi @6timings, it seems that the chromosome names are different: the bam is using "1", "2", ... "X" etc. for the chromosomes, while the default might be "chr1", "chr2" and so on. If that is not the cause of the error, I might need more information to find out what the issue is.
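A quick way to check for such a naming mismatch is to compare the contig names in the BAM header with those in the reference index. The sketch below is only an illustration: it parses the text printed by `samtools view -H` and the contents of a `.fai` file, and the function names are hypothetical helpers, not part of RepeatHMM or samtools.

```python
def bam_header_contigs(header_text):
    """Extract contig names (SN: fields) from `samtools view -H` output."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def fai_contigs(fai_text):
    """Extract contig names (first column) from a samtools .fai index."""
    return [ln.split("\t")[0] for ln in fai_text.splitlines() if ln.strip()]


def naming_style(names):
    """Classify contig names as 'chr-prefixed', 'no prefix', 'mixed' or 'unknown'."""
    if not names:
        return "unknown"
    prefixed = sum(1 for n in names if n.startswith("chr"))
    if prefixed == len(names):
        return "chr-prefixed"
    if prefixed == 0:
        return "no prefix"
    return "mixed"


def region_is_known(region, names):
    """Return True if the contig of a region such as 'chrX:1-100' is in names."""
    contig = region.rsplit(":", 1)[0]
    return contig in names


if __name__ == "__main__":
    header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422\n@SQ\tSN:chrX\tLN:156040895\n"
    contigs = bam_header_contigs(header)
    print(naming_style(contigs))
    print(region_is_known("X:148499606-148501692", contigs))
```

If the BAM and the reference report different styles, or a queried region names a contig that is absent from the BAM header, samtools prints the "specifies an unknown reference name" message seen above.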
Hi @liuqianhn, thanks for the reply. I used samtools view | head to inspect the bam file and found that the third column is actually chr1. Can I ask what info may be helpful for you, please?
Hi @6timings , could you please send me your log file to liuqianhn@gmail.com? Thank you.
Hi, I am facing the same issue. The error looks like this:
sh: 1: trf: not found
[E::bwa_idx_load_from_disk] fail to locate the index files
[main_samview] region "chrX:146992569-146994628" specifies an unknown reference name. Continue anyway.
[main_samview] region "X:146992569-146994628" specifies an unknown reference name. Continue anyway.
The following options are used (included default):
BWAMEMOptions ( -k8 -W8 -r7 );
CompRep (0);
MatchInfo ([3, -2, -2, -15, -1]);
MaxRep (4000);
MinSup (5);
Patternfile (None);
RepeatTime (5);
SeqTech (None);
SplitAndReAlign (1);
TRFOptions (2_7_4_80_10_100);
Tolerate_mismatch (None);
UserDefinedUniqID (None);
align (align/);
emissionm (None);
hg (hg19);
hgfile (/mnt/NGS/Human_Exome_hg19/hg19.fa);
hmm_del_rate (0.02);
hmm_insert_rate (0.12);
hmm_sub_rate (0.02);
isGapCorrection (1);
minRepBWTSize (70);
minTailSize (70);
outlog (2);
repeatFlankLength (30);
repeatName (FMR1);
specifiedRepeatInfo (///////);
stsBasedFolder (reference_sts/);
transitionm (None);
Onebamfile (Prajwal_Wagh_aligned.sorted.bam);
SepbamfileTemp (None);
align (align/);
analysis_file_id (_GapCorrection1_FlankLength30_SplitAndReAlign1_2_7_4_80_10_100_hg19_comp_I0.120_D0.020_S0.020);
bamfile (Prajwal_Wagh_aligned.sorted.bam);
unique_file_id (.gmm_GapCorrection1_FlankLength30_SplitAndReAlign1_2_7_4_80_10_100_hg19_comp_I0.120_D0.020_S0.020);
p2sp end---running time0 mem74
p2sp ['fmr1', 20.0, [0, 0], 'allocr:', 0, 15]
p2sp
FMR1 0 0;
I am using the following command:
python2 repeatHMM.py BAMinput --Onebamfile Prajwal_Wagh_aligned.sorted.bam --hg hg19 --hgfile /mnt/NGS/Human_Exome_hg19/hg19.fa --repeatName FMR1
I do not understand what the problem is. I need to run the program urgently for a project.
Hi @amrita1983, there might be two issues in your case: (1) trf might not be available, so you might need to install the TRF tool (Tandem Repeats Finder; see https://bioconda.github.io/recipes/trf/README.html), and (2) the reference genome "hg19.fa" and the BAM file might not be indexed properly with bwa and samtools. Please correct me if I am wrong.
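Both points can be checked before launching a run. The following is a rough sketch written for Python 3 (`shutil.which` is not available in Python 2.7, so run it outside the py27 environment). It assumes the classic `bwa index` output suffixes (.amb, .ann, .bwt, .pac, .sa), a `samtools faidx` index (.fai), and a `samtools index` file next to the BAM (.bai). The function name `missing_prerequisites` is a hypothetical helper, not part of RepeatHMM.

```python
import os
import shutil

# Files written next to the reference by `bwa index` (classic bwa).
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_prerequisites(reference_fasta, bam_file, trf_name="trf"):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    # The log line "sh: 1: trf: not found" means the shell cannot find trf on PATH.
    if shutil.which(trf_name) is None:
        problems.append("executable not found on PATH: " + trf_name)

    # bwa index files plus the samtools faidx index for the reference.
    for suffix in BWA_INDEX_SUFFIXES + (".fai",):
        path = reference_fasta + suffix
        if not os.path.exists(path):
            problems.append("missing reference index file: " + path)

    # samtools index usually writes <name>.bam.bai; some tools write <name>.bai.
    candidates = [bam_file + ".bai", os.path.splitext(bam_file)[0] + ".bai"]
    if not any(os.path.exists(p) for p in candidates):
        problems.append("missing BAM index for: " + bam_file)

    return problems


if __name__ == "__main__":
    issues = missing_prerequisites(
        "/mnt/NGS/Human_Exome_hg19/hg19.fa", "Prajwal_Wagh_aligned.sorted.bam"
    )
    for issue in issues:
        print(issue)
```

An empty result only means the expected files exist; it does not verify that the indexes were built from the same reference the reads were aligned to.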
Hello, that issue is resolved, thanks. Can you please tell me whether this tool is able to detect FMR1 repeats, which are actually long repeats? I have WGS data for a patient with FMR1 repeats. Using RepeatHMM, the count comes out as 30, which is quite unlikely for this patient.
Hi @amrita1983, RepeatHMM can detect FMR1 repeats, but it might rely on long reads that fully cover the FMR1 repeat. If no reads cover the flanking regions of the repeat, the long repeats might be missed (we will improve this part, but the improvement is not available yet). Sorry for missing your question.
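To illustrate what "fully cover" means here, this is a small sketch that counts alignments spanning a repeat plus a flank on each side. It is not RepeatHMM's actual logic. The function names and the coordinates in the example are hypothetical, and using 30 bp as the flank only mirrors the repeatFlankLength (30) default shown in the option dump above, which is an assumption about how that option is applied.

```python
def spans_repeat(read_start, read_end, repeat_start, repeat_end, flank=30):
    """True if an alignment covers the repeat and `flank` bases on both sides."""
    return read_start <= repeat_start - flank and read_end >= repeat_end + flank


def count_spanning_reads(alignments, repeat_start, repeat_end, flank=30):
    """Count (start, end) alignment intervals that span the repeat with flanks."""
    return sum(
        1
        for start, end in alignments
        if spans_repeat(start, end, repeat_start, repeat_end, flank)
    )


if __name__ == "__main__":
    # Hypothetical alignment intervals and repeat coordinates, for illustration only.
    alignments = [(900, 1200), (1000, 1090), (980, 1500), (950, 1100)]
    print(count_spanning_reads(alignments, repeat_start=1000, repeat_end=1090))
```

With short-read WGS data, few or no reads would pass such a check for a long expansion, which is consistent with an expanded allele going undetected.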
Hello, when using the following command:
/lustre/huangyf/software/Miniconda3/envs/repeathmmenv/bin/python "/lustre/huangyf/software/RepeatHMM/bin/repeatHMM.py" FASTQinput --fastq "/lustre/huangyf/ont.fastq.data/20190329-BNP0832-P4-A1.pass.fastq" --hgfile "/lustre/huangyf/genome/hg38/hg38.fa" --repeatName all --Patternfile "/lustre/huangyf/software/RepeatHMM/bin/reference_sts/hg38/hg38.predefined.pa"
I get the following error: ('Error no information for 0\nError no information for 1\nError no information for 2\nError no information for 3\nError no information for 4\n', ['', '', '', '', '', '', ''], 'all')
Could you help me?
@huangyuanf Since it is the same issue, I will reply to it at issue #40.
I have seen that issue, but my case is different. I installed RepeatHMM under "/lustre/huangyf/software/RepeatHMM-2.0.3/", the fastq file is at "/lustre/huangyf/ont.fastq.data/20190329-BNP0832-P4-A1.pass.fastq", the reference is at "/lustre/huangyf/genome/hg38/hg38.fa", and the Patternfile is located at "/lustre/huangyf/software/RepeatHMM-2.0.3/bin/reference_sts/hg38/hg38.predefined.pa". These are all full paths.
But when I entered this command in the terminal: python "/lustre/huangyf/software/RepeatHMM-2.0.3/bin/repeatHMM.py" FASTQinput --fastq "/lustre/huangyf/ont.fastq.data/20190329-BNP0832-P4-A1.pass.fastq" --hg hg38 --hgfile "/lustre/huangyf/genome/hg38/hg38.fa" --repeatName all --Patternfile "/lustre/huangyf/software/RepeatHMM-2.0.3/bin/reference_sts/hg38/hg38.predefined.pa"
I got this error: ('Error no information for 0\nError no information for 1\nError no information for 2\nError no information for 3\nError no information for 4\n', ['', '', '', '', '', '', ''], 'all')
------------------ Original message ------------------ From: "WGLab/RepeatHMM"; Sent: Saturday, April 10, 2021, 9:14 PM; Subject: Re: [WGLab/RepeatHMM] Error no information (#17)
@huangyuanf It seems that for FASTQinput, all is not supported. Could you please replace all after --repeatName with a specific name in /lustre/huangyf/software/RepeatHMM-2.0.3/bin/reference_sts/hg38/hg38.predefined.pa and see whether you will have the same error?
I don't know why. How should I do it? Please help, thanks!
I tried with a BAM file and had the same problem.
@huangyuanf Sorry for the late reply. I tried to reproduce the error, but I could not. Could you please show what is in the file /lustre/huangyf/software/RepeatHMM-2.0.3/bin/reference_sts/hg38/hg38.predefined.pa? Thanks.
After I keyed in the following command:
python2.7 /mnt/c/np/software/RepeatHMM/bin/repeatHMM.py BAMinput --Onebamfile amp_combined_long.sorted.bam --repeatName FMR1 --hgfile /mnt/c/np/reference/grch38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
I got the following error:
No pa file reference_sts//hg38/hg38.predefined.pa
The following options are used (included default):
BWAMEMOptions ( -k8 -W8 -r7 );
CompRep (0);
MatchInfo ([3, -2, -2, -15, -1]);
MaxRep (4000);
MinSup (5);
Patternfile (None);
RepeatTime (5);
SeqTech (None);
SplitAndReAlign (1);
TRFOptions (2_7_4_80_10_100);
Tolerate_mismatch (None);
UserDefinedUniqID (None);
align (align/);
emissionm (None);
hg (hg38);
hgfile (/mnt/c/np/reference/grch38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna);
hmm_del_rate (0.02);
hmm_insert_rate (0.12);
hmm_sub_rate (0.02);
isGapCorrection (1);
minRepBWTSize (70);
minTailSize (70);
outlog (2);
repeatFlankLength (30);
repeatName (FMR1);
specifiedRepeatInfo (///////);
stsBasedFolder (reference_sts/);
transitionm (None);
('Error no information for 0\nError no information for 1\nError no information for 2\nError no information for 3\nError no information for 4\n', ['', '', '', '', '', '', ''], 'fmr1')
Can I ask how I can resolve the error, please?