Open mujiezhang opened 3 years ago
Do you have mpirun, ffindex_apply_mpi and hhconsensus in your PATH? Check output of these commands:
which mpirun
which ffindex_apply_mpi
which hhconsensus
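The same check can be scripted so that every missing tool is reported at once. This is only a minimal sketch: the function name `check_tools` is made up for illustration, and it relies on nothing beyond the standard `command -v` lookup.

```shell
# Report which of the given tools can be found in PATH.
# Prints one line per tool and returns non-zero if any tool is missing.
# (check_tools is a hypothetical helper, not part of hhsuite.)
check_tools() {
    ct_missing=0
    for ct_tool in "$@"; do
        if ct_path=$(command -v "$ct_tool"); then
            printf 'found    %s -> %s\n' "$ct_tool" "$ct_path"
        else
            printf 'MISSING  %s\n' "$ct_tool"
            ct_missing=1
        fi
    done
    return "$ct_missing"
}
```

Usage would be `check_tools mpirun ffindex_apply_mpi hhconsensus`. For a tool that is found, the printed path is the full path that can be used in place of the bare name.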
I installed mpirun just now, but I cannot find ffindex_apply_mpi. I installed hhsuite through conda. Does this problem sometimes occur when hhsuite is installed through conda? How can I get ffindex_apply_mpi?
You should use the full path to the ffindex_apply_mpi binary. I don't know where that might be in conda...
By the way, when you are using the -np 1 option in mpirun, consider skipping MPI altogether and just run:
ffindex_apply 227_msa.ff{data,index} -i 227_a3m_wo_ss.ffindex -d 227_a3m_wo_ss.ffdata -- hhconsensus -M 50 -maxres 65535 -i stdin -oa3m stdout -v 0
It does the same thing.
Oh, thank you very much! You are so nice! The problem is solved. But I have another small question. I have several groups of proteins, and I want to find out whether one group is similar to another. My plan is to build a local hhsuite database from these protein groups and then run hhsearch with each protein group, one by one, against that database. Is that right?
Suppose you have files: querydb_hhm.ffdata querydb_hhm.ffindex dbToBeSearched_hhm.ffdata dbToBeSearched_hhm.ffindex
You can run:
ffindex_apply querydb_hhm.ffdata querydb_hhm.ffindex -i mappings.ffindex -d mappings.ffdata -- hhsearch -i stdin -o stdout -d dbToBeSearched
This will generate a third database, "mappings", containing the results.
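To look at the result for a single query afterwards, one entry can be cut out of the data file using the index. This is a rough sketch under an assumption about the ffindex layout: each index line is taken to be `name<TAB>offset<TAB>length`, with the length counting a trailing NUL byte. The function name `ffindex_entry` is made up for illustration.

```shell
# Print one entry of an ffindex database to stdout.
# ASSUMPTION: each index line is "name<TAB>offset<TAB>length" and the
# length includes the NUL byte that terminates the entry.
# $1 = .ffdata file, $2 = .ffindex file, $3 = entry name
ffindex_entry() {
    ffe_offset=$(awk -F '\t' -v name="$3" '$1 == name { print $2; exit }' "$2")
    ffe_length=$(awk -F '\t' -v name="$3" '$1 == name { print $3; exit }' "$2")
    if [ -z "$ffe_offset" ]; then
        echo "entry not found: $3" >&2
        return 1
    fi
    # An entry of length 1 holds only the terminator, so print nothing.
    [ "$ffe_length" -gt 1 ] || return 0
    # tail -c +N starts at byte N (1-based); head drops the trailing NUL.
    tail -c +"$((ffe_offset + 1))" "$1" | head -c "$((ffe_length - 1))"
}
```

For example, `ffindex_entry mappings.ffdata mappings.ffindex 97.txt_muscle.msa` would print the hhsearch report stored under that name, provided the entry names in "mappings" match those of the query database.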
Sorry for my ignorance. I ran the command
ffindex_apply 227_msa.ff{data,index} -i 227_a3m_wo_ss.ffindex -d 227_a3m_wo_ss.ffdata -- hhconsensus -M 50 -maxres 65535 -i stdin -oa3m stdout -v 0
and it ran without errors. Then I ran
ffindex_apply 227_a3m_wo_ss.ff{data,index} -i 227_a3m.ffindex -d 27_a3m.ffdata -- addss.pl -v 0 stdin stdout
and it also ran without errors. But when I tried to generate the hhm files with
ffindex_apply 227_a3m.ff{data,index} -i 227_hhm.ffindex -d 227_hhm.ffdata -- hhmake -i stdin -o stdout -v 0
I got lots of errors like:
97.txt_muscle.msa 224 1 286 4
20:28:58.692 ERROR: Error in /opt/conda/conda-bld/hhsuite_1598863433284/work/src/hhfunc.cpp:16: ReadQueryFile:
20:28:58.692 ERROR: stdin is empty!
98.txt_muscle.msa 225 1 256 4
20:28:58.983 ERROR: Error in /opt/conda/conda-bld/hhsuite_1598863433284/work/src/hhfunc.cpp:16: ReadQueryFile:
20:28:58.983 ERROR: stdin is empty!
So I did not get the hhm.ffdata and hhm.ffindex files.
Maybe something is wrong with the 227_a3m.ff{data,index} files? Look into the 227_a3m.ffdata file and check whether it contains anything. Another test is to run the command without the -v 0 option and see on which database entry it crashes.
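One quick way to check whether the data file contains anything useful is to count the bytes that are not NUL. A minimal sketch using only standard tools; both function names are made up for illustration.

```shell
# Count the bytes in a file that are not NUL.
# (count_non_nul_bytes and has_real_content are hypothetical helpers.)
count_non_nul_bytes() {
    tr -d '\000' < "$1" | wc -c | tr -d '[:space:]'
}

# Succeed only if the file holds at least one non-NUL byte.
has_real_content() {
    [ "$(count_non_nul_bytes "$1")" -gt 0 ]
}
```

For instance, `has_real_content 227_a3m.ffdata || echo "227_a3m.ffdata holds no data"` reports a data file that consists of nothing but NUL bytes.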
I have checked the 227_a3m.ffdata file, and it looks wrong: it contains a long run of ‘^@’ characters (‘^@^@^@^@^@^@^@^@...’). But when I ran the command ‘ffindex_apply 227_a3m_wo_ss.ff{data,index} -i 227_a3m.ffindex -d 27_a3m.ffdata -- addss.pl -v 0 stdin stdout’, no error was reported.
run addss.pl without -v 0
I ran ‘ffindex_apply 227_a3m_wo_ss.ff{data,index} -i 227_a3m.ffindex -d 227_a3m.ffdata -- addss.pl stdin stdout’ and ‘ffindex_apply 227_a3m_wo_ss.ff{data,index} -i 227_a3m.ffindex -d 227_a3m.ffdata -- addss.pl’, and they produced the same result as ‘ffindex_apply 227_a3m_wo_ss.ff{data,index} -i 227_a3m.ffindex -d 227_a3m.ffdata -- addss.pl -v 0 stdin stdout’. Normally the a3m.ffdata file is larger than the msa.ffdata file, but 227_a3m.ffdata is only 227 bytes. I do not know what is wrong with it.
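A 227-byte data file for 227 alignments would be consistent with every entry holding a single terminating byte and nothing else. That can be checked from the index, assuming the ffindex layout is `name<TAB>offset<TAB>length` with the length counting the trailing NUL. The sketch below lists such entries; `list_empty_entries` is a made-up name.

```shell
# List the names of entries whose recorded length is at most 1 byte.
# ASSUMPTION: each index line is "name<TAB>offset<TAB>length" and the
# length includes the terminating NUL, so length 1 means "no content".
# $1 = .ffindex file
list_empty_entries() {
    awk -F '\t' '($3 + 0) <= 1 { print $1 }' "$1"
}
```

Running `list_empty_entries 227_a3m.ffindex | wc -l` would then show how many of the entries came out empty.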
Did you set up the paths in the addss.pl script? It requires paths to psipred, as far as I recall...
Maybe I can try installing hhsuite from source. Anyway, thanks a lot for being so patient with me. Thanks again!
That won't solve your problem: you have to configure psipred anyway. hhsuite uses it, and it is an external tool that has to be connected to hhsuite.
Oh! But I do not know how to configure psipred. Should I download it through conda?
First of all, you can easily skip ss (secondary structure) prediction and go with the a3m without ss. According to the hhsuite documentation, the sensitivity increase is small unless you are going to tune the parameters more deeply.
If you want to go for ss prediction anyway, you should install psipred or compile it from source, and edit HHPaths.pm in the hhsuite scripts subdirectory so that it works with your local psipred installation.
Thank you very much! Your advice is very useful. I am trying it now.
Another stupid question: how do I skip ss prediction?
To skip ss, just rename the a3m no_ss files to a3m and that's all; then build the hmm profiles on them.
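With the file names used earlier in this thread, that step might look like the sketch below. It copies rather than renames, so the files without ss are kept; the function name `use_a3m_without_ss` is made up for illustration.

```shell
# Reuse the a3m database without secondary structure as the a3m database.
# Copies instead of renaming so that the *_a3m_wo_ss files are preserved.
# (use_a3m_without_ss is a hypothetical helper.)
# $1 = database prefix, e.g. 227
use_a3m_without_ss() {
    for uaw_ext in ffdata ffindex; do
        cp "${1}_a3m_wo_ss.${uaw_ext}" "${1}_a3m.${uaw_ext}" || return 1
    done
}

# Afterwards the profiles can be built with the hhmake command quoted
# earlier in the thread:
#   ffindex_apply 227_a3m.ff{data,index} -i 227_hhm.ffindex -d 227_hhm.ffdata -- hhmake -i stdin -o stdout -v 0
```

Called as `use_a3m_without_ss 227`, this produces 227_a3m.ffdata and 227_a3m.ffindex from the hhconsensus output, so addss.pl and psipred are not needed at all.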
Thanks for your useful advice! I got the final results, and I have two more questions. The result file looks like this:
Query         lcl|MH719189.1_prot_AYD80303.1_44 [locus_tag=Fc02_44] [protein=virion structural protein] [protein_id=AYD80303.1] [location=29314..30270] [gbkey=CDS]
Match_columns 318
No_of_seqs    1 out of 4
Neff          1
Searched_HMMs 227
Date          Tue May 25 09:46:24 2021
Command       hhsearch -i stdin -o stdout -d 227 -cov 50 -qid 90

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 lcl|MH719189.1_prot_AYD80303.1 100.0  2E-196  9E-199 1295.5   0.0  318    1-318      1-318 (318)
  2 lcl|NC_020198.1_prot_YP_007392  13.5    0.47  0.0021   24.6   0.0   25     1-25      30-54 (74)
  3 lcl|JQ067085.2_prot_AII21881.1  10.9    0.64  0.0028   24.0   0.0   12  199-210      16-27 (80)
  4 lcl|JX495042.1_prot_AFR52235.1   7.7       1  0.0045   23.9   0.0   18  274-291      69-86 (102)
  5 lcl|KM233689.1_prot_AIM40317.1   2.9     3.5   0.015   20.2   0.0   39     6-44      12-61 (87)
  6 lcl|NC_005882.1_prot_YP_024699   2.5     4.2   0.019   20.5   0.0   11  200-210      72-82 (110)
  7 lcl|NC_020198.1_prot_YP_007392   1.7     6.5   0.029   17.9   0.0   13     7-19      51-63 (67)
  8 lcl|NC_000929.1_prot_NP_050615   1.1      11   0.047   19.1   0.0   16    37-52     91-106 (176)
  9 lcl|NC_028766.1_prot_YP_009196   1.1      11   0.049   18.8   0.0   13  125-137    142-154 (158)
 10 lcl|NC_005882.1_prot_YP_024720   1.0      12   0.051   20.7   0.0   20  195-214      51-70 (401)

The database is made of 227 msa files, and I want to know whether one msa is similar to another. But this result only tells me which sequence is similar to which other sequence. How should I interpret this result?
Assuming you are running hhsearch through ffindex_apply, you get hhsearch mappings from every profile (hmm/a3m) in your query database to the profiles in the target database. You pasted a fragment of the result showing how MH719189.1_prot_AYD80303.1_44 compares to the entries in the target database. As you can see, it is similar to itself and barely similar to the remaining objects in the target database. HHsearch reports at least 10 hits even if they do not meet the reliability thresholds (and more hits if it finds more similar objects in the database).
What exactly do you want to do? Compare each sequence with every other? You can assume that the remaining 217 sequences in the target database are not similar to the query.
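Because the low-probability hits are always listed, it can help to filter the reports by the Prob column before reading them. Below is a rough sketch that parses the hit table by whitespace. It assumes that hit names contain no spaces (true for the names in this thread, not guaranteed in general); the function name `hits_above` and any threshold passed to it are arbitrary choices for illustration, not hhsuite recommendations.

```shell
# Read hhsearch report(s) on stdin and print "query<TAB>hit<TAB>prob"
# for every hit whose probability is at least the given threshold.
# ASSUMPTION: hit names contain no whitespace, so Prob is the 3rd field.
# (hits_above is a hypothetical helper.)
# $1 = minimum probability, e.g. 90
hits_above() {
    awk -v min="$1" '
        $1 == "Query"               { query = $2; in_table = 0; next }
        $1 == "No" && $2 == "Hit"   { in_table = 1; next }   # table header
        in_table && NF == 0         { in_table = 0; next }   # end of table
        in_table && $1 ~ /^[0-9]+$/ && ($3 + 0) >= (min + 0) {
            print query "\t" $2 "\t" $3
        }
    '
}
```

On a results database this could be run as `tr '\000' '\n' < mappings.ffdata | hits_above 90`, where the `tr` step turns the NUL separators between entries into newlines. Note that each query also appears as a hit to itself, and that hit names in the table may be truncated relative to the query name.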
I have 227 clusters of proteins. What I want to do is determine which protein clusters are similar to one another. So far, I have aligned every protein cluster and used the alignments to build the hhsearch database, following your instructions and the online documentation. Then, to compare the 227 clusters against themselves, I ran the command ‘ffindex_apply 227_hhm.ffdata 227_hhm.ffindex -i mappings.ffindex -d mappings.ffdata -- hhsearch -i stdin -o stdout -d 227’
That gave me the result file mappings.ffdata, which contains the hhsearch results. But, as you can see from mappings.ffdata, I cannot interpret the result clearly. Does the query represent the cluster it belongs to? For example, if query sequence A belongs to cluster 1 and has a very good hit to sequence B, which belongs to cluster 2, can I say that cluster 1 is similar to cluster 2?
So your interpretation is that MH719189.1_prot_AYD80303.1_44 doesn't cluster with any other msa in the database.
Maybe I should show you another picture. As you can see in the picture, the protein lcl | NC_019455.1_prot_YP_007002910.1_2, which belongs to protein cluster A, has two significant hits with prob > 90: one is lcl | NC_018274.1_prot_YP_006560, which belongs to protein cluster B, and the other is lcl | NC_005882.1_prot_YP_024689, which belongs to protein cluster C. So I know for certain that lcl | NC_019455.1_prot_YP_007002910.1_2 is similar to the two hits. What I am not sure about is whether cluster A is similar to clusters B and C. Can the query sequence represent the cluster it belongs to?
Didn't get any picture. Anyway - we can switch to regular e-mails with the discussion since the hhsuite problem was solved. Feel free to catch me on kamil dot steczkiewicz at gmail.com.
śr., 26 maj 2021 o 10:06 mujiezhang @.***> napisał(a):
Maybe I should show another picture to you. Now as you can see in the picture, The protein lcl | NC_019455.1_prot_YP_007002910.1_2 belonging to protein cluster A have two significant hit with prob>90, one is lcl | NC_018274.1_prot_YP_006560 belonging to protein cluster B and another is lcl | NC_005882.1_prot_YP_024689 belonging to protein cluster C. So I certainly know the lcl | NC_019455.1_prot_YP_007002910.1_2 is similar to the two hit. But what I am not sure is that whether cluster A are similar to cluster B and C. Can the query sequence represent the cluster it belongs to? 发送自 Windows 10 版邮件应用
发件人: Kamil 发送时间: 2021年5月26日 15:54 收件人: soedinglab/hh-suite 抄送: mujiezhang; Author 主题: Re: [soedinglab/hh-suite] issue about building local databse (#268)
So your interpretation is that MH719189.1_prot_AYD80303.1_44 doesn't cluster with any other msa in the database.
On Wed, 26 May 2021 at 09:52, mujiezhang @.***> wrote:
I have 227 clusters of proteins. What I want to do is determine which protein clusters are similar to one another. So far, I have made an alignment of each protein cluster and used the alignments to build the hhsearch database, as you told me before and as the online documentation describes. Then, to compare the 227 clusters against themselves, I ran the command 'ffindex_apply 227_hhm.ffdata 227_hhm.ffindex -i mappings.ffindex -d mappings.ffdata -- hhsearch -i stdin -o stdout -d 227'.
This gave me the result file mappings.ffdata, which contains the hhsearch results. But as you can see in mappings.ffdata, I cannot interpret the result clearly. Does the query represent the cluster it belongs to? For example, if query sequence A belongs to cluster 1 and has a very good hit to sequence B, which belongs to cluster 2, can I say that cluster 1 is similar to cluster 2?
From: Kamil Sent: 26 May 2021, 15:30 To: soedinglab/hh-suite Cc: mujiezhang; Author Subject: Re: [soedinglab/hh-suite] issue about building local databse (#268)
Assuming you are running hhsearch through ffindex_apply, you get hhsearch mappings from every profile (hmm/a3m) in your query database to the profiles in the target database. You pasted a fragment of the result showing how MH719189.1_prot_AYD80303.1_44 compares to entries in the target database. As you can see, it is similar to itself and barely similar to the other entries. HHsearch reports a minimum of 10 hits even if they don't meet the reliability thresholds (and more hits if it finds more similar objects in the database).
What exactly do you want to do? Compare each sequence with each? You can assume that the remaining 217 sequences in the target database are not similar to the query.
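To read such a report programmatically, the summary hit table can be filtered by the Prob column. The following is only a minimal Python sketch: the regular expression, the function name `significant_hits`, and the 90% cutoff are illustrative assumptions and not part of HH-suite. Note that hit names in the table are truncated (the query ends in `_44`, its self-hit does not), so names may need to be matched by prefix.

```python
import re

# One row of the hhsearch summary hit table, for example:
#   2 lcl|NC_020198.1_prot_YP_007392  13.5    0.47  0.0021   24.6   0.0   25      1-25     30-54 (74)
# This pattern is an assumption based on the output pasted in this thread.
HIT_LINE = re.compile(
    r"^\s*(?P<rank>\d+)\s+(?P<hit>.+?)\s+"
    r"(?P<prob>\d+\.\d+)\s+(?P<evalue>\S+)\s+(?P<pvalue>\S+)\s+"
    r"(?P<score>-?\d+\.\d+)\s+(?P<ss>-?\d+\.\d+)\s+(?P<cols>\d+)\s+"
    r"\d+-\d+\s+\d+-\d+\s*\(\d+\)\s*$"
)


def significant_hits(report_text, min_prob=90.0):
    """Return (hit_name, prob, evalue) for table rows with Prob >= min_prob."""
    hits = []
    for line in report_text.splitlines():
        match = HIT_LINE.match(line)
        if match is None:
            continue  # header fields, blank lines, anything that is not a table row
        prob = float(match.group("prob"))
        if prob >= min_prob:
            hits.append((match.group("hit"), prob, float(match.group("evalue"))))
    return hits


# Two rows copied from the report quoted in this thread.
sample = """\
 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 lcl|MH719189.1_prot_AYD80303.1 100.0  2E-196  9E-199 1295.5   0.0  318     1-318     1-318 (318)
  2 lcl|NC_020198.1_prot_YP_007392  13.5    0.47  0.0021   24.6   0.0   25      1-25     30-54 (74)
"""

if __name__ == "__main__":
    for name, prob, evalue in significant_hits(sample):
        print(name, prob, evalue)
```

With the 90% cutoff only the self-hit passes for this query, which matches the reading that this profile is not similar to any other entry in the database.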
On Tue, 25 May 2021 at 04:13, mujiezhang @.***> wrote:
Thanks for your useful advice! I got the final results, and I have two more questions. The result file looks like this:

Query         lcl|MH719189.1_prot_AYD80303.1_44 [locus_tag=Fc02_44] [protein=virion structural protein] [protein_id=AYD80303.1] [location=29314..30270] [gbkey=CDS]
Match_columns 318
No_of_seqs    1 out of 4
Neff          1
Searched_HMMs 227
Date          Tue May 25 09:46:24 2021
Command       hhsearch -i stdin -o stdout -d 227 -cov 50 -qid 90

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 lcl|MH719189.1_prot_AYD80303.1 100.0  2E-196  9E-199 1295.5   0.0  318     1-318     1-318 (318)
  2 lcl|NC_020198.1_prot_YP_007392  13.5    0.47  0.0021   24.6   0.0   25      1-25     30-54 (74)
  3 lcl|JQ067085.2_prot_AII21881.1  10.9    0.64  0.0028   24.0   0.0   12   199-210     16-27 (80)
  4 lcl|JX495042.1_prot_AFR52235.1   7.7       1  0.0045   23.9   0.0   18   274-291     69-86 (102)
  5 lcl|KM233689.1_prot_AIM40317.1   2.9     3.5   0.015   20.2   0.0   39      6-44     12-61 (87)
  6 lcl|NC_005882.1_prot_YP_024699   2.5     4.2   0.019   20.5   0.0   11   200-210     72-82 (110)
  7 lcl|NC_020198.1_prot_YP_007392   1.7     6.5   0.029   17.9   0.0   13      7-19     51-63 (67)
  8 lcl|NC_000929.1_prot_NP_050615   1.1      11   0.047   19.1   0.0   16     37-52    91-106 (176)
  9 lcl|NC_028766.1_prot_YP_009196   1.1      11   0.049   18.8   0.0   13   125-137   142-154 (158)
 10 lcl|NC_005882.1_prot_YP_024720   1.0      12   0.051   20.7   0.0   20   195-214     51-70 (401)

The database is made of 227 msa files, and I want to know whether one msa is similar to another. But this result only tells me which sequence is similar to which other sequence. How should I interpret this result?
From: Kamil Sent: 24 May 2021, 22:11 To: soedinglab/hh-suite Cc: mujiezhang; Author Subject: Re: [soedinglab/hh-suite] issue about building local databse (#268)
To skip ss, just rename the a3m no_ss files to a3m and that's all; then build the hmm profiles on them.
It seems that the file is missing. Is it in the directory from which you're running the script? Are you running it locally on the same machine? Why is there an error from mpirun? How exactly did you run this?
On Wed, 6 Apr 2022 at 11:22, chao @.***> wrote:
When I enter the following command:
ffindex_apply cluster1091_a3m_wo_ss.ff{data,index} -i cluster1091_a3m.ffindex -d cluster1091_a3m.ffdata -- addss.pl stdin stdout
I get this error:
/big/martin/hh-suite/lib/ffindex/src/ffindex_apply_mpi.c:341 ffindex_apply: cluster1091_a3m_wo_ss.ffdata: No such file or directory
How should I solve it? Thank you.
Hi ksteczk, when I run hhsearch I hit a new issue: "could not open file 'msa/HHM/Allterl_hmm_cs219.ffdata', In /big/martin/hh-suite/src/ffindexdatabase.cpp:11: FFindexDatabase:". First, I built the database from all the hmm files with ffindex_build and got the files Allterl_hmm.ffdata and Allterl_hmm.ffindex. Then I queried a single hmm file against Allterl_hmm.ffindex with hhsearch, and got this error. Can you help me figure it out? Much appreciated!
I also encountered this problem, did you solve it?
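A note on the "could not open file ... _cs219.ffdata" error: an HH-suite database is normally a set of three ffindex pairs sharing one basename (`_a3m`, `_hhm` and `_cs219`), and the `-d` option takes that basename. The error suggests the `_cs219` pair was never built, or that `-d` was given a name that already includes `_hmm`. The Python sketch below only checks which of the expected files exist; the function name `missing_db_files` and the basename `msa/HHM/Allterl` are hypothetical, chosen for illustration.

```python
from pathlib import Path

# File-name pieces that make up one HH-suite database with a shared basename.
SUFFIXES = ("_a3m", "_hhm", "_cs219")
EXTENSIONS = (".ffdata", ".ffindex")


def missing_db_files(basename):
    """List the expected database files that do not exist for `basename`."""
    return [
        f"{basename}{suffix}{ext}"
        for suffix in SUFFIXES
        for ext in EXTENSIONS
        if not Path(f"{basename}{suffix}{ext}").is_file()
    ]


if __name__ == "__main__":
    # Hypothetical basename; replace it with the value passed to hhsearch -d.
    for path in missing_db_files("msa/HHM/Allterl"):
        print("missing:", path)
```

If the `_cs219` files are reported as missing, they still need to be generated before hhsearch can open the database.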
I am building the database from MSAs. First, I placed all of them in a single folder that does not contain any other files, to create a single FFindex database; this generated two files, 227_msa.ffdata and 227_msa.ffindex. Then I used the command 'OMP_NUM_THREADS=1 mpirun -np 1 ffindex_apply_mpi 227_msa.ff{data,index} -i 227_a3m_wo_ss.ffindex -d 227_a3m_wo_ss.ffdata -- hhconsensus -M 50 -maxres 65535 -i stdin -oa3m stdout -v 0' and got an error like this:
'mpirun was unable to find the specified executable file, and therefore did not launch the job. This error was first reported for process rank 0; it may have occurred for other processes as well.
NOTE: A common cause for this error is misspelling a mpirun command line parameter option (remember that mpirun interprets the first unrecognized command line token as the executable).
Node: localhost
Executable: ffindex_apply_mpi'
So I wonder how to solve this problem.
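The mpirun message means that `ffindex_apply_mpi` could not be found on the PATH. The check suggested in the replies (`which mpirun`, `which ffindex_apply_mpi`, `which hhconsensus`) can also be scripted. This is a small Python sketch using the standard library's `shutil.which`; the helper name `missing_executables` is made up for illustration.

```python
import shutil


def missing_executables(names):
    """Return the names that shutil.which cannot resolve on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # The three tools the original command depends on.
    tools = ["mpirun", "ffindex_apply_mpi", "hhconsensus"]
    for name in tools:
        print(f"{name}: {shutil.which(name) or 'NOT FOUND'}")
```

If `ffindex_apply_mpi` is reported as not found, either pass its full path to mpirun or, for a run with `-np 1`, use the non-MPI `ffindex_apply` as suggested in the replies.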