Open hrp1000 opened 2 years ago
Sorry - forgot to attach the .a3m file! Here are the first couple of lines:
ss_dssp CCHHHHHHHHHHHHHHHHHHHTCCSTHHHHHHHHHHHHHHHHHSCTTCCCCTTHHHHHHHHHHHHHHHHHHHHHTTCHHHHHHHHHHHHHHHHTSTTTSPPPTTHHHHHHHHHHHHHHHTCHHHHHHHHHHSCHHHHTSHHHHHHHHHHHHHHHTCHHHHHHHTTSCSSGGGHHHHHHHHHHHHHHHHHHHHHHCSEEEHHHHHHHHTCSSHHHHHHHHHHHCTTCEEETTEEECCCCSSSSSSSCSCHHHHHHHHHHHHHHHHHHC
If I build a DSSP file with the current mkdssp (which corrects a long-standing bug in secondary-structure assignment by assigning polyPro helices "correctly" - see https://github.com/PDB-REDO/dssp), I get a series of "Ignoring invalid symbol" warnings from hhalignment, followed by an error, and hhmake exits without producing a .hhm file.
Really, I have two questions:
(1) Will this ever be fixed? If not, I can go back to an earlier version of dssp/mkdssp (which one is suitable?)
(2) If I just add a case for "P" to the function "inline char ss2i" in hhutil-inl.h so that it returns 3, and then recompile the suite, would that fix the problem?
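As an alternative to patching ss2i and recompiling, the same effect might be approximated by rewriting the .a3m before hhmake reads it. The sketch below is only an illustration under stated assumptions: remap_polypro is a hypothetical helper (not part of HH-suite or addss.pl), and replacing "P" with "C" assumes that treating polyPro as coil is acceptable, which is the presumed intent of "return 3" in question (2). It touches only the >ss_dssp record, so proline residues in the real sequences are left alone.

```python
def remap_polypro(a3m_text: str, replacement: str = "C") -> str:
    """Replace 'P' in the >ss_dssp record only (hypothetical helper).

    Assumption: the secondary-structure record starts with a header line
    beginning '>ss_dssp' and runs until the next '>' header.
    """
    out = []
    in_ss_dssp = False
    for line in a3m_text.splitlines():
        if line.startswith(">"):
            # A new record begins; remember whether it is the DSSP SS record.
            in_ss_dssp = line.startswith(">ss_dssp")
        elif in_ss_dssp:
            # Only SS states are rewritten, never amino-acid sequences.
            line = line.replace("P", replacement)
        out.append(line)
    return "\n".join(out) + "\n"


# Tiny in-memory example (made-up records, not the attached file).
example = ">ss_dssp\nCCHPPPTT\n>sa_dssp\nABCDEFGH\n>query\nMPPPKLAA\n"
patched = remap_polypro(example)
```

Whether coil is the right stand-in for polyPro is a modelling decision; the replacement character is a parameter so a different state can be tried.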
Expected Behavior
no error ;-) - should produce a valid .hhm file with >ss_dssp and >sa_dssp lines (NOTE - I have modified addss.pl to process the regexp that corresponds to entries like "c7qvgT_". DSSP files that have no polyPro don't produce an error, and do produce a usable .hhm file).
Current Behavior
Does not produce a .hhm file if mkdssp has assigned polyPro secondary structure.
setenv I c7qvgT_
hhmake -i hmm/${I}.a3m -o hmm/${I}.hhm
11:07:38.681 INFO: hmm/c7qvgT_.a3m is in A2M, A3M or FASTA format
11:07:38.681 WARNING: Ignoring invalid symbol 'P' at pos. 99 in line 2 of hmm/c7qvgT_.a3m
11:07:38.681 WARNING: Ignoring invalid symbol 'P' at pos. 100 in line 2 of hmm/c7qvgT_.a3m
11:07:38.681 WARNING: Ignoring invalid symbol 'P' at pos. 101 in line 2 of hmm/c7qvgT_.a3m
11:07:38.686 ERROR: - 11:07:38.686 ERROR: Error in /bmm/soft/linux64/src/hh-suite-src/hh-suite/src/hhalignment.cpp:1244: Compress:
11:07:38.686 ERROR: sequences in hmm/c7qvgT_.a3m do not all have the same number of columns,
11:07:38.686 ERROR: e.g. first sequence and sequence sa_dssp.
11:07:38.686 ERROR: Check input format for '-M a2m' option and consider using '-M first' or '-M 50'
Steps to Reproduce (for bugs)
setenv I c7qvgT_
hhmake -i hmm/${I}.a3m -o hmm/${I}.hhm
See the attached .a3m file - but you only need to look at line 2 to confirm that it has "P" at the positions indicated in the warnings.
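To check this without reading the line by eye, something like the following could list the offending positions. It is a sketch only: invalid_symbol_positions is a hypothetical helper, the "allowed" alphabet is an assumption about which secondary-structure letters are accepted (not taken from the HH-suite source), and positions are counted from 1 on the assumption that this matches the numbering in the warnings.

```python
# Assumed set of accepted secondary-structure symbols (an illustration,
# not copied from hhutil-inl.h).
ASSUMED_SS_ALPHABET = frozenset("HEGIBTSC-")


def invalid_symbol_positions(ss_line: str, allowed=ASSUMED_SS_ALPHABET):
    """Return (position, symbol) pairs for symbols outside `allowed`.

    Positions are 1-based; whether hhmake counts the same way is an assumption.
    """
    return [
        (pos, sym)
        for pos, sym in enumerate(ss_line.strip(), start=1)
        if sym not in allowed
    ]


# Made-up example line, not the attached file.
hits = invalid_symbol_positions("CCHPPPTT")
```

Running it on line 2 of the attached file should flag the same "P" columns that the warnings report, if the numbering assumption holds.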
HH-suite Output (for bugs)
(identical to the output shown under Current Behavior above)
Context
I'm maintaining the homology modelling software Phyre2, which is still very popular despite the development of AlphaFold - we currently process >1000 sequences a day.
Your Environment
Probably not very relevant, but:
more /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
more /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 106
model name : Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz
stepping : 6
microcode : 0xd000311
cpu MHz : 800.048
cache size : 24576 KB
physical id : 0
siblings : 32
core id : 0
cpu cores : 16
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 27
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 invpcid_single intel_pt ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq md_clear pconfig spec_ctrl intel_stibp flush_l1d arch_capabilities
bogomips : 5800.00
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 57 bits virtual
power management:
. . . for another 63 cores...