soedinglab / hh-suite

Remote protein homology detection suite.
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3019-7
GNU General Public License v3.0

WARNING in hhblits: maximum number of 65535 sequences exceeded #114

Open ghost opened 5 years ago

ghost commented 5 years ago

Hi,

I am trying to build a custom database. I am getting stuck at this step of the manual (pg 16):

mpirun -np ffindex_apply_mpi _fas.ff{data,index} -i _a3m_wo_ss.ffindex -d _a3m_wo_ss.ffdata --hhblits -d <path_to/uniprot20> -i stdin -oa3m stdout -n 2 -cpu 1 -v 0

The error message below comes from a test with 2 sequences:

WARNING: Number of hits passing 2nd prefilter (reduced from 44698 to allowed maximum of 20000).
You can increase the allowed maximum using the -maxfilt option.

WARNING in hhblits: maximum number of 65535 sequences exceeded while reading A0A0D8HES9.a3m. Skipping all following sequences of this MSA
WARNING: Number of hits passing 2nd prefilter (reduced from 107869 to allowed maximum of 20000).
You can increase the allowed maximum using the -maxfilt option.

2 696 515 441978 0
WARNING in hhblits: maximum number of 65535 sequences exceeded while reading A0A0P9JPC5.a3m. Skipping all following sequences of this MSA
1 0 696 884161 0

When this step gives this error, the rest of the steps cannot be completed. There isn't anything in the manual that explains the error. I have tried adjusting the filter options -id [0,100] and -diff [0,inf] with several different values, and still no luck.

See below for the sequences used. The names referenced in the error do not match the sequence names, and I am not sure why.

A1A5C7
MAIDRRREAAGGGPGRQPAPAEENGSLPPGDAAASAPLGGRAGPGGGAEIQPLPPLHPGGGPHPSCCSAAAAPSLLLLDYDGSVLPFLGGLGGGYQKTLVLLTWIPALFIGFSQFSDSFLLDQPNFWCRGAGKGTELAGVTTTGRGGDMGNWTSLPTTPFATAPWEAAGNRSNSSGADGGDTPPLPSPPDKGDNASNCDCRAWDYGIRAGLVQNVVSKWDLVCDNAWKVHIAKFSLLVGLIFGYLITGCIADWVGRRPVLLFSIIFILIFGLTVALSVNVTMFSTLRFFEGFCLAGIILTLYALRIELCPPGKRFMITMVASFVAMAGQFLMPGLAALCRDWQVLQALIICPFLLMLLYWSIFPESLRWLMATQQFESAKRLILHFTQKNRMNPEGDIKGVIPELEKELSRRPKKVCIVKVVGTRNLWKNIVVLCVNSLTGYGIHHCFARSMMGHEVKVPLLENFYADYYTTASIALVSCLAMCVVVRFLGRRGGLLLFMILTALASLLQLGLLNLIGKYSQHPDSGMSDSVKDKFSIAFSIVGMFASHAVGSLSVFFCAEITPTVIRCGGLGLVLASAGFGMLTAPIIELHNQKGYFLHHIIFACCTLICIICILLLPESRDQNLPENISNGEHYTRQPLLPHKKGEQPLLLTNAELKDYSGLHDAAAAGDTLPEGATANGMKAM

A6NFX1
MAAPPAPAAKGSPQPEPHAPEPGPGSAKRGREDSRAGRLSFCTKVCYGIGGVPNQIASSATAFYLQLFLLDIAQIPAAQVSLVLFGGKVSGAAADPVAGFFINRSQRTGSGRLMPWVLGCTPFIALAYFFLWFLPPFTSLRGLWYTTFYCLFQALATFFQVPYTALTMLLTPCPRERDSATAYRMTVEMAGTLMGATVHGLIVSGAHRPHRCEATATPGPVTVSPNAAHLYCIAAAVVVVTYPVCISLLCLGVKERPDPSAPASGPGLSFLAGLSLTTRHPPYLKLVISFLFISAAVQVEQSYLVLFCTHASQLHDHVQGLVLTVLVSAVLSTPLWEWVLQRFGKKTSAFGIFAMVPFAILLAAVPTAPVAYVVAFVSGVSIAVSLLLPWSMLPDVVDDFQLQHRHGPGLETIFYSSYVFFTKLSGACALGISTLSLEFSGYKAGVCKQAEEVVVTLKVLIGAVPTCMILAGLCILMVGSTPKTPSRDASSRLSLRRRTSYSLA

Thank you for your help!

~Rach

tamuanand commented 5 years ago

Hi Rach

You might want to check this - https://github.com/soedinglab/hh-suite/issues/51

ghost commented 5 years ago

I tried recompiling, per the suggestion in #51. I couldn't make that work, so I followed the instructions to clone from git for a fresh installation. As instructed, I ran git submodule init and git submodule update in the repository root.

I made a directory build and copied everything from the repository to the directory build, following the steps:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=.../hh-suite/build (full path specified)

output:

cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=~/Documents/software_rach_tango/hhpred/hh-suite/build
-- The CXX compiler identification is GNU 5.4.0
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Compiler is GNU
-- Processor is x86_64
-- The C compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Looking for fmemopen
-- Looking for fmemopen - found
-- Found MPI_C: /usr/lib/openmpi/lib/libmpi.so
-- Found MPI_CXX: /usr/lib/openmpi/lib/libmpi_cxx.so;/usr/lib/openmpi/lib/libmpi.so
-- Using CPU native flags for SSE optimization: -march=native
-- Performing Test HAVE_MM_MALLOC
-- Performing Test HAVE_MM_MALLOC - Success
-- Performing Test HAVE_POSIX_MEMALIGN
-- Performing Test HAVE_POSIX_MEMALIGN - Success
-- Performing Test HAVE_AVX2_EXTENSIONS
-- Performing Test HAVE_AVX2_EXTENSIONS - Failed
-- Performing Test HAVE_AVX_EXTENSIONS
-- Performing Test HAVE_AVX_EXTENSIONS - Success
-- Performing Test HAVE_SSE4_2_EXTENSIONS
-- Performing Test HAVE_SSE4_2_EXTENSIONS - Success
-- Performing Test HAVE_SSE4_1_EXTENSIONS
-- Performing Test HAVE_SSE4_1_EXTENSIONS - Success
-- Performing Test HAVE_SSSE3_EXTENSIONS
-- Performing Test HAVE_SSSE3_EXTENSIONS - Success
-- Performing Test HAVE_SSE3_EXTENSIONS
-- Performing Test HAVE_SSE3_EXTENSIONS - Success
-- Performing Test HAVE_SSE2_EXTENSIONS
-- Performing Test HAVE_SSE2_EXTENSIONS - Success
-- Performing Test HAVE_SSE_EXTENSIONS
-- Performing Test HAVE_SSE_EXTENSIONS - Success
-- Found AVXextensions, using flags: -march=native -mavx -mfpmath=sse -Wa,-q
-- Try OpenMP CXX flag = [-fopenmp]
-- Performing Test OpenMP_FLAG_DETECTED
-- Performing Test OpenMP_FLAG_DETECTED - Success
-- Found OpenMP: -fopenmp
-- Found OpenMP
-- Configuring done
-- Generating done
-- Build files have been written to: /home/rgarib/Documents/software_rach_tango/hhpred/hh-suite/build

I am not sure why this happens:

-- Performing Test HAVE_AVX2_EXTENSIONS - Failed

If I try to continue with the make and make install steps, I ultimately get the following:

-- Install configuration: "RelWithDebInfo"
CMake Error at lib/ffindex/src/cmake_install.cmake:42 (file):
  file INSTALL cannot find "/home/rgarib/Documents/software_rach_tango/hhpred/hh-suite/build/bin/ffindex_apply_mpi".
Call Stack (most recent call first):
  lib/ffindex/cmake_install.cmake:37 (include)
  cmake_install.cmake:45 (include)

Makefile:148: recipe for target 'install' failed
make: *** [install] Error 1

Thanks
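
On the "HAVE_AVX2_EXTENSIONS - Failed" line in the configure output: one plausible reading (an assumption, not confirmed in this thread) is that the CPU simply does not support AVX2, in which case the build falls back to AVX, as the later "Found AVXextensions" line suggests. A minimal sketch for checking this on a Linux x86 machine follows. The helper name has_cpu_flag is hypothetical and not part of HH-suite.

```shell
#!/bin/sh
# has_cpu_flag is a hypothetical helper, not part of HH-suite.
# Usage: has_cpu_flag FLAG [CPUINFO_FILE]
# Succeeds if FLAG is listed on the first "flags" line of the cpuinfo file
# (assumes the Linux x86 /proc/cpuinfo layout).
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # -w matches whole words only, so "avx" does not match "avx2".
    grep '^flags' "$cpuinfo" 2>/dev/null | head -n 1 | grep -qw -- "$flag"
}

if has_cpu_flag avx2; then
    echo "CPU reports AVX2"
else
    echo "CPU does not report AVX2"
fi
```

If the CPU does not report AVX2, the failed test is probably expected and unrelated to the install error above.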

milot-mirdita commented 5 years ago

We introduced a -maxseq parameter. You no longer have to recompile the HH-suite to change this parameter. Also compilation should be fixed now.
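
As a rough sketch of how the new option might be used: the warning in this issue fires when an alignment holds more than 65535 sequences, so it can help to count the sequences in an A3M file first and then choose a -maxseq value above that count. The helper name count_a3m_seqs, the file names, and the value 100000 are illustrative assumptions; the remaining hhblits options are copied from the command in the original report.

```shell
#!/bin/sh
# count_a3m_seqs is a hypothetical helper, not part of HH-suite.
# It counts header lines (those starting with '>') in an A3M/FASTA file,
# which corresponds to the number of sequences in the alignment.
count_a3m_seqs() {
    grep -c '^>' "$1"
}

# Example usage (file names and the -maxseq value are placeholders):
#   n=$(count_a3m_seqs input.a3m)
#   echo "input.a3m holds $n sequences"
#   hhblits -i input.a3m -d <path_to/uniprot20> -oa3m out.a3m -n 2 -cpu 1 -v 0 -maxseq 100000
```

In the ffindex_apply_mpi pipeline from the first post, the same -maxseq option would presumably be appended to the hhblits part of the command.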