EddyRivasLab / hmmer

HMMER: biological sequence analysis using profile HMMs
http://hmmer.org

build warning for 3.3.2 #219

Closed mathog closed 3 years ago

mathog commented 3 years ago

On CentOS 8 with gcc 8.3.1 there is a compiler warning during the build:

export LD_LIBRARY_PATH=/opt/ompi401/lib
export LD_RUN_PATH=/opt/ompi401/bin
export PATH=$PATH:/opt/ompi401/bin #or it cannot find mpicc
./configure --enable-sse --enable-threads --enable-mpi --prefix=/opt/ompi401 2>&1 | tee build_configure.log
make -j 4 2>&1  | tee build_make.log 

which is:


In file included from phmmer.c:9:
phmmer.c: In function ‘serial_master’:
../easel/easel.h:164:17: warning: argument 1 range [18446743987810205696, 18446744073709551576] exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
     if ( ((p) = malloc(size)) == NULL)  { \
                 ^~~~~~~~~~~~
phmmer.c:492:3: note: in expansion of macro ‘ESL_ALLOC’
   ESL_ALLOC(info, sizeof(*info) * infocnt);
   ^~~~~~~~~
In file included from phmmer.c:6:
/usr/include/stdlib.h:539:14: note: in a call to allocation function ‘malloc’ declared here
 extern void *malloc (size_t __size) __THROW __attribute_malloc__ __wur;

The line which triggers this is:

ESL_ALLOC(info, sizeof(*info) * infocnt);

But info seems to be a pointer to an array of WORKER_INFO, so sizeof(*info) should, I guess, be equivalent to just sizeof(WORKER_INFO). However, making that change does not suppress the warning. This variant:

ESL_ALLOC(info, (int)sizeof(WORKER_INFO) * infocnt);

compiles cleanly, and with the resulting programs everything passes "make check".
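For anyone wondering where the two huge numbers in the warning come from: they match what you get when a negative int count is converted to size_t and multiplied by a 40-byte element size. The upper bound is 2^64 - 40 (count of -1) and the lower bound is 2^64 - 40 * 2^31 (count of INT_MIN), so gcc appears to be reasoning about a path where infocnt could be negative. Below is a minimal sketch of that arithmetic, assuming a 64-bit size_t. FAKE_WORKER_INFO, requested_bytes and alloc_workers are hypothetical names made up for illustration; the 40-byte size is inferred from the warning, not taken from the HMMER source, and the guard is only one common way to handle this, not necessarily the fix that was checked in.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for WORKER_INFO. The 40-byte size is an assumption
 * inferred from the warning's upper bound (2^64 - 40). */
typedef struct { char pad[40]; } FAKE_WORKER_INFO;

/* Byte count malloc() would receive when a signed count is multiplied by
 * sizeof. The cast makes explicit the int -> size_t conversion that C
 * performs implicitly in "sizeof(x) * count". */
static size_t requested_bytes(int count)
{
    assert(sizeof(FAKE_WORKER_INFO) == 40);
    return sizeof(FAKE_WORKER_INFO) * (size_t)count;
}

/* Guarded allocation (illustrative only): reject non-positive counts
 * before the multiplication, so no wrapped size reaches malloc(). */
static FAKE_WORKER_INFO *alloc_workers(int count)
{
    if (count <= 0) return NULL;
    return malloc(sizeof(FAKE_WORKER_INFO) * (size_t)count);
}
```

With these definitions, requested_bytes(-1) and requested_bytes(INT_MIN) reproduce the two bounds printed in the warning, while alloc_workers(-1) simply returns NULL.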

On an unrelated point, does 3.3.2 have support for NCBI v4 or v5 databases? This was unsupported in 3.2.1 and I'm curious if that has changed. (Documentation still references makeblastdb in nhmmer.man, nhmmer.man.in, and x-psiblast+)

cryptogenomicon commented 3 years ago

Thanks. You can ignore the compiler warning, which appears to be a false positive from gcc 8.3.0 with OpenMPI, not a problem in our code. I've reproduced the warning here, and just checked in a fix that silences the warning. The fix will appear in our next release.

I also removed the references to makeblastdb and ncbi format files in nhmmer.man. We have been unable to support NCBI binary press'ed databases for quite some time; their format is undocumented, last time I checked.

mathog commented 3 years ago

Thanks.

I have asked the NCBI before for v5 documentation and they could never provide a link to any. v5 is documented in some sense by the toolbox code, I suppose. For v4 there was actual documentation, and I once wrote my own access code using it. That software and documentation are here:

https://saf.bio.caltech.edu/pub/software/molbio/blastdb_api.tar.gz

At the time, the bdb_tool program it provides was about twice as fast as the NCBI's fastacmd at dumping fasta from an NCBI database. However, my code is now quite old and does not handle v5 at all, or v4 once it grows past certain 32-bit limits. I was thinking about updating it to handle the bigger v4 datasets, but then found out that there are now entries in "nt" which trigger fatal errors when trying to makeblastdb with parse to a v4 database, even using the current toolbox, that is:

Multi-letters chain PDB id is not supported in v4 BLAST DB

So while the current blast variants all work with v4 or v5, there isn't actually a supported way to make certain v4 databases!

cryptogenomicon commented 3 years ago

Yeah, we just don't have the bandwidth to be able to track NCBI's various formats right now.