torognes / vsearch

Versatile open-source tool for microbiome analysis
Other
643 stars 123 forks source link

Build errors with GCC 13 on Debian 13 #525

Closed torognes closed 10 months ago

torognes commented 1 year ago

Some warnings and errors when building on Debian 13 (Trixie) with GCC 13:

http://qa-logs.debian.net/2023/05/22/logs/vsearch_2.22.1-1_unstable_gccexp.log

frederic-mahe commented 1 year ago

Specifically:

derep.cc:65:8: warning: type ‘struct bucket’ violates the C++ One Definition Rule [-Wodr]
   65 | struct bucket
      |        ^
derepsmallmem.cc:65:8: note: a different type is defined in another translation unit
   65 | struct bucket
      |        ^
derep.cc:67:12: note: the first difference of corresponding definitions is field ‘hash’
   67 |   uint64_t hash;
      |            ^
derepsmallmem.cc:67:11: note: a field of same name but different type is defined in another translation unit
   67 |   uint128 hash;
      |           ^
derep.cc:65:8: note: type ‘uint64_t’ should match type ‘struct uint128’
   65 | struct bucket
      |        ^
torognes commented 1 year ago

The issues with the use of struct bucket in both derep.cc and derepsmallmem.cc have been fixed in commit afc3f53.

torognes commented 1 year ago

There were also some other errors, like this one:

rm -f libcityhash.a
ar cru libcityhash.a libcityhash_a-city.o 
ar: `u' modifier ignored since `D' is the default (see `U')
ar: libcityhash_a-city.o: plugin needed to handle lto object
ranlib libcityhash.a
ranlib: libcityhash.a(libcityhash_a-city.o): plugin needed to handle lto object

These seem related to a missing plugin for ar and ranlib, which I haven't seen before. It may be related to the new GCC 13 and related tools.

The warning about the u option to ar is not new nor critical, but it would be good to get rid of it.

frederic-mahe commented 1 year ago

The issues with the use of struct bucket in both derep.cc and derepsmallmem.cc have been fixed in commit afc3f53.

I confirm there are no errors when compiling on a Debian unstable with GCC 13

torognes commented 1 year ago

Good! I think I'll try to set up a system with Docker or Podman containers to run compilation on various platforms.

frederic-mahe commented 10 months ago

The warning about the u option to ar is not new nor critical, but it would be good to get rid of it.

I've created a separate issue (see issue #531).

These seem related to a missing plugin for ar and ranlib, which I haven't seen before. It may be related to the new GCC 13 and related tools.

That plugin issue is not present on Ubuntu 23.04 with GCC 12 or 13, nor on Ubuntu 23.10 beta with GCC 13, nor on Debian unstable (2023-08-08) with GCC 13. It seems to be a temporary issue that has been fixed upstream.

I think I'll try to set up a system with Docker or Podman containers to run compilation on various platforms.

I have access to a x86-64 GNU Linux environment with all major GCC versions (4 to 13) and some clang versions.

Since GCC 12, these are the only warnings when compiling vsearch:

In function 'SHA1_Update',
    inlined from 'SHA1_Final' at sha1.c:245:5:
sha1.c:220:13: warning: 'SHA1_Transform' reading 64 bytes from a region of size 1 [-Wstringop-overread]
  220 |             SHA1_Transform(context->state, data + i);
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sha1.c:220:13: note: referencing argument 2 of type 'const uint8_t[64]' {aka 'const unsigned char[64]'}
sha1.c: In function 'SHA1_Final':
sha1.c:131:6: note: in a call to function 'SHA1_Transform'
  131 | void SHA1_Transform(uint32_t state[5], const uint8_t buffer[64])
      |      ^~~~~~~~~~~~~~
In function 'SHA1_Update',
    inlined from 'SHA1_Final' at sha1.c:247:9:
sha1.c:220:13: warning: 'SHA1_Transform' reading 64 bytes from a region of size 1 [-Wstringop-overread]
  220 |             SHA1_Transform(context->state, data + i);
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sha1.c:220:13: note: referencing argument 2 of type 'const uint8_t[64]' {aka 'const unsigned char[64]'}
sha1.c: In function 'SHA1_Final':
sha1.c:131:6: note: in a call to function 'SHA1_Transform'
  131 | void SHA1_Transform(uint32_t state[5], const uint8_t buffer[64])
      |      ^~~~~~~~~~~~~~

Ideally, we would need a similar set-up with multiple GCC versions on a ARM64 machine.

torognes commented 10 months ago

I understand why the warning appears when compiling the code in sha1.c, but I think the problem will never arise due to the program logic. The code checks the length of the input and never reads more than the allowed number of bytes.

I am reluctant to change the code, but we could turn off the warning just for the affected lines (line 245 and onwards) in sha1.c like this:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-overread"
    SHA1_Update(context, (uint8_t *)"\200", 1);
    while ((context->count[0] & 504) != 448) {
        SHA1_Update(context, (uint8_t *)"\0", 1);
    }
#pragma GCC diagnostic pop

This may however cause a warning with old compilers that do not understand this PRAGMA or does not support the -Wstringop-overread option.

If the code should be changed we could create two 64-byte buffers containing a single relevant byte each. The first one is the octal \200 character (equal to 128 in decimal) which is used as the first character in a padding of the buffer, and the other is just zero used as padding the rest.

torognes commented 10 months ago

Alternatively, we could change the code like this:

diff --git a/src/sha1.c b/src/sha1.c
index 6a52be7..a071abd 100644
--- a/src/sha1.c
+++ b/src/sha1.c
@@ -237,14 +237,21 @@ void SHA1_Final(SHA1_CTX* context, uint8_t digest[SHA1_DIGEST_SIZE])
 {
     uint32_t i;
     uint8_t  finalcount[8];
+    uint8_t padding_buffer[64];
+
+    for (i = 0; i < 64; i++) {
+      padding_buffer[i] = 0;
+    }

     for (i = 0; i < 8; i++) {
         finalcount[i] = (unsigned char)((context->count[(i >= 4 ? 0 : 1)]
          >> ((3-(i & 3)) * 8) ) & 255);  /* Endian independent */
     }
-    SHA1_Update(context, (uint8_t *)"\200", 1);
+    padding_buffer[0] = 0x80;
+    SHA1_Update(context, padding_buffer, 1);
+    padding_buffer[0] = 0x00;
     while ((context->count[0] & 504) != 448) {
-        SHA1_Update(context, (uint8_t *)"\0", 1);
+        SHA1_Update(context, padding_buffer, 1);
     }
     SHA1_Update(context, finalcount, 8);  /* Should cause a SHA1_Transform() */
     for (i = 0; i < SHA1_DIGEST_SIZE; i++) {

Must be tested too see that the sha1 digests are identical.