There are a bunch of warnings piling up with GCC 13, which I use personally, which don't show up in CI (with GCC 11).
Maybe this is too much for one PR, but it's mostly minor things.
There are two bugs I found when going through the warnings:
CastArrayHelper::checkCompatibleNestedTypes was falling through. This meant it was allowing MAP/STRUCT to be cast to LIST/ARRAY. STRUCT to LIST/ARRAY produces an unexpected error if you try that isn't very helpful, but it is actually possible to successfully cast MAP(<T1>, <T2>) to a STRUCT(KEY <T1>, VALUE <T2>)[].
ParquetTimeStampUtils::impalaTimestampToMicroseconds was casting the impala timestamp Int96 data to a pointer and dereferencing it. I'm fairly sure it's not supposed to be storing a pointer, and I don't think this code is covered by tests. Additionally, using a pointer to cast from two int32_t to one int64_t also throws an aliasing error. As far as I can tell it's being used to store raw data read from disk, and the cast should be safe as long as the endianness matches (I don't know if parquet has any endianness requirements, but generally we assume everything is native-endian), so I switched it to use memcpy to work around the aliasing error.
Other changes:
I added a RUNTIME_CHECK macro which unlike KU_ASSERT doesn't expect a boolean expression but makes for a nicer way of disabling extra code only needed for checks to avoid unused variable warnings.
blob.cpp: signed char is always <= 127
There are a few places with harmless implicit fallthrough where I either added break or [[fallthrough]].
LogicalTypeID is forward-declared without KUZU_API in database.h, so there's a warning that the KUZU_API annotations don't work if the type has been previously declared. Since it's header-only, I don't think KUZU_API is actually necessary, but I'll see if it passes Windows CI first.
function.h: A subclass was using the default operator=, which produces a warning if there is a non-default copy constructor. But the explicit copy constructor is identical to the default so I just made it default.
plan_printer.h (similar elsewhere): Unnecessary checks that unsigned values are >= 0.
npy_reader.cpp: Value may be uninitialized. Not really necessary to change, but initializing it isn't exactly going to slow anything down and it silences the warning.
compression.cpp: A parameter is used only by a runtime check. I added a KU_UNUSED, which is just a macro for a (void)expr cast to hide that warning in a way that clearly indicates the purpose (it should only be used in cases like this, otherwise the parameter name can be commented out or removed).
list_column.cpp: numValues is unsigned since the start and end variables are unsigned, but it makes more sense to look at the inputs to check for underflow rather than the result of the subtraction.
system_config_test.cpp: BufferManagerException is polymorphic and shouldn't be caught by value (not that it actually has any subclasses).
Thrift.h: std::iterator is deprecated
embedded_shell.cpp: Reference to a temporary since keywordList is a vector<const char*> and the strings are being created on the fly.
ParseTree.h: operator== doesn't really need to be virtual since it's never overridden. One subclass has an operator== with a different signature that hides the original and was producing a warning.
constant_time.cpp: gcc was complaining that "-Wdeprecated-volatile" is unknown. Presumably because it's C code and the warning is only enabled in C++ mode (they're all cpp files, but we're setting the C_STANDARD 99 property via CMake). Update: Clang doesn't appear to have this distinction, I'll see if I can suppress it in a clang-specific way.
There are a bunch of warnings piling up with GCC 13, which I use personally, which don't show up in CI (with GCC 11). Maybe this is too much for one PR, but it's mostly minor things.
There are two bugs I found when going through the warnings:
CastArrayHelper::checkCompatibleNestedTypes
was falling through. This meant it was allowingMAP
/STRUCT
to be cast toLIST
/ARRAY
.STRUCT
toLIST
/ARRAY
produces an unexpected error if you try that isn't very helpful, but it is actually possible to successfully castMAP(<T1>, <T2>)
to aSTRUCT(KEY <T1>, VALUE <T2>)[]
.ParquetTimeStampUtils::impalaTimestampToMicroseconds
was casting the impala timestampInt96
data to a pointer and dereferencing it. I'm fairly sure it's not supposed to be storing a pointer, and I don't think this code is covered by tests. Additionally, using a pointer to cast from twoint32_t
to oneint64_t
also throws an aliasing error. As far as I can tell it's being used to store raw data read from disk, and the cast should be safe as long as the endianness matches (I don't know if parquet has any endianness requirements, but generally we assume everything is native-endian), so I switched it to usememcpy
to work around the aliasing error.Other changes:
RUNTIME_CHECK
macro which unlikeKU_ASSERT
doesn't expect a boolean expression but makes for a nicer way of disabling extra code only needed for checks to avoid unused variable warnings.blob.cpp
: signedchar
is always<= 127
break
or[[fallthrough]]
.LogicalTypeID
is forward-declared withoutKUZU_API
in database.h, so there's a warning that theKUZU_API
annotations don't work if the type has been previously declared. Since it's header-only, I don't thinkKUZU_API
is actually necessary, but I'll see if it passes Windows CI first.function.h
: A subclass was using the defaultoperator=
, which produces a warning if there is a non-default copy constructor. But the explicit copy constructor is identical to the default so I just made it default.plan_printer.h
(similar elsewhere): Unnecessary checks that unsigned values are>= 0
.npy_reader.cpp
: Value may be uninitialized. Not really necessary to change, but initializing it isn't exactly going to slow anything down and it silences the warning.compression.cpp
: A parameter is used only by a runtime check. I added aKU_UNUSED
, which is just a macro for a(void)expr
cast to hide that warning in a way that clearly indicates the purpose (it should only be used in cases like this, otherwise the parameter name can be commented out or removed).list_column.cpp
: numValues is unsigned since the start and end variables are unsigned, but it makes more sense to look at the inputs to check for underflow rather than the result of the subtraction.system_config_test.cpp
:BufferManagerException
is polymorphic and shouldn't be caught by value (not that it actually has any subclasses).Thrift.h
:std::iterator
is deprecatedembedded_shell.cpp
: Reference to a temporary since keywordList is avector<const char*>
and the strings are being created on the fly.ParseTree.h
:operator==
doesn't really need to be virtual since it's never overridden. One subclass has anoperator==
with a different signature that hides the original and was producing a warning.constant_time.cpp
: gcc was complaining that"-Wdeprecated-volatile"
is unknown. Presumably because it's C code and the warning is only enabled in C++ mode (they're all cpp files, but we're setting theC_STANDARD 99
property via CMake). Update: Clang doesn't appear to have this distinction, I'll see if I can suppress it in a clang-specific way.