Open frannoriega opened 1 year ago
I saw further inconsistencies in the way we are using BitVec internally to handle bit values.
When we create Literal::BitVector(Vec<u8>), we get a BitVec that comes from parsing the 0s and 1s from the bit representation (i.e., b'111'), and then we turn it into bytes by calling to_bytes:
Literal::BitVector(bits.to_bytes())
That's a problem because, in order to complete the 8 bits that make a byte, BitVec adds trailing zeroes. So 111 becomes 11100000 instead of the expected 00000111.
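To make the padding problem concrete, here is a minimal sketch that reproduces both layouts in plain Rust. It does not use the BitVec type itself; `parse_bits`, `pack_left_aligned`, and `pack_right_aligned` are hypothetical helpers written for this illustration, with the left-aligned one imitating the trailing-zero behavior described above.

```rust
/// Hypothetical helper: parse a string of '0'/'1' characters into bits,
/// first character first.
fn parse_bits(s: &str) -> Vec<bool> {
    s.chars().map(|c| c == '1').collect()
}

/// Left-aligned packing: the first bit becomes the high-order bit of the
/// first byte, and the last byte is padded with trailing zeroes.
fn pack_left_aligned(bits: &[bool]) -> Vec<u8> {
    let mut out = vec![0u8; (bits.len() + 7) / 8];
    for (i, &bit) in bits.iter().enumerate() {
        if bit {
            out[i / 8] |= 0x80u8 >> (i % 8);
        }
    }
    out
}

/// Right-aligned packing: the padding zeroes go in front, so the numeric
/// value of the bit string is preserved.
fn pack_right_aligned(bits: &[bool]) -> Vec<u8> {
    let n_bytes = (bits.len() + 7) / 8;
    let pad = n_bytes * 8 - bits.len();
    let mut out = vec![0u8; n_bytes];
    for (i, &bit) in bits.iter().enumerate() {
        if bit {
            let pos = i + pad;
            out[pos / 8] |= 0x80u8 >> (pos % 8);
        }
    }
    out
}

fn main() {
    let bits = parse_bits("111");
    println!("{:08b}", pack_left_aligned(&bits)[0]); // 11100000
    println!("{:08b}", pack_right_aligned(&bits)[0]); // 00000111
}
```

Whether the fix should pad on the left before packing or avoid the byte conversion altogether is a separate design question; the sketch only shows where the two results diverge.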
Another issue I've seen is that we don't have equality for these values when stored in PersistentState. By this I mean that if we insert 111, then we have to query with 111 in order to get results; using 0111 (or any other value that still semantically represents 111) would return no results.
cc: @readyset-railway, who was also trying to come up with an explanation for this.
Description
When using ReadySet with MySQL, we are not consistent in how we interpret byte/bit values.
Consider this example:
The expected result should be (for both select statements):
For further context, I've seen that:
mysql_common::value::Value enums are converted into DfValue. These enums are very broad, and we are guessing the types at this stage. The Value::Bytes type gets converted into DfValue::Text (if the bytes can be parsed as a UTF-8 string) or DfValue::ByteArray (readyset-data/src/lib.rs#1664). This is how the values are then stored.
DfValue::Int (7) or DfValue::BitVector (b'111'). We attempt to coerce both to DfType::Bit, but even when successful, the result still doesn't match the format we use for storage.
Commit
79d66f7356ccd29660f696d27f8f74f15e4f6735
Change in user-visible behavior
Any query that predicates upon a Bit column in MySQL is highly likely to return incorrect results.
Requires documentation change
We don't fully support Bit in MySQL