Closed osopardo1 closed 1 year ago
Regarding elementCount
Maybe I am wrong, but the last revision can have (min, max) ranges (later used by the linear transformation) that are smaller than the corresponding value ranges of the records in the file. As I remember, if the given data does not fit the latest revision space during indexing, a new revision is created.
Let me formulate the last item a different way: can we treat files without Qbeast metadata as indexed? They may be indexed badly, but if they do not violate any invariant, it is safe to add them to the index as if they were indexed.
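To make the concern concrete, here is a minimal sketch of the containment check implied above. It assumes per-column (min, max) ranges are available both for the file (for example from table-format statistics) and for the revision. The function name `fits_revision` and the dictionary layout are hypothetical and are not part of the Qbeast API.

```python
from typing import Dict, Tuple

Range = Tuple[float, float]


def fits_revision(file_ranges: Dict[str, Range],
                  revision_ranges: Dict[str, Range]) -> bool:
    """Return True only if, for every indexed column, the file's
    (min, max) range lies inside the revision's (min, max) range."""
    for column, (rev_min, rev_max) in revision_ranges.items():
        if column not in file_ranges:
            # No statistics for this column: we cannot prove the file fits.
            return False
        file_min, file_max = file_ranges[column]
        if file_min < rev_min or file_max > rev_max:
            return False
    return True
```

If such a check fails for a file without Qbeast metadata, the file's values fall outside the space covered by the last revision, which is the case the comment above worries about.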
UPDATE
From last conversations, we agreed that:
This issue is a dependency of #102
Fixed in #152
To be more compatible with the underlying table formats and to make conversion to Qbeast easier, we should be able to process files that do not carry any Qbeast metadata.
For example
This is a File with Qbeast Metadata:
And this is a file without Qbeast Metadata:
One solution could be the following:
When reading the Delta Log and encountering a file without tags, we put the following synthetic metadata:
This means we place all the unknown files in the root cube of the last revision, with a weight range of [MinValue, MaxValue] ([0.0, 1.0]).
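A minimal sketch of that assignment, written in Python for illustration only. The tag names (`revision`, `cube`, `minWeight`, `maxWeight`), the empty string as the root cube identifier, and 32-bit integer bounds for MinValue/MaxValue are assumptions made for this example, and `with_synthetic_tags` is a hypothetical helper rather than an existing function.

```python
from typing import Dict, Optional

# Assumed weight bounds: 32-bit integer limits standing in for
# MinValue / MaxValue, i.e. the normalized range [0.0, 1.0].
MIN_WEIGHT = -2**31
MAX_WEIGHT = 2**31 - 1
ROOT_CUBE = ""  # assumption: the root cube is identified by an empty string


def with_synthetic_tags(tags: Optional[Dict[str, str]],
                        last_revision_id: int) -> Dict[str, str]:
    """Return the file's own tags if it has any, otherwise synthetic tags
    that place it in the root cube of the last revision with the full
    weight range."""
    if tags:
        return dict(tags)  # the file already carries metadata; keep it
    return {
        "revision": str(last_revision_id),
        "cube": ROOT_CUBE,
        "minWeight": str(MIN_WEIGHT),
        "maxWeight": str(MAX_WEIGHT),
    }
```

Values are kept as strings because Delta Lake stores file tags as a string-to-string map. `elementCount` is left out of the sketch on purpose, since whether it is needed is still an open question.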
Questions/design decisions:
elementCount: Is it necessary to know the value? If so, how can we compute it without wasting time?
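One possible cheap source for the count, offered as a sketch and not as an agreed decision: Delta Lake's `add` actions may carry an optional `stats` JSON string that includes `numRecords`. When it is present, the count can be read without opening the file. The helper name `element_count_from_stats` is hypothetical.

```python
import json
from typing import Optional


def element_count_from_stats(stats_json: Optional[str]) -> Optional[int]:
    """Read `numRecords` from a Delta `add` action's stats string.
    Returns None when stats are missing or unusable, so the caller can
    decide whether to fall back to scanning the file."""
    if not stats_json:
        return None
    try:
        stats = json.loads(stats_json)
    except json.JSONDecodeError:
        return None
    if not isinstance(stats, dict):
        return None
    count = stats.get("numRecords")
    return count if isinstance(count, int) else None
```

Statistics are optional in the Delta Log, so a fallback (or accepting an unknown count) would still be needed for files written without them.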