The ingest processor doesn't use the schema to resolve fields into column names. Instead, it relies on the table definition and field mangling. This is fine when Quesma manages the table definition, but it can lead to confusion when a human manages the table.
Here is an example:
1. Let's create a table
CREATE TABLE test (
`@timestamp` DateTime64(3) DEFAULT now64() COMMENT 'quesmaMetadataV1:fieldName=%40timestamp',
`attributes_values` Map(String, String),
`attributes_metadata` Map(String, String),
`foo` Nullable(String) COMMENT 'quesmaMetadataV1:fieldName=bar'
)
ENGINE = MergeTree
ORDER BY `@timestamp`
SETTINGS index_granularity = 8192 COMMENT 'created by Quesma'
2. Insert some data
insert into test(foo) values ('a')('b')('c')
3. Query Quesma
POST localhost:8080/test/_search
{}
4. It's fine for now. We've got 3 rows.
5. Let's ingest
POST localhost:8080/test/_doc
{
"bar": "xxx"
}
6. And query again:
POST localhost:8080/test/_search
{}
7. We've 'lost' some data
8. The table now looks like this:
<img width="1306" alt="Screenshot 2024-11-12 at 13 37 24" src="https://github.com/user-attachments/assets/0ab19041-bd8d-4af7-bd01-28b66a4f3f5e">
9. And the schema looks like this:
<img width="1322" alt="Screenshot 2024-11-12 at 13 38 29" src="https://github.com/user-attachments/assets/382df052-1acd-4f22-b8b4-876aa744017c">
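To make the mismatch concrete, here is a minimal Go sketch of the two ways the field `bar` could be resolved against the table above. It is not Quesma's actual code: `fieldNameFromComment`, `resolveBySchema`, and `resolveByMangling` are hypothetical names, the comment parsing is inferred only from the `quesmaMetadataV1:fieldName=...` comments shown in the `CREATE TABLE` statement, and the mangling step is a placeholder that assumes a plain name like `bar` passes through unchanged.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// commentPrefix is the marker seen in the column comments of the table above.
const commentPrefix = "quesmaMetadataV1:"

// fieldNameFromComment extracts the URL-encoded field name from a column
// comment such as "quesmaMetadataV1:fieldName=%40timestamp".
// The parsing rules are an assumption based on the comments in this issue.
func fieldNameFromComment(comment string) (string, bool) {
	rest, ok := strings.CutPrefix(comment, commentPrefix)
	if !ok {
		return "", false
	}
	values, err := url.ParseQuery(rest)
	if err != nil || !values.Has("fieldName") {
		return "", false
	}
	return values.Get("fieldName"), true
}

// resolveBySchema (hypothetical) finds the column whose comment declares
// the given field name.
func resolveBySchema(columnComments map[string]string, field string) (string, bool) {
	for column, comment := range columnComments {
		if name, ok := fieldNameFromComment(comment); ok && name == field {
			return column, true
		}
	}
	return "", false
}

// resolveByMangling (hypothetical) stands in for field mangling. The real
// rules are not shown in this issue; a plain name is assumed to be unchanged.
func resolveByMangling(field string) string {
	return field
}

func main() {
	// Column name -> column comment, copied from the CREATE TABLE above.
	comments := map[string]string{
		"@timestamp": "quesmaMetadataV1:fieldName=%40timestamp",
		"foo":        "quesmaMetadataV1:fieldName=bar",
	}

	column, _ := resolveBySchema(comments, "bar")
	fmt.Println("schema-based:", column)
	fmt.Println("mangling-based:", resolveByMangling("bar"))
}
```

Under these assumptions the program prints `schema-based: foo` and `mangling-based: bar`, so the ingest path and the query path would disagree about which column holds the field `bar`.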