QuesmaOrg / quesma

Programmable database gateway
https://quesma.com

Move ingest transformer earlier in the process #1014

Closed avelanarius closed 1 day ago

avelanarius commented 3 days ago

Before this change, the ingest logic looked roughly like this:

  1. Initial transformation of the JSON ("pre ingest transformer", field encodings, ...)
  2. Generating the ClickHouse table schema from the JSON and the config (for "CREATE TABLE" or "ALTER TABLE" statements)
  3. Transformation of the JSON (transformer object: flattening maps and removing ignored fields)
  4. Executing the SQL query with the JSON data (after both the initial and the second transformation)

The problem with this logic is that the transformation (step 3) removes ignored fields, while step 2 operates on JSON that still contains them, so the generated table schema can include columns for ignored fields.

Rather than adding logic to step 2 to handle ignored fields, this PR takes a different approach: it moves step 3 before step 2. This way, the CREATE TABLE/ALTER TABLE statements no longer take the ignored fields into account.
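
To illustrate the reordering, here is a minimal Go sketch. It is not the actual Quesma code: `Document`, `flatten`, `transform`, `schemaColumns` and `ingest` are hypothetical names, the "." separator for flattened keys is an assumption, and the schema step is reduced to collecting column names. The only point is that schema generation now sees the document after ignored fields have been removed.

```go
package main

import "sort"

// Document is a hypothetical stand-in for one parsed JSON document.
type Document = map[string]any

// flatten copies nested maps into out using dotted keys
// (the "." separator is an assumption made for this sketch).
func flatten(prefix string, in Document, out Document) {
	for key, value := range in {
		name := key
		if prefix != "" {
			name = prefix + "." + key
		}
		if nested, ok := value.(map[string]any); ok {
			flatten(name, nested, out)
			continue
		}
		out[name] = value
	}
}

// transform mirrors the former step 3: flatten maps, then drop ignored fields.
// Ignored fields are matched by their flattened name (an assumption).
func transform(doc Document, ignored map[string]bool) Document {
	flat := Document{}
	flatten("", doc, flat)
	for name := range flat {
		if ignored[name] {
			delete(flat, name)
		}
	}
	return flat
}

// schemaColumns mirrors the former step 2 in a reduced form: it only
// returns the sorted column names that a CREATE TABLE/ALTER TABLE would need.
func schemaColumns(doc Document) []string {
	columns := make([]string, 0, len(doc))
	for name := range doc {
		columns = append(columns, name)
	}
	sort.Strings(columns)
	return columns
}

// ingest runs the transformation first, so the schema step never
// sees ignored fields. It returns the columns and the row to insert.
func ingest(doc Document, ignored map[string]bool) ([]string, Document) {
	row := transform(doc, ignored)
	columns := schemaColumns(row)
	return columns, row
}
```

With this ordering, the columns derived for the schema and the keys of the inserted row come from the same transformed document, so they cannot disagree about ignored fields.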

Moving the transformation earlier also made it possible to simplify JsonToColumns. This PR adds two new integration tests.