Currently, if the configuration for a dataset is removed from the config file, there may not be sufficient information stored about historical runs to understand what they were run with.
Possibly the executed queries and/or static and derived metadata about the run should be stored alongside each run so that this information doesn't disappear.
Expected Outcome
Store relevant configuration metadata alongside each run, such as:
The source+target queries run
The source+target data source refs (possibly the URLs?)
The recce version #?
This information should be returnable via the API for a completed run
Added metadata column in reconciliation_run table with sourceQuery and targetQuery
Went with the JSONB type, as the fields inside metadata are unlikely to be queried individually
Added metadata as a property of RecRun; it could not be added as a constructor parameter because a map isn’t a basic type
The H2 test database does not have a JSONB data type, so the database is initialised with a new init.sql which creates a custom column type and maps jsonb to varchar
Tried the text data type, but in H2 it maps to CLOB, which uses a streaming JSON parser and failed to decode the JSON stream when the value was longer than 256 chars
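To make the shape of the change easier to picture, here is a minimal Java sketch of a run entity that carries metadata as a settable property rather than a constructor parameter. The class name RecRunSketch, the datasetId field and the buildMetadata helper are hypothetical and only illustrate the idea; the real RecRun and its persistence mapping may look quite different. Only the sourceQuery and targetQuery keys come from the notes above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for RecRun: metadata is a property set after construction.
class RecRunSketch {
    private final String datasetId; // assumed identifier field, for illustration only
    private Map<String, String> metadata = Collections.emptyMap();

    RecRunSketch(String datasetId) {
        this.datasetId = datasetId;
    }

    String getDatasetId() {
        return datasetId;
    }

    Map<String, String> getMetadata() {
        return metadata;
    }

    void setMetadata(Map<String, String> metadata) {
        // Defensive copy so the stored metadata cannot be mutated by the caller later.
        this.metadata = Collections.unmodifiableMap(new LinkedHashMap<>(metadata));
    }

    // Hypothetical helper that collects the queries executed for a run.
    static Map<String, String> buildMetadata(String sourceQuery, String targetQuery) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("sourceQuery", sourceQuery);
        m.put("targetQuery", targetQuery);
        return m;
    }
}
```

Because the whole map would be written to a single JSONB column, further keys can be added later without a schema change, at the cost of not being able to index them individually.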
Added sourceUrl and targetUrl from r2dbc configuration to metadata
Added project version to metadata
Set project version as an environment variable inside gradle
Wondering if there’s a better way of getting the project version that works in all use cases
Added a gradle task to generate build-info.properties containing the project version, for running with docker
Was not able to find a Micronaut equivalent of the BuildProperties bean in Spring
To retrieve the version property, classpath:build-info.properties needs to be added to the MICRONAUT_CONFIG_FILES environment variable
This also means that users would have to remember to add this to their docker run command (preferably we should reduce the configuration needed by the user)
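One possible alternative, sketched here only as an idea and not as what was implemented, is to read build-info.properties directly from the classpath with plain java.util.Properties, which would avoid the extra MICRONAUT_CONFIG_FILES entry. The class name BuildInfoSketch, the property key "version" and the "unknown" fallback are assumptions; the actual key written by the gradle task may differ.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper: resolves the project version without extra user configuration.
class BuildInfoSketch {
    static final String RESOURCE = "build-info.properties";
    static final String VERSION_KEY = "version"; // assumed key name
    static final String FALLBACK = "unknown";    // assumed fallback value

    // Parses properties from any reader and returns the version, or the fallback.
    static String versionFrom(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            return FALLBACK;
        }
        return props.getProperty(VERSION_KEY, FALLBACK);
    }

    // Looks the file up on the classpath; returns the fallback if it is missing.
    static String versionFromClasspath() {
        ClassLoader loader = BuildInfoSketch.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(RESOURCE)) {
            if (in == null) {
                return FALLBACK;
            }
            return versionFrom(new InputStreamReader(in, StandardCharsets.UTF_8));
        } catch (IOException e) {
            return FALLBACK;
        }
    }
}
```

With this approach the version lookup would behave the same under gradle, in tests and in docker, as long as the generated file ends up on the classpath.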