Open chadlwilson opened 2 years ago
We need to decide which option we would like to go with:
Option 1: Basic parsing of SQL query
Option 2: Run whole reconciliation run
Option 3: After completing the run on the source database, compare against the first row of the target database
Option 4: Get the first element from both sources for validation, then all elements for reconciliation
Option 5: Build a separate API for validation of the query
Sharing my thoughts here:
Options 4 and 5 appear to be more elegant solutions than Options 2 and 3. I'm inclined to explore Options 1, 4 and 5 further.
Do we want to strike out Option 1, or is it still worth evaluating? It appears to be a good solution if vendor-specific syntax does not cause too much complexity.
Option 5 seems like a good candidate too. We could add a flag to the DB to indicate that the validation has executed and, during the recon run, verify the state of the validation. If the validation was not executed, do not continue; if the validation did execute, continue to recon. With this, issue #41 should no longer be a concern.
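To make the Option 5 idea concrete, here is a minimal sketch of the flag check. Everything in it is hypothetical: the names `mark_validation_executed`, `start_recon_run` and `ValidationNotExecuted` are made up for illustration, and an in-memory set stands in for the flag that would be stored in the DB.

```python
class ValidationNotExecuted(Exception):
    """Raised when a recon run is requested before validation has executed."""


# Hypothetical stand-in for the "validation executed" flag stored in the DB.
validated_datasets = set()


def mark_validation_executed(dataset_id):
    """Called by the (hypothetical) validation API once it has run for a dataset."""
    validated_datasets.add(dataset_id)


def start_recon_run(dataset_id):
    """Refuse to start the recon run unless validation has executed."""
    if dataset_id not in validated_datasets:
        raise ValidationNotExecuted(
            f"validation has not been executed for dataset {dataset_id!r}"
        )
    return f"recon started for {dataset_id}"
```

With this shape the recon run itself never needs to inspect the queries; it only reads the flag.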
Option 4 still looks worth evaluating, but I suspect I'm missing something critical given my limited understanding of reactive programming. There are also the complexities with issue #41, as mentioned.
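For discussion, a rough sketch of the Option 4 shape: peek at the first element of each stream, validate, then hand back streams that still contain every element. This uses plain Python iterators as a stand-in for reactive streams, so it says nothing about the reactive or #41 complications; `peek` and `validated_streams` are hypothetical names.

```python
from itertools import chain

_EMPTY = object()  # sentinel marking an empty stream


def peek(rows):
    """Return (first_row_or_sentinel, iterator that still yields every row)."""
    it = iter(rows)
    first = next(it, _EMPTY)
    if first is _EMPTY:
        return _EMPTY, it
    # Put the consumed first row back in front of the remaining rows.
    return first, chain([first], it)


def validated_streams(source_rows, target_rows):
    """Compare the width of the first rows, then return both full streams."""
    s_first, s_all = peek(source_rows)
    t_first, t_all = peek(target_rows)
    both_present = s_first is not _EMPTY and t_first is not _EMPTY
    if both_present and len(s_first) != len(t_first):
        raise ValueError(
            f"source rows have {len(s_first)} columns, "
            f"target rows have {len(t_first)}"
        )
    return s_all, t_all
```

If either side is empty there is nothing to compare, so this sketch lets the run continue; whether that is the right behaviour is an open question.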
Suggestion
How databases supported by r2dbc specify the number of records to return:
LIMIT. Refer to Google Standard SQL.
ROWNUM together with WHERE
LIMIT
LIMIT
SELECT TOP
LIMIT
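A small sketch of how a user's query could be wrapped to return a single row under each of the three syntaxes listed above. The style names (`"limit"`, `"top"`, `"rownum"`) and the function `first_row_query` are hypothetical, and mapping each database to a style is deliberately left out. It also assumes the inner query is a plain `SELECT` that is valid as a subquery on the database in question, which may not hold for every vendor (for example, when the query has its own `ORDER BY`).

```python
import sqlite3


def first_row_query(sql, style):
    """Wrap a SELECT so that it returns at most one row."""
    # Drop surrounding whitespace and a trailing semicolon before nesting.
    inner = sql.strip().rstrip(";").strip()
    if style == "limit":
        return f"SELECT * FROM ({inner}) AS q LIMIT 1"
    if style == "top":
        return f"SELECT TOP 1 * FROM ({inner}) AS q"
    if style == "rownum":
        return f"SELECT * FROM ({inner}) WHERE ROWNUM <= 1"
    raise ValueError(f"unknown row-limiting style: {style!r}")


# Usage against SQLite, which accepts the LIMIT form.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
rows = conn.execute(first_row_query("SELECT id, name FROM t;", "limit")).fetchall()
```

Only the `LIMIT` form is exercised here; the other two are string templates that have not been run against a real database in this sketch.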
References
Context / Goal
If the two queries expressed in a dataset have different numbers of columns, they cannot possibly produce matches when doing a hash-based comparison.
We should ideally fail fast, or at the very least warn the user clearly in the results somehow.
Currently we do not do any query parsing at startup, as this is entirely delegated to the runtime drivers for the relevant databases. We also do not want to introduce a startup connectivity dependency on any given datasource.
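As an illustration of failing fast on mismatched column counts, here is a minimal sketch that asks the driver for each query's result metadata and compares the widths before any hashing happens. It uses Python's built-in `sqlite3` purely as a stand-in for the real drivers, and `column_count`, `check_same_width` and `ColumnCountMismatch` are hypothetical names. Because it needs a live connection, a check like this would sit at the start of a run rather than at application startup.

```python
import sqlite3


class ColumnCountMismatch(Exception):
    """Raised when source and target queries return different column counts."""


def column_count(conn, sql):
    """Execute a SELECT and return its number of result columns."""
    cursor = conn.execute(sql)
    try:
        # cursor.description has one entry per result column of a SELECT.
        return len(cursor.description)
    finally:
        cursor.close()


def check_same_width(source_conn, source_sql, target_conn, target_sql):
    """Fail fast if the two queries cannot possibly produce matching rows."""
    source_width = column_count(source_conn, source_sql)
    target_width = column_count(target_conn, target_sql)
    if source_width != target_width:
        raise ColumnCountMismatch(
            f"source query returns {source_width} columns, "
            f"target query returns {target_width}"
        )
    return source_width
```

The same result could instead be surfaced as a warning in the run results instead of an exception, if failing the run outright is considered too strict.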
Expected Outcome
Evaluate and implement some approach to addressing this
target dataset. This would have implications for #41, however, and makes things a bit more complex, as the target parsing needs to understand something about the source parsing.
Out of Scope
Additional context / implementation notes