Closed gortiz closed 7 months ago
Merging #11907 (e11947f) into master (55f29fe) will increase coverage by
0.00%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## master #11907 +/- ##
=========================================
Coverage 61.42% 61.43%
+ Complexity 1147 1146 -1
=========================================
Files 2375 2375
Lines 128530 128531 +1
Branches 19853 19853
=========================================
+ Hits 78944 78957 +13
+ Misses 43884 43867 -17
- Partials 5702 5707 +5
Flag | Coverage Δ | |
---|---|---|
custom-integration1 | <0.01% <0.00%> (ø) |
|
integration | <0.01% <0.00%> (ø) |
|
integration1 | <0.01% <0.00%> (ø) |
|
integration2 | 0.00% <0.00%> (ø) |
|
java-11 | 61.40% <100.00%> (+0.04%) |
:arrow_up: |
java-21 | 34.73% <0.00%> (-26.56%) |
:arrow_down: |
skip-bytebuffers-false | 61.42% <100.00%> (+0.01%) |
:arrow_up: |
skip-bytebuffers-true | 34.71% <0.00%> (-26.55%) |
:arrow_down: |
temurin | 61.43% <100.00%> (+<0.01%) |
:arrow_up: |
unittests | 61.42% <100.00%> (+<0.01%) |
:arrow_up: |
unittests1 | 46.65% <0.00%> (+0.01%) |
:arrow_up: |
unittests2 | 27.58% <100.00%> (-0.01%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Files | Coverage Δ | |
---|---|---|
...l/recordtransformer/TimeValidationTransformer.java | 77.77% <100.00%> (+0.50%) |
:arrow_up: |
... and 16 files with indirect coverage changes
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
As said in the title, this PR modifies
TimeValidationTransformer
to mark rows as invalid in case primary time column is out of range.TimeValidationTransformer
has always verified that the received row contains a value for the primary time column that is between 1971 (inclusive) and 2071 (exclusive). In case the value is outside this range,TimeValidationTransformer
does:NullValueTransformer
to the millis since epoch at ingestion timetableConfig.ingestionConfig.continueOnError
is false, aborts the executionThe log in point 3 is not useful. In case it is disabled, no log is present. In case it is enabled, a log is printed for each invalid row. In cases where the error is present in most rows (like for example when the row contains seconds from epoch but schema is defined as millis from epoch) this log is very spammy.
We already supported a way to mark a row as incorrect which is used by most transformers but not
TimeValidationTransformer
. This PR modifiesTimeValidationTransformer
to do so.Important to know
This PR changes the semantics of the ingestion. Before this PR, if the primary time column contains an invalid value, the field was marked as null and the value of
NOW()
at ingestion time was stored as default value.After this PR, if the primary time column contains invalid value, the value is replaced with the default value. If it is real-time consumed,
ServerMeter.INCOMPLETE_REALTIME_ROWS_CONSUMED
will be increased.cc @Jackie-Jiang @swaminathanmanish @snleee