Closed roll closed 7 years ago
@Stephen-Gates Could you please take a look? I've improved some error messages (syncing some of them with the Data Quality Spec). At a high level they are still not exactly the same, for various reasons (some of these problems are solvable):
extra/missing-header errors
field.castRow (to say which row number the error is on)
goodtables (we handle constraints here together, not one by one)
So the question is how far we should improve this. Is it a real blocker for data-curator? The problem is that the error messages here describe exceptional situations. They are not intended to be user-facing in general (e.g. an app on top should catch them), unlike goodtables, where all error messages are intended to be user-facing.
So I'm looking forward to your comments. We still have some room to improve some messages further (by investing more time). I'd just like to hear what the blockers/priorities are. We could handle it in this PR or open an individual issue for every newly requested improvement.
cc @pwalsh
List of all data related exceptions:
The value "${value}" in column "${this.name}" is not type "${this.type}" and format "${this.format}"
Field "${this.name}" has constraint "${name}" which is not satisfied for value "${value}"
Row length ${row.length} doesn't match fields count ${this.fields.length}
Table headers don\'t match schema field names
Row ${rowNumber} has unique constraint violation in column "${cache.name}"
Foreign key "${foreignKey.fields}" violation in row ${rowNumber}
@roll I will have a good look over the weekend.
@roll
Ideally the row and column for each error would be reported. This would enable Data Curator to:
If this is not possible, then returning the row in error would enable the row to be highlighted.
In terms of priority:
I'll add a feature request and comment separately on error message wording.
@Stephen-Gates That's what I was thinking of. Error messages may not be the best thing to rely on. But the lib could provide an API for error context when possible, e.g.:
Then it could be reused in the higher-level app (including the ability to compose a message using the Data Quality Spec). I think it could be a more reliable way to achieve the same goal.
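A minimal sketch of what such an error-context API could look like. The `TableSchemaError` class and the `describeError` helper are hypothetical names used only for illustration, not the library's actual API; only the idea of optional `rowNumber`/`columnNumber` context comes from this thread.

```javascript
// Hypothetical error class: carries optional context alongside the message.
class TableSchemaError extends Error {
  constructor(message, { rowNumber, columnNumber } = {}) {
    super(message)
    this.name = 'TableSchemaError'
    // Context is only attached when the caller actually knows it.
    if (rowNumber !== undefined) this.rowNumber = rowNumber
    if (columnNumber !== undefined) this.columnNumber = columnNumber
  }
}

// Hypothetical app-side helper: builds its own user-facing text from the
// context properties instead of parsing the library's message.
function describeError(error) {
  const parts = []
  if (error.rowNumber !== undefined) parts.push(`row ${error.rowNumber}`)
  if (error.columnNumber !== undefined) parts.push(`column ${error.columnNumber}`)
  const location = parts.length ? ` (${parts.join(', ')})` : ''
  return `${error.message}${location}`
}

const error = new TableSchemaError('Unique constraint violation', {
  rowNumber: 3,
  columnNumber: 1,
})
console.log(describeError(error))
```

With something like this, a higher-level app could highlight the affected cell or row from the properties and word the message however it needs.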
Sounds like a good approach @roll
@roll sorry for the delay. Assuming no other variables are available for use in the error messages, then...
The value "${value}" in column "${this.name}" is not type "${this.type}" and format "${this.format}"
appears to be as close as you can get to the Type or Format error message in DQS.
To better match the format of most constraint error messages in DQS, I suggest
Field "${this.name}" has constraint "${name}" which is not satisfied for value "${value}"
is rewritten to,
The value "${value}" does not conform to the "${name}" constraint for field "${this.name}"
There's no equivalent message in DQS. The closest are the extra/missing value error messages in DQS, which compare the row length to the header row. This error appears to compare the row length to the number of fields in the schema.
I suggest
Row length ${row.length} doesn't match fields count ${this.fields.length}
is rewritten to,
The row with ${row.length} columns does not match the ${this.fields.length} fields in the schema
Perhaps "columns" should be "values"?
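For illustration, a minimal sketch of the check this message describes: the row's length is compared with the number of schema fields, not with the header row. The `checkRowLength` helper and the sample data are hypothetical, and the wording is the one suggested above.

```javascript
// Hypothetical helper: compares a row's length with the schema's field count.
function checkRowLength(row, fields) {
  if (row.length !== fields.length) {
    throw new Error(
      `The row with ${row.length} columns does not match the ${fields.length} fields in the schema`
    )
  }
}

// Sample schema fields and a row that is one value short.
const fields = [{ name: 'id' }, { name: 'name' }, { name: 'age' }]
let rowLengthMessage = null
try {
  checkRowLength(['1', 'Alice'], fields)
} catch (error) {
  rowLengthMessage = error.message
}
console.log(rowLengthMessage)
```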
To better match the format of the non-matching header error messages in DQS, I suggest
Table headers don\'t match schema field names
is rewritten to,
The column header names do not match the field names in the schema
Row ${rowNumber} has unique constraint violation in column "${cache.name}"
edit that should be applied to DQS and here,
Row ${rowNumber} has a unique constraint violation in column "${cache.name}"
Not implemented in DQS (although there is a suggestion). I don't think I can improve
Foreign key "${foreignKey.fields}" violation in row ${rowNumber}
I did consider:
The foreign key "${foreignKey.fields}" in row ${rowNumber} has a foreign key violation
The foreign key "${foreignKey.fields}" in row ${rowNumber} does not have a matching primary key value
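To make the second wording concrete, here is a rough sketch of the kind of check it describes: each row's foreign key value is looked up among the referenced key values. The `findForeignKeyViolations` helper, the sample data, and the 1-based row numbering are assumptions for illustration, not the library's implementation.

```javascript
// Hypothetical helper: returns the numbers of the rows whose foreign key value
// has no match among the referenced key values.
function findForeignKeyViolations(rows, field, referencedValues) {
  const known = new Set(referencedValues)
  const violations = []
  rows.forEach((row, index) => {
    // Assumption: rows are numbered from 1 and `rows` excludes the header row.
    if (!known.has(row[field])) violations.push(index + 1)
  })
  return violations
}

// Sample data: the second row points at a key that does not exist.
const cityRows = [{ city_id: 1 }, { city_id: 9 }, { city_id: 2 }]
const violations = findForeignKeyViolations(cityRows, 'city_id', [1, 2, 3])
for (const rowNumber of violations) {
  console.log(
    `The foreign key "city_id" in row ${rowNumber} does not have a matching primary key value`
  )
}
```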
@Stephen-Gates Thanks. It's just a great help.
@Stephen-Gates Please take a look. I've:
error.rowNumber/columnNumber properties (if available)
@roll will review this week. thanks.
@Stephen-Gates Thanks! I've fixed the last one.