Closed pbolduc closed 6 years ago
A nice feature would be to raise an event when column parsing or formatting fails, just like when a whole record fails. There'd be an IsHandled
property as well as a Substitution
property (with default (T)
as the default).
Thoughts?
Originally, I was thinking you could use the Preprocessor
on each colum definition, but that feels more like a hack.
I was looking at the changes you have committed. I like the direction you are taking.
Yeah, unfortunately that includes some breaking changes, but I think it's good to rename the Error
event to RecordError
for clarity anyway. I also added similar events to the writer classes, since currently it looks like I am handling record-level errors by throwing ArgumentException
. Yuck.
Benchmarks indicate the extra column events do not have any impact on performance when not used, which is good. I just need to create some additional tests and I think I will be happy.
I am debating doing another major release right after v3 came out. I might just make these 3.1.0 or something.
I think due to renaming Error to RecordError breaks semver, so 3.1 maybe not a great for consumers, however, there a few downloaders at present (less than 50 per version). It could be ok in the short term. From a consumer perspective, it would be good to know the maintainer(s) understand the impact of making break changes in minor version releases.
With the IsNullable
enhancements at the IColumnDefinition
level, I think a v4 is justified. I'm still working on some of the naming (What do you call a method that returns a default value when a null is encountered for a non-nullable column?) but code-wise I think it is complete. I'll start taking a look at your PRs and slate a release once everything's in place.
A method that returns a default value when a null is encountered for a non-nullable column:
in Linq these methods are SingleOrDefault(), FirstOrDefault(), etc If these are public API points, this naming convention should feel natural for users.
The C# guide may provide some ideas default value expressions (C# programming guide)
I guess I am thinking of it more like a DEFAULT constraint in a database:
CREATE TABLE Person
(
-- ...
CreatedOn DATETIME NULL DEFAULT(GETUTCDATE())
)
Then I just call the method to get the substitute GetDefaultValue
or something like that. I have all the code working and am stuck coming up with a name. I'll figure it out.
A method that returns the default value from a column definition? i.e., IColumnDefinition.GetDefaultValue() ? Is there a way to set that on a per instance of the column definition? Different columns, I could see I would want to have different default values based on the column. the base default would be default(T).
new StringColumn("FirstName") // default is string.Empty
new StringColumn("Occupation", DefaultValue = "not specified");
Property or function for the default value?
On a project coming up, we will be receiving multiple files from different sources. We need to validate these files before processing them. For example, ensure number fields have numbers and date/time fields have date/times.
Approaches:
In my use case, I would want to fail the entire file. However, I can think other use case would be people wanting to skip invalid records.
What is your take on validating a flat file for a typed schema?
I will have to think how I would expect an API to work, ie separate SeparatedValueValidatorReader, or option to throw on parse error, add/avoid more responsibilties on the IReaderWithMetadata types.