JockLawrie / Schemata.jl

Schemata for tabular data sets in Julia.
Other
18 stars 2 forks source link

Overhaul error correction #21

Open JockLawrie opened 4 years ago

JockLawrie commented 4 years ago

Currently invalid values are set to missing. We can view this as an error correction function: f(invalid_value) = missing.

Allow other functions. E.g., f(invalid_value) = filter(isascii, invalid_value) Use regexes to define other functions.

JockLawrie commented 4 years ago

Use an ordered sequence of remediation steps, analogous to the ordered linkage criteria in SpineBasedRecordLinkage.jl.

Can restrict this feature to regexes. E.g., a regex can filter out non-ascii characters.