ONSdigital / csvw-check

A CLI to validate CSV-Ws (W3C's CSV on the Web standard).
Apache License 2.0

Issue #78 Implementing Number Format Parser #97

Closed robons closed 2 years ago

robons commented 2 years ago

This PR gets basic Unicode UTS-35 (also known as TR-35 or UAX-35) number format validation working within the standards set out by Unicode and the W3C CSV-W working group. Notably, all W3C CSV-W tests relating to number format validation now pass.

N.B. We are not required to implement the full UTS-35 specification. This PR does not support the padding, rounding or significant digits functionality; however, it is entirely possible to retrofit this functionality at a later point in time, if desired.

"Implementations MUST recognise number format patterns containing the symbols 0, #, the specified decimalChar (or "." if unspecified), the specified groupChar (or "," if unspecified), E, +, % and ‰. Implementations MAY additionally recognise number format patterns containing other special pattern characters defined in [UAX35]." (W3C CSV-W - Number Format Minimum Standards)
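As a rough illustration of the minimum symbol set quoted above, the following Python sketch checks whether a pattern stays within it. It is not the code in this PR; the function name `uses_only_required_symbols` and its parameters are hypothetical, and it only tests character membership, not whether the pattern is well formed.

```python
# Hypothetical helper, not taken from csvw-check: reports whether a number
# format pattern uses only the symbols the W3C text says MUST be recognised.
REQUIRED_SYMBOLS = frozenset("0#E+%‰")


def uses_only_required_symbols(pattern, decimal_char=".", group_char=","):
    """Return True if every character of `pattern` is in the minimum set."""
    # decimal_char and group_char default to "." and "," as in the quoted text.
    allowed = REQUIRED_SYMBOLS | {decimal_char, group_char}
    return all(ch in allowed for ch in pattern)
```

Under these assumptions, a pattern such as `#,##0.00` falls inside the minimum set, whereas one containing a currency sign or a `;` sub-pattern separator would rely on the optional UAX35 extensions.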

N.B. This approach parses all numbers into the arbitrary-precision BigDecimal datatype. This is unlikely to be the most performant approach, but it keeps the conversion of the parsed value from a string into a numeric representation simple.
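To make the string-to-number step concrete, here is a minimal Python sketch of the same idea, with `decimal.Decimal` standing in for BigDecimal (constructing a `Decimal` from a string is exact). The function `parse_number` is hypothetical and is not the parser in this PR: it assumes the value has already been checked against the format pattern, and it does not verify group separator placement or handle % and ‰.

```python
from decimal import Decimal, InvalidOperation


def parse_number(text, decimal_char=".", group_char=","):
    """Convert a formatted numeric string to an exact Decimal, or None."""
    # Drop group separators first, then map the decimal separator to ".".
    normalised = text.replace(group_char, "").replace(decimal_char, ".")
    try:
        return Decimal(normalised)
    except InvalidOperation:
        # The normalised string is not a valid decimal literal.
        return None
```

Removing the group character before substituting the decimal character matters when the two are swapped, for example with `decimal_char=","` and `group_char="."`.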

Newly created issues: #98, #99

This PR satisfies #90.