kowainik / tomland

🏝 Bidirectional TOML serialization
https://kowainik.github.io/posts/2019-01-14-tomland
Mozilla Public License 2.0
120 stars 39 forks source link

Hexidecimal value with underscore #331

Closed dariodsa closed 4 years ago

dariodsa commented 4 years ago

Hi, according to the TOML-0.5.0 version (link) hexadecimal, binary and octal values were introduced. I checked your test modules and you are checking for case insensitivity, negative, non-negative, etc but you missed one sentence in the specification.

Underscores are allowed between digits (but not between the prefix and the value).

So, for example, the following code snippet should pass:

*> Toml.Parser.parse $ "hex3 = 0xdead_beef"
*> Left (TomlParseError {unTomlParseError = "1:19:\n  |\n1 | hex3 = 0xdead_beef\n  |                   ^\nunexpected end of input\nexpecting '-', '.', '=', '_', or alphanumeric character\n"})

The same problem is with binary and octal numbers, classic decimal numbers support underscores.

I noticed the issue #199 is still open. If you want I can look into that and made test cases which will check all changes made from version 0.4.0 to 0.5.0.

chshersh commented 4 years ago

Hi @dariodsa, thanks for reporting the issue! This indeed looks like a bug. It would be nice to add test-cases for such situations and to fix the problem itself, of course 🙂

Regarding issue #199: I've opened issue in the TOML repo itself a while ago about providing testable example:

But there will be no example in the TOML repo. Instead, the implementation needs to be verified with the official compliance test-suite:

So instead of walking through the examples in documentation, it's much better to perform automatic testing, as described in #167. But this also requires more work. I've also closed #199 as a duplicate of #167 due to some recent changes in the toml-lang org.

dariodsa commented 4 years ago

Ok, I will add test cases for such situations and try to fix the problem.

chshersh commented 4 years ago

@dariodsa No worries and thanks for the help!

Just to give some more context, here is the parser for all integer numbers:

https://github.com/kowainik/tomland/blob/316c893a10d1dc652bbcc46f7f8862974941244e/src/Toml/Parser/Value.hs#L44-L52

For binary, octal and hexadecimal numbers the primitive parsers from megaparsec are reused. That's why they don't support underscores. The implementation needs to do something similar to the decimal parser:

https://github.com/kowainik/tomland/blob/316c893a10d1dc652bbcc46f7f8862974941244e/src/Toml/Parser/Value.hs#L33-L42

But I imagine this will be quite a lot of work...

chshersh commented 4 years ago

Fixed by #332