ixc / python-edtf

MIT License

Update natural language parser #48

Closed by ColeDCrawford 6 months ago

ColeDCrawford commented 6 months ago

This PR updates the natural language parser to work with the 2018 spec. As noted in the EDTF docs,

This specification differs from the earlier draft as follows:

  • the unspecified date character (formerly lower case ‘u’) is superseded by the character (upper case) 'X';
  • Masked precision is eliminated;
  • the uncertain and approximate qualifiers, '?' and '~', when applied together, are combined into a single qualifier character '%';
  • “qualification from the left” is introduced and replaces the grouping mechanism using parentheses;
  • the extended interval syntax keywords 'unknown' and 'open' have been replaced with null and the double-dot notation ['..'] respectively;
  • the year prefix 'y' and the exponential indicator 'e', both previously lowercase, are now 'Y' and 'E' (uppercase); and
  • the significant digit indicator 'p' is now 'S' (uppercase).
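To make the list above concrete, here is a rough sketch of how a few simple draft-style strings map onto 2018 syntax. `draft_to_2018` and `_convert_part` are hypothetical helpers written for illustration only; they are not part of python-edtf and are not how this PR implements anything. The sketch assumes long years always start with the `y` prefix, and it rejects parenthesised grouping and masked precision rather than trying to rewrite them.

```python
import re


def _convert_part(part: str) -> str:
    """Convert one date (or one side of an interval) from draft to 2018 syntax."""
    # Interval keywords: 'unknown' becomes null (empty), 'open' becomes '..'.
    if part == "unknown":
        return ""
    if part == "open":
        return ".."
    # Parenthesised grouping was replaced by "qualification from the left";
    # this sketch does not attempt that rewrite.
    if "(" in part or ")" in part:
        raise ValueError(f"grouping with parentheses is not handled: {part!r}")
    # Masked precision (e.g. '196x') was eliminated in the 2018 spec.
    if re.search(r"\dx", part):
        raise ValueError(f"masked precision was eliminated: {part!r}")
    # '?' and '~' applied together collapse into the single qualifier '%'.
    part = part.replace("?~", "%")
    if part.startswith("y"):
        # Long years: 'y' -> 'Y', 'e' -> 'E', 'p' -> 'S'.
        return part.upper().replace("P", "S")
    # Unspecified digits: lower case 'u' -> upper case 'X'.
    return part.replace("u", "X")


def draft_to_2018(text: str) -> str:
    """Convert a simple draft-style date or interval to 2018 syntax."""
    return "/".join(_convert_part(p) for p in text.split("/"))


if __name__ == "__main__":
    for old in ["199u", "2004-06-11?~", "unknown/2006", "2004-06-01/open", "y17101e4p3"]:
        print(repr(old), "->", repr(draft_to_2018(old)))
```

Raising on parentheses mirrors the idea that old-spec grouping inputs should fail loudly rather than be silently reinterpreted.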

Checklist for the PR:

ColeDCrawford commented 6 months ago

@aweakley do you want to remove the old parentheses grouping examples from BAD_EXAMPLES here? Those are from the old spec.

Do we have any examples of the difference between null and '..' for extended intervals? It doesn't look like the current version of python-edtf supports the open-interval syntax.

aweakley commented 6 months ago

Do the parentheses ones make things more complex? I quite like that it's clear from the tests that inputs like that will raise an error, but if it'll make our lives easier to get rid of them then I'm happy with that.

ColeDCrawford commented 6 months ago

> Do the parentheses ones make things more complex? I quite like that it's clear from the tests that inputs like that will raise an error, but if it'll make our lives easier to get rid of them then I'm happy with that.

Nope, they fail just fine as they currently are so I'm also happy to leave them.

ColeDCrawford commented 6 months ago

Not sure why the tests are hanging but they look like they are passing if you click in.

I've updated the tests, specifically the EDTF Level 1 "Qualification of a date (complete)" examples. These now parse as UncertainOrApproximate objects.
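For anyone unfamiliar with that part of the spec, here is a minimal standalone sketch of what a Level 1 qualified date carries. `QualifiedDate` and `parse_qualified` are hypothetical names used only for illustration; this is not python-edtf's UncertainOrApproximate class or its parser, just a regex-based approximation that assumes a plain `YYYY`, `YYYY-MM` or `YYYY-MM-DD` date followed by one trailing qualifier.

```python
import re
from dataclasses import dataclass
from typing import Optional

# A complete date at year, month or day precision, then one of '?', '~', '%'.
_QUALIFIED = re.compile(r"^(\d{4}(?:-\d{2}(?:-\d{2})?)?)([?~%])$")


@dataclass
class QualifiedDate:
    """Hypothetical container: a date string plus its qualification flags."""
    date: str
    uncertain: bool
    approximate: bool


def parse_qualified(text: str) -> Optional[QualifiedDate]:
    """Return a QualifiedDate, or None if the input is not a qualified date."""
    match = _QUALIFIED.match(text)
    if match is None:
        return None
    date, qualifier = match.groups()
    # '?' means uncertain, '~' means approximate, '%' means both.
    return QualifiedDate(
        date=date,
        uncertain=qualifier in "?%",
        approximate=qualifier in "~%",
    )


if __name__ == "__main__":
    for text in ["1984?", "2004-06~", "2004-06-11%"]:
        print(text, "->", parse_qualified(text))
```

Old-spec inputs such as `(2011)-06-04~` do not match the pattern and return None, which is in the same spirit as keeping them in BAD_EXAMPLES.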

aweakley commented 6 months ago

This looks great, thank you.