Closed codykingham closed 4 years ago
This would also eliminate another dependency, which we should seek to do as much as possible.
We shouldn't get rid of Sly. It's a good solution. Instead, the Sly parser code should be updated and simplified.
We have now made good progress on the parser; see https://github.com/CambridgeSemiticsLab/nena_corpus/blob/master/parse_nena/NenaParser2.ipynb
Next steps to implement the parser include:
The new parser is complete in 84453e577c9a63a22c61580ed4cf71f00307a47a. There may be some edge cases to account for in the future. We'll keep an eye on that and update the parser as needed. For now, the bulk of the code is in place.
Currently we use `sly` to parse `.nena` texts. But `sly` seems prone to superfluous error messages and sensitive to inconsistencies. Would it be better to instead write a class that can ingest and validate `.nena` texts without `sly`? That would give us more control over what the parser should and should not choke on.
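To make the idea of a hand-written ingest-and-validate class concrete, here is a minimal sketch. It is not the project's parser and does not reflect the real `.nena` grammar: it assumes a hypothetical, simplified layout of `key:: value` header lines, a blank line, then body text. The names `SimpleTextValidator`, `ParsedText`, and `required_keys` are invented for illustration. The point it shows is the control mentioned above: the class itself decides which inconsistencies are tolerated (recorded as warnings) and which make the text invalid (recorded as errors).

```python
from dataclasses import dataclass, field


@dataclass
class ParsedText:
    """Result of ingesting one text: metadata, body lines, and problems found."""
    metadata: dict = field(default_factory=dict)
    body: list = field(default_factory=list)
    warnings: list = field(default_factory=list)  # tolerated inconsistencies
    errors: list = field(default_factory=list)    # problems that invalidate the text

    @property
    def is_valid(self):
        return not self.errors


class SimpleTextValidator:
    """Hypothetical hand-written ingester; the format rules here are assumptions."""

    # Assumed for illustration only: metadata keys every text must carry.
    required_keys = ("title",)

    def ingest(self, source):
        result = ParsedText()
        in_header = True
        for lineno, raw in enumerate(source.splitlines(), start=1):
            line = raw.strip()
            if in_header:
                if not line:
                    # The first blank line ends the header block.
                    in_header = False
                    continue
                key, sep, value = line.partition("::")
                if not sep:
                    result.errors.append(
                        f"line {lineno}: expected 'key:: value' in header")
                    continue
                key = key.strip()
                if key in result.metadata:
                    # A duplicate key is tolerated: warn and keep the last value.
                    result.warnings.append(
                        f"line {lineno}: duplicate key '{key}', keeping last value")
                result.metadata[key] = value.strip()
            elif line:
                result.body.append(line)
        for key in self.required_keys:
            if key not in result.metadata:
                result.errors.append(f"missing required metadata key '{key}'")
        if not result.body:
            result.errors.append("text has no body")
        return result


sample = "title:: A Tale\ntitle:: A Tale (revised)\n\nFirst line of the text.\n"
result = SimpleTextValidator().ingest(sample)
print(result.is_valid, result.warnings, result.errors)
```

Collecting problems in lists instead of raising on the first one means a whole text can be checked in a single pass, and moving a rule between "warning" and "error" is a one-line change.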