Open Diaoul opened 9 years ago
@Toilal @wackou: I want to write my own robust subtitle parser and will likely create a new library for it that handles various formats. I'm looking for the right tool for the job. All subtitle formats seem to have a defined grammar, which should make parsing straightforward. There are various technologies for that (PEG parsers, lexer and parser generators such as Lex and Yacc, and so on). Would you recommend one for this kind of work?
I saw various tools such as pyparsing, PLY, pyPEG and parsimonious. I wonder if rebulk would be able to do that? There's no decision making so I think it's not the right tool. There is also the possibility to have my own basic parser based on str and re.
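To make the last option concrete, here is a minimal sketch of a basic parser built only on `str` and `re`. It assumes SubRip (`.srt`) input, which is my choice for illustration since no format has been picked yet, and the names `Cue`, `to_ms` and `parse_srt` are hypothetical. It is not meant as a robust implementation, only as a baseline to compare the grammar-based tools against.

```python
import re
from dataclasses import dataclass

# SubRip timing line, e.g. "00:00:01,000 --> 00:00:04,000".
# Accepting "." as well as "," before the milliseconds is a leniency assumption.
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    index: int
    start_ms: int
    end_ms: int
    text: str


def to_ms(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content):
    """Parse SubRip text into a list of Cue objects, raising ValueError on bad blocks."""
    content = content.replace("\r\n", "\n").strip()
    if not content:
        return []
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content):
        lines = block.split("\n")
        if len(lines) < 2 or not lines[0].strip().isdigit():
            raise ValueError(f"expected a cue index, got: {lines[0]!r}")
        match = TIMING.fullmatch(lines[1].strip())
        if match is None:
            raise ValueError(f"expected a timing line, got: {lines[1]!r}")
        fields = match.groups()
        cues.append(
            Cue(
                index=int(lines[0]),
                start_ms=to_ms(*fields[:4]),
                end_ms=to_ms(*fields[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues
```

Calling `parse_srt(open(path, encoding="utf-8").read())` would return the cues in file order. The obvious limitation is that every format needs its own hand-written loop and error handling, which is where a grammar-based tool might pay off.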
Ideas are welcome :fish_cake:
Do you have examples and/or specs for those formats ?
Rebulk is meant for "short input" and "pseudo-natural" language. I don't think it's the right tool to parse a structured file. It's designed to define patterns (string, regex or functional) that are scanned across the whole input string, to retrieve consistent match objects from those different types of patterns, and to filter out false positives with rules based on the relations between those matches.
I've never used the parsers you mentioned in Python, sorry :)
You can find examples here:
Will require switching to pycaption for validation
Not compatible with Python 3, abandoned project?