Bridgeconn / usfm-grammar

An elegant USFM parser.
https://usfmgrammar.vachanengine.org/
MIT License

Create a relax mode for parsing #52

Closed kavitharaju closed 4 years ago

kavitharaju commented 4 years ago

Write a simpler grammar with more generalized rules to model USFM. It should be able to handle USFM 1.x, 2.x and 3.x, and should successfully parse files with minor errors that the regular mode would reject.

kavitharaju commented 4 years ago
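const grammar = require('usfm-grammar');

// Placeholder: replace with the contents of your USFM file.
const input = '****usfm-string****';

// Passing LEVEL.RELAXED selects the relaxed grammar instead of the regular one.
const myUsfmParser = new grammar.USFMParser(input, grammar.LEVEL.RELAXED);
const jsonOutput = myUsfmParser.toJSON();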

The relaxed mode relaxes several rules in the USFM spec and gives you JSON output for any file that can be considered a workable USFM file. Only the most important markers are checked, such as the \id at the start and the presence of \c and \v markers. All the other markers are parsed and included in the JSON output, but neither their syntax nor their position in the file is verified. Even a misspelled marker would be accepted. So, as a word of caution, mistakes may go unnoticed and might even lead to loss of information. For example, if the file mistakenly omits the space between a verse marker and its verse number and has \v3, the parser may accept it as a different marker, v3, and fail to recognise that it is a verse.
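Since relaxed mode may let a fused marker like \v3 through, one option is to scan the raw text yourself before parsing. The snippet below is only a rough sketch of that idea: findFusedVerseMarkers is a hypothetical helper written for this example, not something usfm-grammar provides, and it only catches the specific "\v directly followed by digits" case.

```javascript
// Hypothetical pre-check, NOT part of usfm-grammar: report places where a
// verse marker is fused with its number (e.g. "\v3" instead of "\v 3").
function findFusedVerseMarkers(usfm) {
  const issues = [];
  const lines = usfm.split(/\r?\n/);
  lines.forEach((text, index) => {
    // "\v" followed directly by digits, with no separating space.
    const pattern = /\\v(\d+)/g;
    let match;
    while ((match = pattern.exec(text)) !== null) {
      issues.push({
        line: index + 1,          // 1-based line number in the input
        marker: match[0],         // the fused marker as written, e.g. "\v2"
        verse: Number(match[1]),  // the verse number that was probably meant
      });
    }
  });
  return issues;
}

// Small made-up sample: the last line has the fused marker.
const sample = [
  '\\id GEN',
  '\\c 1',
  '\\v 1 first verse text',
  '\\v2 second verse text',
].join('\n');

console.log(findFusedVerseMarkers(sample));
```

If the returned list is non-empty, you could fix those lines (or at least warn about them) before handing the text to the parser in relaxed mode.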