mvahowe / proskomma-js

A JS Implementation of the Proskomma Scripture Processing Model
MIT License

Which parser for USFM? #119

Closed jdejoode closed 3 years ago

jdejoode commented 3 years ago

Hi @mvahowe

Very quick question:

index.js in the USFM parser looks like it has an argument that is the actual parser: https://github.com/mvahowe/proskomma-js/blob/072d2c25c7f24b8255b894ae173e3fc733938720/src/parser/lexers/usfm/index.js#L9

Which parser should you feed to this function? Is it the SAX one defined in the usx directory?

I was looking for a test like 'readUSFMfile.js'. Do you have something that comes close to that?

Thanks

mvahowe commented 3 years ago

Hi @jdejoode!

The parser in this case is https://github.com/mvahowe/proskomma-js/blob/072d2c25c7f24b8255b894ae173e3fc733938720/src/parser/index.js#L13

There's no USFM parser as such. USFM and USX both get lexed into the same set of preTokens, and those preTokens are then fed into the code above. That code defines the overall processing strategy and some arbitrary edge cases, but most of the per-tag logic is defined in https://github.com/mvahowe/proskomma-js/blob/072d2c25c7f24b8255b894ae173e3fc733938720/src/parser/parser_specs.js#L34
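To illustrate the shape of that pipeline, here is a minimal sketch of two lexers that emit one shared preToken format, with a single spec-driven parser consuming either stream. Every name here (`lexUsfm`, `lexUsx`, `toySpecs`, `parsePreTokens`, the `{ type, value }` token shape) is hypothetical and is not Proskomma's actual API. The toy USX lexer uses a regex rather than SAX to keep the example self-contained, and only chapter, verse, and word tokens are handled.

```javascript
// Hypothetical sketch only: these names and token shapes are NOT Proskomma's API.

// Toy USFM lexer: recognizes \c N, \v N and plain words.
function lexUsfm(text) {
  const preTokens = [];
  for (const m of text.matchAll(/\\(c|v)\s+(\d+)|([^\\\s]+)/g)) {
    if (m[1]) {
      preTokens.push({ type: m[1] === "c" ? "chapter" : "verse", value: m[2] });
    } else {
      preTokens.push({ type: "word", value: m[3] });
    }
  }
  return preTokens;
}

// Toy USX lexer: recognizes <chapter number="N"/>, <verse number="N"/> and plain words.
// (A real implementation would use an XML/SAX parser; a regex keeps the sketch short.)
function lexUsx(xml) {
  const preTokens = [];
  for (const m of xml.matchAll(/<(chapter|verse)\s+number="(\d+)"\s*\/>|([^<\s]+)/g)) {
    if (m[1]) {
      preTokens.push({ type: m[1], value: m[2] });
    } else {
      preTokens.push({ type: "word", value: m[3] });
    }
  }
  return preTokens;
}

// Per-token logic lives in a spec table; the parser loop itself stays generic.
const toySpecs = [
  { test: (t) => t.type === "chapter", apply: (state, t) => { state.chapter = t.value; } },
  { test: (t) => t.type === "verse", apply: (state, t) => { state.verse = t.value; } },
  {
    test: (t) => t.type === "word",
    apply: (state, t) => {
      const ref = `${state.chapter}:${state.verse}`;
      state.verses[ref] = state.verses[ref] || [];
      state.verses[ref].push(t.value);
    },
  },
];

// Format-agnostic parser: it only ever sees preTokens, never raw USFM or USX.
function parsePreTokens(preTokens, specs) {
  const state = { chapter: null, verse: null, verses: {} };
  for (const token of preTokens) {
    const spec = specs.find((s) => s.test(token));
    if (spec) spec.apply(state, token);
  }
  return state.verses;
}

const fromUsfm = parsePreTokens(lexUsfm("\\c 1 \\v 1 In the beginning"), toySpecs);
const fromUsx = parsePreTokens(
  lexUsx('<chapter number="1"/><verse number="1"/>In the beginning'),
  toySpecs
);
console.log(fromUsfm, fromUsx);
```

Both inputs produce the same result object, because the parser depends only on the preToken stream and the spec table, not on the source format. Swapping in a different spec table would change the per-tag behaviour without touching the lexers.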

The plan for post-1.0 is to clean up the core parser engine and to support multiple parser specs.