Open davidbaines opened 1 year ago
This fix might also resolve Issue #157
Hi @davidbaines, can you verify whether Strong's numbers are now removed with the updates to translate.py? Since bulk_extract_corpora.py was already using the machine.py parser, as you said, I would assume that there is no longer an issue.
At least three scripts have to extract verse data from Paratext projects or USFM files: translate.py, bulk_extract_corpora.py, and extract_corpora.py.
translate.py does not properly remove Strong's numbers from projects that include them, whereas bulk_extract_corpora.py does. It would be good to check that all the parts of the pipeline that extract verse text from USFM use the same code.
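
For illustration only, here is a rough sketch of the kind of shared helper each script could call. It assumes the Strong's numbers appear as USFM word-level attributes, e.g. `\w God|strong="H430"\w*`. The function name `strip_word_attributes` is hypothetical and this is not how the machine.py parser does it; it is just a regex stand-in to show what "same code everywhere" would mean in practice.

```python
import re

# Matches USFM word-level markup such as \w text|strong="H430"\w*
# (and the nested form \+w ... \+w*). The whitespace after "w" is required,
# so other markers that start with "w" (e.g. \wj) are left untouched.
_W_MARKER = re.compile(r"\\\+?w\s+([^|\\]*?)(?:\|[^\\]*?)?\\\+?w\*")


def strip_word_attributes(verse_text: str) -> str:
    """Keep the word text and drop the \\w markers and their attributes.

    Hypothetical helper: assumes Strong's numbers are only present as
    attributes inside \\w ... \\w* markers.
    """
    return _W_MARKER.sub(r"\1", verse_text)


if __name__ == "__main__":
    sample = r'\w In the beginning|strong="H7225"\w* \w God|strong="H430"\w* created'
    print(strip_word_attributes(sample))
```

If translate.py, bulk_extract_corpora.py and extract_corpora.py all went through one function like this (or, better, through the same parser), the output for a given project could not differ between them.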