Closed DavidHaslam closed 7 years ago
Please pass the script for fixing this automatically :)
Thanks for finding this.
My observation was made on a concatenated USFM file containing the data for all 66 books.
The corrections should be feasible using a suitable regex search and replace operation.
If I were using TextPipe (in Windows), I'd use the following PCRE replace filter:
Perl pattern [(\\x .+\\x\*)(\\nd\*)] with [$2$$1]
[X] Match case
[ ] Whole words only
[ ] Case sensitive replace
[ ] Prompt on replace
[ ] Skip prompt if identical
[ ] First only
[ ] Extract matches
Maximum text buffer size 4096
[ ] Maximum match (greedy)
[ ] Allow comments
[ ] '.' matches newline
[X] UTF-8 Support
[ ] Process longest strings first
[ ] Simultaneous search
[ ] Log summary only
The search pattern should be non-greedy, as there may be several verses where the pattern occurs more than once.
NB. The replace filter also has to be restricted to within the PCRE pattern \\nd .+\\nd\*
The above TextPipe filter has been written and tested successfully on my merged.usfm file.
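For anyone without TextPipe, the same two-step idea could be sketched in Python. This is only a rough equivalent under stated assumptions, not the filter that was tested: the function name move_xref_after_nd is made up for illustration, and the replacement is assumed to simply swap the two captured groups so that \nd* comes before the cross-reference. Both patterns are non-greedy and do not cross line breaks, matching the unchecked "greedy" and "'.' matches newline" options above.

```python
import re

# Outer restriction: one name-of-deity element, from "\nd " to the first "\nd*".
ND_SPAN = re.compile(r"\\nd .+?\\nd\*")

# Inner pattern: a cross-reference element immediately followed by "\nd*".
XREF_THEN_ND_CLOSE = re.compile(r"(\\x .+?\\x\*)(\\nd\*)")


def move_xref_after_nd(text: str) -> str:
    """Move any \\x ...\\x* element that sits just before \\nd* to just after it.

    Hypothetical helper: the swap is applied only inside each \\nd ...\\nd* span.
    """
    def fix_span(match: re.Match) -> str:
        # Swap the two groups: closing \nd* first, then the cross-reference.
        return XREF_THEN_ND_CLOSE.sub(r"\2\1", match.group(0))

    return ND_SPAN.sub(fix_span, text)
```

To apply it, read the USFM text as UTF-8, pass it through move_xref_after_nd, and write the result back. Restricting the swap to each \nd ...\nd* span keeps a correctly placed cross-reference earlier on the same line from being swept into the match.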
I've not forked the repo so far because I was primarily thinking of analysis tasks rather than making changes.
I'm simply aiming to anticipate issues that would need to be addressed before updating the SWORD module in CrossWire Main. cf. The module Ndebele was made almost 8 years ago.
SwordVersionDate=2009-11-01
TextSource=http://sites.google.com/site/bibletranslationdata/
If you'd like to fork it at some point, make the required changes, and open a pull request, I can review it and merge it into the repo. That would be great!
Forked and cloned. Currently working on the systematic fix.
Whew!
In the whole work, there are 653 instances of the pattern
\x*\nd*
where the cross-reference element is within the name of deity marker. Any cross-reference element should come after the
\nd*
marker. cf. There are 50 instances of the correct pattern
\nd*\x
Examples:
Correct syntax:
iN\nd kosi\nd*\x + 3.17.\x*.
Incorrect syntax:
eN\nd kosi\x + 19.19. Eks. 33.12,13,16,17. Luka 1.30. Seb. 7.46.\x*\nd*.
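Counts like those above could be taken with a plain substring search. A minimal sketch, assuming the USFM text is already loaded into a string; the function name count_marker_orders is hypothetical, and the two search strings are the literal sequences \x*\nd* and \nd*\x quoted in this issue.

```python
def count_marker_orders(text: str) -> tuple[int, int]:
    """Return (correct, incorrect) counts of the two marker orders.

    correct:   "\\nd*\\x"   (cross-reference opens right after \\nd* closes)
    incorrect: "\\x*\\nd*"  (cross-reference closes right before \\nd* closes)
    """
    correct = text.count("\\nd*\\x")
    incorrect = text.count("\\x*\\nd*")
    return correct, incorrect
```

Running this before and after a fix gives a quick check: the incorrect count should drop to zero and the correct count should rise by the same amount.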