DavidHaslam closed this issue 7 years ago
I have just fixed this systematically in my fork of master.
I have a TextPipe filter that can process all 66 files systematically.
See also issue #26
After merges such as the recent #81, I generally run my TextPipe filter to apply this fix locally before I build my concatenated USFM file or convert the USFM files to OSIS XML.
This was fixed by merging pull request #96
A search of the concatenated USFM files for the regex pattern \x20{2,} gave 1678 matches.
Most such instances of "multiple whitespace" are at the end of a line. However, there are some instances where it occurs "mid-verse". Some occur in section headings, between the \s tag and the text.
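For anyone who wants to reproduce the breakdown above, here is a rough Python sketch that counts matches of the same pattern and sorts them into the three positions described. The function name `classify_multi_space` and the bucket names are made up for illustration, and the "after \s tag" check assumes the heading marker (`\s`, `\s1`, `\s2`, ...) starts the line. It is not the search that produced the 1678 figure.

```python
import re

# Same pattern as in the search above: two or more consecutive U+0020 spaces.
MULTI_SPACE = re.compile(r"\x20{2,}")

def classify_multi_space(text):
    """Count runs of multiple spaces by where they occur on the line.

    Bucket names are hypothetical, chosen only for this sketch.
    """
    counts = {"end_of_line": 0, "after_s_tag": 0, "mid_line": 0}
    for line in text.splitlines():
        for m in MULTI_SPACE.finditer(line):
            if m.end() == len(line):
                # The run extends to the end of the line.
                counts["end_of_line"] += 1
            elif re.fullmatch(r"\\s\d*", line[:m.start()]):
                # Everything before the run is just a section-heading marker.
                counts["after_s_tag"] += 1
            else:
                counts["mid_line"] += 1
    return counts
```

Calling it on the text of a concatenated USFM file returns a dict of counts whose total should equal the number of regex matches.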
Many Unicode text editors have a facility to replace each instance of "multiple whitespace" by a single space. This should also get rid of the 9 spurious tab characters.
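For those without such an editor, the same clean-up could be scripted. Below is a minimal Python sketch, not the TextPipe filter mentioned in this thread. The function name `collapse_whitespace` is hypothetical, and two choices are my own assumptions: tabs are treated as whitespace to be collapsed, and any single space left at the end of a line is stripped as well.

```python
import re

def collapse_whitespace(text):
    """Collapse runs of spaces/tabs to one space and trim line ends.

    Assumptions for this sketch: tabs count as whitespace, and trailing
    spaces are removed entirely rather than reduced to one space.
    """
    cleaned = []
    for line in text.splitlines():
        # Replace every run of spaces and/or tabs with a single space.
        line = re.sub(r"[\x20\t]+", " ", line)
        # Drop whatever single space is left at the end of the line.
        cleaned.append(line.rstrip(" "))
    # Keep the final newline if the input had one.
    return "\n".join(cleaned) + ("\n" if text.endswith("\n") else "")
```

Note that line endings are written back as `\n`, so files with CRLF endings would be converted. Adjust if that matters for the repository.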