funderburkjim closed this issue 1 year ago
The work is done in the lsnum1 directory.
Before this work, 32113 matches in 31909 lines for "<ls>[0-9]"
After 5 days, 29072 matches in 28887 lines for "<ls>[0-9]"
So about 10% done.
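For anyone wanting to reproduce these counts, here is a minimal sketch of how the two figures (matches and matching lines) and the percentage could be computed. It is not the script used in the lsnum1 directory; the function names `count_orphans` and `percent_done` and the sample lines (including the abbreviation in them) are hypothetical, and only the pattern `<ls>[0-9]` and the two totals come from the comment above.

```python
import re

# Pattern quoted in this thread: an <ls> tag whose content starts with a digit.
ORPHAN = re.compile(r"<ls>[0-9]")

def count_orphans(lines):
    """Return (total matches, number of lines with at least one match)."""
    matches = 0
    matching_lines = 0
    for line in lines:
        n = len(ORPHAN.findall(line))
        matches += n
        if n:
            matching_lines += 1
    return matches, matching_lines

def percent_done(before, after):
    """Share of the original matches that have been removed, in percent."""
    return 100.0 * (before - after) / before

# Made-up sample lines, for illustration only (not taken from pwg.txt).
sample = [
    "<ls>12, 3.</ls> text <ls>4.</ls>",
    "<ls>MBH. 1, 2.</ls>",
    "<ls>7</ls>",
]
print(count_orphans(sample))                  # (3, 2)
print(round(percent_done(32113, 29072), 1))   # 9.5
```

With the totals reported above, the reduction is 3041 of 32113 matches, which is where the "about 10%" comes from.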
change_pwg_1.txt has the first batch of changes.
Estimate about 3 months to complete!
A program was written to automate some changes. About 8000 such changes were made. See change_pwg_2.txt.
About 21000 type 1 orphans remain.
change_pwg_3.txt has another 5000 changes. About 16000 type 1 orphans remain.
PWG is in the top 5, so no amount of work on its markup can be too much. Thanks, Jim!
> Estimate about 3 months to complete
Huge one.
change_pwg_4.txt has another 5000+ changes. About 9700 type 1 orphans remain.
change_pwg_5.txt has another 4000+ changes. About 6000 type 1 orphans remain.
change_pwg_6.txt has another 4000+ changes. About 4000 type 1 orphans remain.
change_pwg_7.txt has another 3600+ changes. About 1500 type 1 orphans remain.
Now there remain only a handful of type 1 numeric orphans.
It is certain that, in the reduction of cases from the initial number of 32000 to the current 58, some errors were made.
In this exercise, many other markup deficiencies were noticed; hopefully they will be addressed in the future. But for now a break from this tedious work is in order. Closing this issue.
Is there a way that someone else can access the "temp_xxx" file(s), @funderburkjim?
I would like to see if I can be of some help on the 22 cases mentioned in the readme file.
File temp_lsnum1_1.txt shows 81 items
- 59 are unresolved <ls>{number} -- nothing more to do with these
- the rest (22) are errors in ls
@Andhrabharati those 22 were resolved by me. See the last two sections (temp_change_pwg_8x and 8x1) of change_pwg_8.txt
> those 22 were resolved by me.

But are there still others left, @funderburkjim?
This work aims primarily to improve markup in cases like:
Sometimes these orphans are hidden in long strings of numbers, such as
The current work handles many of these hidden numeric orphans, but only incidentally. The focus of this work is on the (unhidden) numeric orphans. Additional work (with somewhat different techniques) will be required to resolve the hidden numeric orphans.