Closed gasyoun closed 1 year ago
This problem is now corrected. It affects the display in about 500 lines.
The way the links come about in the display is that they are added at run time (basicadjust.php).
The program looks for matches of a regex.
The regex before the correction: ' ([0-9]+)[ ,]+([0-9]+)'
Note that the match starts with a space ' '.
The regex after the correction: ([ (])([0-9]+)[ ,]+([0-9]+)
Now match starts with space OR '('.
In your examples, the missing links started with '(' and no space; that's why they were missed. Now the '(' is accepted as valid prior character, so the links are present in the display.
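To illustrate the difference, here is a minimal Python sketch (the display itself runs in PHP, so this is only an approximation). The leading space in the old pattern is assumed from the remark that the match starts with a space, and the sample strings are made up for the demonstration.

```python
import re

# Patterns as quoted in this thread. The leading space in OLD is an
# assumption based on the note that the match "starts with space".
OLD = re.compile(r" ([0-9]+)[ ,]+([0-9]+)")
NEW = re.compile(r"([ (])([0-9]+)[ ,]+([0-9]+)")

# Hypothetical sample strings, not taken from gra.txt.
samples = ["see 464,1 here", "(464,1)"]

for text in samples:
    # Report whether each pattern finds a link candidate in the text.
    print(repr(text), "old:", bool(OLD.search(text)), "new:", bool(NEW.search(text)))
```

Only the new pattern finds the reference that directly follows '('.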
It affects the display in about 500 lines.
Thanks.
A few more.
soma [p= 1579]
soma, m., Soma, Saft der Somapflanze
the '4. 5. 6.' cannot readily be linked. These are different verses of the prior 464,1 link, so the absence of links for 4. etc is not serious.
Notice that 483.2. 3 is similar: the 3 is not linked.
the '4. 5. 6.' cannot readily be linked. These are different verses of the prior 464,1 link, so the absence of links for 4. etc is not serious.
That I understand. It is a harder case. I wonder if VedaWeb handled them, but I can't find it there. Their search works worse than ours. Can we count how many unlinked numbers there are?
The '675. 6' link can be corrected. The link doesn't show because of the period '.', which should be a comma. In this case, the period is a typo.
Here is the regex that the display module uses to detect link patterns:
|([ (])([0-9]+)[ ,]+([0-9]+)|
In summary:
So ' 123.4' does NOT match, but ' 123, 4' does match.
There are several similar cases (regex=[0-9][0-9][0-9][.] [0-9]).
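A search for such candidates could be sketched as follows in Python. The pattern is the one quoted above; the function name find_candidates and the sample lines are hypothetical, and each hit would still need checking against the scan before being changed.

```python
import re

# Pattern quoted in the thread: three digits, a period, a space, a digit.
PERIOD_TYPO = re.compile(r"[0-9][0-9][0-9][.] [0-9]")

def find_candidates(lines):
    """Return (line_number, matched_text) pairs for possible period-for-comma typos."""
    hits = []
    for lnum, line in enumerate(lines, start=1):
        for m in PERIOD_TYPO.finditer(line):
            hits.append((lnum, m.group(0)))
    return hits

# Hypothetical sample lines, not taken from gra.txt.
sample = ["soma 675. 6 und 464,1", "kein Treffer 12. 5"]
print(find_candidates(sample))
```

Note that the pattern requires three digits before the period, so shorter hymn numbers would need a looser pattern.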
gra:10160,soma;616,ap;1582,Ayus;2097,UDar;4544,draviRa;4715,DIra; 5079,nu; 5085,nf; 5382,paSu; 5549,pur; 5905,praTama; 6268,brahman; 6401,Buvana; 8890,SUra; 10036,suzwuti; 10132,sfj
These have been corrected.
Some of these corrections are classified as print changes, since the scanned image shows a period. See gra_printchange.txt.
Can we count how many unlinked numbers there are?
I filtered gra.txt (86000+ lines). Got matches in 2300 lines. The first few lines of the result (in Emacs):
The highlighted areas match the regex.
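For anyone who wants to repeat such a count outside Emacs, a rough Python sketch follows. The filter pattern here is an assumption (a number, a period, a space, then another number); the exact regex used for the count above is not stated in this thread, so the result may differ from 2300.

```python
import re

def count_matching_lines(lines, pattern):
    """Count lines that contain at least one match of pattern."""
    rx = re.compile(pattern)
    return sum(1 for line in lines if rx.search(line))

# Hypothetical filter, NOT necessarily the one used for the count above:
# a number, a period, a space, then another number.
UNLINKED = r"[0-9]+[.] [0-9]+"

# Small made-up sample; the first two lines contain a trailing number.
sample = ["464,1. 4. 5. 6.", "548,12. 5", "34,7"]
print(count_matching_lines(sample, UNLINKED))
```

To run it on the real file, pass an open file object instead of the sample list, for example `count_matching_lines(open("gra.txt", encoding="utf-8"), UNLINKED)`; the encoding is also an assumption.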
In '548,12. 5' we'll get a link at '548,12', but not a link at the ending '5'.
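This behaviour can be reproduced with the display regex in a short Python sketch. The leading space in the sample text is added here so that the prefix group has something to match; the variable names are only illustrative.

```python
import re

# Display regex as quoted in this thread.
LINK = re.compile(r"([ (])([0-9]+)[ ,]+([0-9]+)")

# Leading space added so the prefix group ([ (]) can match.
text = " 548,12. 5"

# Collect (hymn, verse) pairs for every link candidate found.
links = [(m.group(2), m.group(3)) for m in LINK.finditer(text)]
print(links)
```

The trailing '5' is preceded by a space but has no second number after it, so the pattern finds nothing to pair it with.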
Some of these corrections are classified as print changes, since the scanned image shows a period.
Thanks, a good catch.
Got matches in 2300 lines.
So at least 2300 unlinked cases, interesting.
I've tagged all such numbers still present in the CDSL version with { }; a simple script by Jim is enough to add new ls entities or to extend ('pad') the existing ls entries suitably.
As such, this issue could be closed once he takes up my file.
'34,7' and '923,11' should be linked, but are not. @funderburkjim any clue?