Closed funderburkjim closed 7 years ago
The cintAmaRi error mentioned above is corrected. It is now one of the extra headwords associated with tattvacintAmaRi.
Do we have a single .txt file where all these additional words are intermingled with original ones? Not on web, but in a text or XML document, so I can get a full combined list of them, Jim?
In a nutshell, @gasyoun is asking to regenerate sanhw1.txt and sanhw2.txt. :-)
For acc, the file acchw.txt has all the headwords, normal and alternate.
As mentioned, these are also in sanhw1/2, which have been regenerated.
This issue was discovered in the course of working on another issue.
A correction needs to be made to ACC, but preliminary work is required in order to do this change while maintaining stability of L-numbers.
cintAmaRi obstacle and L-numbers
During examination of the cases mentioned above, the following headword error was noticed:
While this cintAmaRi possibly should be classified as an Alternate headword, it definitely should not be a normal headword.
One aspect of correcting this is simple:
However, if this change flows through the current system, then we'll have a shift of L-numbers for all the thousands of headwords following this dropped headword. This is because in the current system for acc, the L-numbers are determined dynamically based on the <HI> sequence number in acc.txt.
We've decided that fixed L-numbers are better than dynamic L-numbers. This is a goal for all dictionaries. But currently this goal is implemented only for SCH (recently) and MW.
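To make the shift concrete, here is a minimal Python sketch of the difference between dynamic and fixed L-numbers. It is not the actual acc or SCH code: the function names and the placeholder headwords hw_a, hw_b, hw_c are hypothetical, and the real acc.txt record format is not modeled.

```python
def number_dynamically(headwords):
    """Dynamic scheme: the n-th headword in file order gets L-number n."""
    return [(n, hw) for n, hw in enumerate(headwords, start=1)]


def drop_entry(records, headword):
    """Fixed scheme: remove a record but leave every other L-number untouched."""
    return [(lnum, hw) for lnum, hw in records if hw != headword]


# Hypothetical headword sequence; only cintAmaRi comes from the issue.
before = ["hw_a", "cintAmaRi", "hw_b", "hw_c"]

# Dynamic: drop the headword from the file, then renumber by position.
dynamic_after = number_dynamically([hw for hw in before if hw != "cintAmaRi"])

# Fixed: numbers were assigned once and stored; dropping leaves a gap at 2.
fixed_after = drop_entry(number_dynamically(before), "cintAmaRi")

print(dynamic_after)  # hw_b and hw_c have moved down by one
print(fixed_after)    # hw_b and hw_c keep their earlier numbers
```

With dynamic numbering, every headword after the dropped one gets a new L-number. With stored numbers, the only visible change is a gap where the dropped headword used to be.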
We should adapt the SCH scheme to ACC before making this correction and other corrections to acc.txt.
We should think about some of the details of this before jumping into code changes. A discussion of this is in #130.
Once the details of headword coding have been decided in #130 and implemented in acc.txt, we can return and make this cintAmaRi correction.