Open GDRom opened 2 years ago
Right now this is controlled by:
https://github.com/direct-phonology/jdsw/blob/50b6e6f50673891f5f880a18ff95f49e04dca472/bin/lib/phonology.py#L70-L71
where readings_for() just checks our SBGY file, so I could easily have it output a warning when rejecting things.
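For illustration, here is a rough sketch of what such a warning could look like. It does not reproduce the actual code in lib/phonology.py: accept_reading, the logger name, and the dict-of-sets table are all hypothetical stand-ins for whatever readings_for() consults.

```python
import logging

# Hypothetical logger name; the project may configure logging differently.
logger = logging.getLogger("phonology")


def accept_reading(char, reading, sbgy):
    """Return True if `reading` is attested for `char` in the SBGY table.

    `sbgy` is assumed to map each character to a set of known readings.
    A rejected reading is logged so that clashes can be analyzed later.
    """
    known = sbgy.get(char, set())
    if reading in known:
        return True
    logger.warning(
        "rejected reading %r for %r: not in SBGY (known: %s)",
        reading, char, sorted(known),
    )
    return False
```

Routing the rejections through logging rather than print would make it easy to send them to a file and tally where the SBGY and the JDSW disagree.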
Do you think it's appropriate in this case to just add the readings to your GDR-SBGY-full.csv? If not, it's probably not too difficult to expand the Reconstruction class in lib/phonology.py to allow augmenting the sound table with more information after it's constructed.
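A minimal sketch of the "augment after construction" idea, assuming the table boils down to a mapping from characters to sets of readings. SoundTable and augment are hypothetical names used only for illustration; the real Reconstruction class may be organized quite differently.

```python
from collections import defaultdict


class SoundTable:
    """Hypothetical stand-in for the table held by Reconstruction."""

    def __init__(self, rows):
        # rows: iterable of (char, reading) pairs, e.g. parsed from a CSV
        self._readings = defaultdict(set)
        for char, reading in rows:
            self._readings[char].add(reading)

    def readings_for(self, char):
        """Return a copy of the known readings for `char`."""
        return set(self._readings.get(char, set()))

    def augment(self, rows):
        """Add extra (char, reading) pairs; return how many were new."""
        added = 0
        for char, reading in rows:
            if reading not in self._readings[char]:
                self._readings[char].add(reading)
                added += 1
        return added
```

Returning the count of newly added pairs gives a quick sanity check that the extra data actually contributed something beyond what the base file already had.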
Thanks for pointing to the right line of code for this. Yeah, adding such a warning output would be great so I can analyze where things clash between the SBGY and the JDSW.
As for whether to append these readings to GDR-SBGY-full.csv: I think better not, as that data should be left as is. The example I provided above should not occur in the vast majority of medieval texts, nor in "regular" Han texts, but only in the domain of decidedly archaic texts (Shangshu, Maoshi, Yijing, etc.). I'd estimate that the same will be true for most, if not all, "correct" readings that are omitted in the SBGY.
I'd hence say those are clear exceptions; perhaps a separate readings_exceptions.csv or so might be a better place to store those?
Cool, readings_exceptions or readings_archaic or something makes sense to me.
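To make the separate-file idea concrete, here is a hedged sketch of loading the base SBGY data and an exceptions file side by side, leaving the base untouched. The column names char and reading are assumptions, not the actual headers of GDR-SBGY-full.csv, and load_readings and merge_tables are hypothetical helpers.

```python
import csv
import io


def load_readings(csv_file, source):
    """Read rows with assumed `char` and `reading` columns.

    Returns {char: {reading: source}} so each reading remembers
    which file it came from.
    """
    table = {}
    for row in csv.DictReader(csv_file):
        table.setdefault(row["char"], {})[row["reading"]] = source
    return table


def merge_tables(base, extra):
    """Return a new table; `base` is copied, never modified."""
    merged = {char: dict(readings) for char, readings in base.items()}
    for char, readings in extra.items():
        for reading, source in readings.items():
            # Keep the base source when both files list the same reading.
            merged.setdefault(char, {}).setdefault(reading, source)
    return merged


# Placeholder data only; real files would be opened with open(path, newline="").
base = load_readings(io.StringIO("char,reading\nX,r1\n"), "sbgy")
extra = load_readings(io.StringIO("char,reading\nX,r2\n"), "exceptions")
merged = merge_tables(base, extra)
```

Tagging each reading with its source would let downstream code tell an SBGY reading from an archaic exception, for instance to allow the exceptions only for texts like the Shangshu.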
There are instances in which LDM's JDSW correctly notes a reading not included in the SBGY. Our current approach fails to take these instances into account. These instances are rare, however, and often tied to archaic texts (like the Shangshu).
Examples thus far encountered include:
Suggested approach: