Closed michaelnmmeyer closed 6 months ago
Thanks, I have not heard this before. I'm very happy about lifting the part about using n to identify the repository. As for the filenames, I think it would be better to keep them systematic and use the full prefix all the time, perhaps also the .xml extension. That way there is, I think, a better chance of interpreting the files if someone else comes across them years from now and does not necessarily have access to your processing. If you disagree with that and say that it is technically not a problem, then I accept that, but even then, I would prefer to have just a single "legitimate" way to encode such references. I don't think it's good to have a guide that says "you can do A, B, C or D and it doesn't matter which, so just pick whichever you like at the moment".
So if you are sure that using the full filename (DHARMA_INSPallava00001.xml) is not better practice or more future-proof in any way than the others, then let's pick the simplest form (INSPallava00001) as the only approved solution; otherwise, let's use the full filename and make the schema flag anything else as an error instead of processing it to make it work.
Whichever is chosen, the schema may need some alterations. At the moment, @n still comes up in Oxygen as a permitted attribute for <ref>, and I think there are no other circumstances in which one would want an n on a ref. So that can be discarded, and the existing instances of n on ref can be deleted from the files. For ref, I also get a suggestion list consisting of the filenames in the same folder as the present file, which I like, since most of my crossreferences are to my own subcorpus. But if we enforce a short reference instead of the full filename, then this needs to be changed.
Using the full file name is indeed probably the safest bet, if only for the autocomplete feature you pointed out. This might also discourage people from inventing file names; I am seeing a lot of names like AirAsih.xml, WanuaTengahIII, etc., that cannot be resolved to a real file.
Right. Shall we then agree that <ref target="DHARMA_INSPallava00001.xml">Pallava 1</ref> will be the only acceptable form of encoding a reference to a DHARMA inscription edition?
OK, perfect.
Curiously, I had never been aware of the theoretical obligation to use @n on <ref> and, in blissful unawareness, have been encoding as per the result of your discussion. I have no objections.
@michaelnmmeyer : can you extract a list of non-compliant names like AirAsih.xml and WanuaTengahIII.xml, so I can fix them? I hope you are aware that in general all file names bearing the string IDENK are not yet FNC-compliant, in that they use inscription names rather than numbers, as a temporary solution while we are waiting for the IDENK database (idenk.net) to deliver inscription numbers for these items. This will start to happen within the next half year, I hope.
@arlogriffiths
Here is the list of references. It is probably too long to be useful, though.
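A list of this kind could be produced along the following lines. This is only a minimal sketch, not the script actually used for the list above: the function name unresolved_targets and the idea of scanning a single checkout directory are assumptions made for illustration. It collects every target on a <ref> element and reports those that do not match the name of any XML file found under the same root, so references to files kept in another repository would be reported as well unless all repositories are checked out under that root.

```python
import re
from pathlib import Path

# Matches the target attribute of a <ref> element (and not, e.g., <reference>).
REF_TARGET = re.compile(r'<ref\b[^>]*\btarget="([^"]+)"')


def unresolved_targets(repo_root):
    """Map each <ref> target that matches no XML file name under repo_root
    to the list of files in which it occurs.

    Hypothetical helper: assumes every file that can legitimately be
    referenced is present somewhere under repo_root.
    """
    root = Path(repo_root)
    xml_files = sorted(root.rglob("*.xml"))
    known = {path.name for path in xml_files}
    problems = {}
    for path in xml_files:
        text = path.read_text(encoding="utf-8")
        for target in REF_TARGET.findall(text):
            # Skip internal anchors and web links; only file names are checked.
            if target.startswith("#") or "://" in target:
                continue
            if target not in known:
                problems.setdefault(target, []).append(path.name)
    return problems
```

Each key of the returned dictionary is then a string to search for with "Find/Replace in Files", and the value says in which files it occurs.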
BadamiCalukya00004-Kopparam-Pulakesin.xml was in my corpus, now corrected.
Thanks. I have converted the above list to a task list with check boxes and started weeding out offending cases listed above.
@ryosukefurui @ekobastiawan @tyassanti @chhomkunthea @salomepichon @chloechollet @wayanjarrah : Please read the above discussion. Then please help make our files compliant with the precise rules for use of <ref>. Search for cases of offending strings using the "Find/Replace in Files" function, choosing the appropriate repository where the case is suspected to occur. Correct the relevant file and check the item in the list above.
Examples of correct references for the tfc repositories:
<ref target="DHARMA_INSCIK00011.xml">K. 11</ref>
<ref target="DHARMA_INSCIC00017.xml">C. 17</ref>
<ref target="DHARMA_INSIDENKWintangMasB.xml">Wintang Mas B</ref>
Don't hesitate to ask if anything more needs to be explained.
@michaelnmmeyer : Do you know why I can't tick the boxes above?
@ekobastiawan I have no idea. This might require an administrator account.
I have added @ekobastiawan among assignees. Can you try again now, Eko?
If that too fails, we will need to split up the above list and create a separate list per repo. But even on tfd-nusantara-epigraphy, does Eko have administrator rights?
I am able to tick and untick. I'm assigned, but not an admin as far as I know. So I guess Eko should be fine now that he is assigned.
@arlogriffiths : I still can't tick the boxes
I suspect the problem has to do with other-than-admin-level access to the repo, which Dan has but Eko doesn't.
Can you look into this, @michaelnmmeyer? Can we do something about it?
Sounds logical. I'm afraid I have no idea how to check my level of access.
@ekobastiawan Make sure you have signed in to your GitHub account.
@ekobastiawan I gave you write access to the repo.
@michaelnmmeyer Thanks a lot, I am now able to tick the boxes
Dear all,
I cannot check the boxes in the list above. I think that I am already signed in to GitHub. Maybe I have not been given access or was in the wrong place. Can you please help?
Best, Kunthea
Dear all,
I've for now made the modifications for the cam corpus. I haven't been able to locate the cases of C0087.xml and C0096.xml, however.
@michaelnmmeyer :
Dear all,
There is also a problem in my files (K. 11, K. 56, K. 77, K. 417 and K. 582). Among them, only K. 56 has a <ref target="DHARMA ...> markup. And I don't see K. 136.xml in the folder "xml-provisional".
Actually, there are files, especially the hospital inscriptions of Jayavarman VII (K. 12, K. 368, K. 375 ...) which contain many markups. They conform to the norm, i.e. without the @n.
Best, Kunthea
I think you may have misunderstood the nature of the list above. It is not a list of files to be opened and checked, but a list of strings to be searched (in your case in tfc-khmer-epigraphy) and to be replaced by the correct string. For example, if you use "Search/Replace in files" to search the string K0379.xml, you will find one occurrence, namely in the file DHARMA_INSCIK00216-S.xml. In that file, you need to replace <ref target="K0379.xml">K. 379</ref> by <ref target="DHARMA_INSCIK0379.xml">K. 379</ref> and then tick K0379.xml in the list above. Is it clear now?
Yes, it is. Thank you!
It seems that there is one zero missing in the file name. Should it be "DHARMA_INSCIK00379" instead of "DHARMA_INSCIK0379" ?
Indeed, small typo from my side. Sorry. Please do add that zero.
Well noted with thanks.
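Since the number in the full file name is padded to five digits, as in DHARMA_INSCIK00011.xml and DHARMA_INSCIC00017.xml above, a missing zero is an easy slip when replacing by hand. The conversion can be sketched as follows. This is only an illustration under stated assumptions: the function full_filename and the PREFIXES table are hypothetical, the two prefixes are inferred from the K. and C. examples in this thread, and names carrying a suffix (such as DHARMA_INSCIK00216-S.xml) are not handled.

```python
import re

# Hypothetical mapping from legacy short prefixes to corpus prefixes,
# inferred from the K. and C. examples given in this thread.
PREFIXES = {"K": "INSCIK", "C": "INSCIC"}

# A legacy name is a letter prefix, a number, and the .xml extension.
LEGACY_NAME = re.compile(r"^([A-Za-z]+)(\d+)\.xml$")


def full_filename(legacy_name):
    """Turn a legacy target such as K0379.xml into DHARMA_INSCIK00379.xml."""
    match = LEGACY_NAME.match(legacy_name)
    if match is None or match.group(1) not in PREFIXES:
        raise ValueError(f"cannot convert {legacy_name!r}")
    prefix = PREFIXES[match.group(1)]
    number = int(match.group(2))
    # The inscription number is zero-padded to five digits.
    return f"DHARMA_{prefix}{number:05d}.xml"


print(full_filename("K0379.xml"))
```

Names that do not fit the pattern raise an error instead of being guessed at, so they can be looked up and corrected by hand.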
Dear all,
FYI, I have corrected the references related to K0011 through K1284 in the list above. I hope that they are all fine now.
Best, Kunthea
@chhomkunthea You should now be able to tick boxes.
Thank you very much! Yes, it's done now.
@michaelnmmeyer : could you help us track down C0087.xml and C0096.xml?
https://github.com/erc-dharma/project-documentation/issues/298#issuecomment-2101934409
@arlogriffiths They have been corrected in the meantime.
Thanks.
@michaelnmmeyer : can you tell me where to look for Dk0019.xml and Dk0020.xml?
@ryosukefurui : all remaining items concern tfc-bengalcharters-epigraphy: can you take care of them?
I have just corrected the relevant ref in DHARMA_INSBengalCharters00065.xml, and ticked the list. Excuse me for the delayed response.
@arlogriffiths
Dk0019.xml and Dk0020.xml are both in tfb-daksinakosala-epigraphy/texts/DHARMA_INSDaksinaKosala00021.xml
Thanks. So the remaining work is for @NatasjaSB. I don't know if she is still following GitHub, and anyhow I assume @danbalogh can easily make the small modifications in her xml files on her behalf.
So @danbalogh, could you take care of this and then close this issue?
I've made the correction in the DaksinaKosala file. Natasja has recently renamed her files at our request, to follow the pattern used in other collections (INSDaksinaKosala instead of INSDk), and I assume that she did not think to check for and update existing references to files when she did that rename. Her repository seems to contain no other obsolete references.
A minor remark I am not sure I made: when referring to inscriptions with <ref> (EGD §10.4.6. Referring to inscriptions in the DHARMABase), the use of @n for indicating a repository is not needed (because all texts share a single namespace), and adding the .xml extension is also unnecessary. Thus
<ref n="tfa-pallava-epigraphy" target="INSPallava00001.xml">Pallava 1</ref>
can be written
<ref target="INSPallava00001">Pallava 1</ref>
Internally, all variations DHARMA_INSPallava00001, DHARMA_INSPallava00001.xml, INSPallava00001, and INSPallava00001.xml are made to point to https://dharmalekha.info/texts/INSPallava00001.
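The normalization described here can be pictured roughly as follows. This is a sketch of the idea only, not the actual DHARMA processing code: the function names normalize_target and target_url are made up for illustration, and the only facts taken from this thread are the four accepted variants and the resulting URL.

```python
# Base of the public text URLs, as quoted in the comment above.
BASE_URL = "https://dharmalekha.info/texts/"


def normalize_target(target):
    """Reduce any accepted variant to the bare identifier (hypothetical helper)."""
    name = target.strip()
    # Drop the optional file extension.
    if name.endswith(".xml"):
        name = name[: -len(".xml")]
    # Drop the optional project prefix.
    if name.startswith("DHARMA_"):
        name = name[len("DHARMA_"):]
    return name


def target_url(target):
    """Build the URL that a reference target would resolve to."""
    return BASE_URL + normalize_target(target)


variants = [
    "DHARMA_INSPallava00001",
    "DHARMA_INSPallava00001.xml",
    "INSPallava00001",
    "INSPallava00001.xml",
]
# All four variants collapse to a single URL.
urls = {target_url(variant) for variant in variants}
print(urls)
```

Because every variant collapses to the same identifier, the choice between them is a matter of encoding policy rather than of what the processing can resolve.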