Closed: tillgrallert closed this issue 3 years ago
IMPORTANT NOTE: when cleaning the authority file and resolving duplicates, do not delete any IDs. Instead, unified entities should gather all IDs of the former duplicates. This is necessary because the master file links to these IDs and the links will be broken when IDs are deleted.
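A minimal sketch of what "gathering all IDs" could look like when two duplicate entries are unified. It assumes that the IDs in question are the <idno> children of each <person> and that the file can be handled with Python's standard xml.etree.ElementTree; the function name merge_person and the second idno value (311) are made up for illustration, and the handling of @xml:id values is left out because it is not specified here.

```python
import xml.etree.ElementTree as ET


def merge_person(keep, duplicate):
    """Copy every child of `duplicate` into `keep` unless an identical one exists.

    Nothing is deleted from `keep`, so all <idno> values of both entries survive.
    """
    seen = {(c.tag, c.get("type"), (c.text or "").strip()) for c in keep}
    for child in list(duplicate):
        key = (child.tag, child.get("type"), (child.text or "").strip())
        if key not in seen:
            keep.append(child)
            seen.add(key)
    return keep


# Hypothetical duplicates: the second entry and its idno "311" are invented.
a = ET.fromstring(
    '<person><persName>Victor Barruland</persName>'
    '<idno type="jaraid">26</idno></person>'
)
b = ET.fromstring(
    '<person><persName>V. Barruland</persName>'
    '<idno type="jaraid">311</idno></person>'
)

merged = merge_person(a, b)
ids = [i.text for i in merged.findall("idno")]
print(ids)  # ['26', '311']
```

After the merge, links in the master file that point to either 26 or 311 can still be resolved against the unified entry.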
Fantastic, @tillgrallert. I hope that you and your family are safe in Beirut. I will turn to this tonight (EST). Question: if I clean the authority file, is the master file automatically repopulated?
Dear @Mestyan, yes, we are safe. Mainly because I travelled to Beirut alone ;-) Concerning your actual question, the master file has to be actively updated from the authority file.
Great! So I guess the best approach is to clean the authority file first?
The good thing is one can rerun the scripts at any one point. Therefore, there is no need to clean the authority file first. We can work incrementally instead.
I fixed point 3 in the above issue description.
@tillgrallert I merged to master and I also merged my little additions of holdings from the last two days in gh-pages. There was a conflict but I checked and edited. Please fetch - just to be up-to-date! So I work solely on corrections in the master file (on gh-pages). I will also test the CETEIcean transformation and the JS needed to represent the new Arabic data. Very exciting!
Couple of questions:
Dear Adam,
@xml:id can be manually changed and shouldn't break anything. They are automatically generated and apparently the code to do so isn't foolproof.
[...] <idno type="jaraid"> children).
<orgName>s have already been Arabized. What still needs to be done is linking the master file to the authority file. So you can edit the <orgList> in the authority file like everything else. Please take note that quite a few <person> elements are in fact organisations and I have marked some of them with comments (<!-- ... -->). I will try to add the code to point from <orgName>s in the master file to the authority file over the coming days.
With the persName I follow our principle for the titles: if the original is English, French, Italian, etc. in Latin characters, then I correct all of its versions into Latin characters and correct the xml:lang attributes as well, OK? @tillgrallert
However, I did one and it disappeared from master. Possibly I should not correct that xml:lang? @tillgrallert
I will stop working and wait for your answer so as not to cause a huge problem again.
Dear Adam,
Don't worry. Everything works as intended. If there is no <persName xml:lang="ar"> for a <person> in the authority file, the XSLT cannot add it to the master file. Let's look at the example you provided:
<person xml:id="person_26.d5e406">
<persName xml:id="persName_45.d5e408" xml:lang="ar-Latn-x-ijmes">Victor Barruland</persName>
<persName change="#d2e154" xml:id="persName_6112.d5e30980" xml:lang="ar">v برلند</persName>
<tei:persName change="#d5e194" corresp="#persName_6112.d5e30980" type="flattened" xml:lang="ar">vبرلند</tei:persName>
<persName change="#d5e126" corresp="#persName_45.d5e408" type="flattened" xml:id="persName_1.d6097e1" xml:lang="en">VictorBarruland</persName>
<idno type="jaraid">26</idno>
</person>
[...] <persName type="flattened"> nodes. These are computationally generated and their only purpose is to aid the computational look-up of names.
[...] @xml:lang from "ar-Latn-x-ijmes" to "fr" on the first <persName> is absolutely correct.
[...] @xml:lang, such as "fr-Arab-AR" (meaning a common rendering of French into Arabic).
The result would then look like this (example):
<person xml:id="person_26.d5e406">
<persName xml:id="persName_45.d5e408" xml:lang="ar-Latn-x-ijmes">Victor Barruland</persName>
<persName change="#d2e154" xml:id="persName_6112.d5e30980" xml:lang="fr-Arab-AR">فكتور بريلان</persName>
<!-- ... -->
<idno type="jaraid">26</idno>
</person>
In order to display something for non-Arabic names in the Arabic columns of the table, we could decide to use the original Latin-script name as a fallback option.
Hm, this is an interesting problem. In the case of titles in column 10 we decided to keep mixing languages because the idea is that the original product, i.e. the journal itself, had various scripts in its header and we wanted to reproduce it. In this case, we do not really know whether the French/Italian/British/Syrian editors actually had their names in Latin and Arabic (and Hebrew) scripts, and if so, how they actually transcribed their names from one language to the other. We could only find out by re-checking each and every case, and what we would find, of course, is that they did not conform to the IJMES transliteration into Latin script or to the proper transcription into Arabic script. In these cases, we will have unconventional names. I would suggest that we use the original-script name in Latin. What do you think?
I am fine with this fallback to the original Latin script name.
I looked into the XSLT that generates a new master from the authority file in order to add the fallback option, but it is beyond my knowledge. Can you do it, please? So that column 11 would use the "en", "fr", "it" or whatever is in column 6 if there is no "ar" there? @tillgrallert
This has been done
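For readers who want to see the fallback rule spelled out, here is a minimal sketch in Python. It is not the project's XSLT; the function name display_name is made up, and treating any @xml:lang that carries the script subtag "Arab" (such as "fr-Arab-AR") as an Arabic-script name is an assumption on top of the plain "ar" check discussed above.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def display_name(person):
    """Return an Arabic-script name if one exists, else the first other name."""
    # Skip the computationally generated look-up forms.
    names = [n for n in person.findall("persName") if n.get("type") != "flattened"]
    for name in names:
        lang = name.get(XML_LANG, "")
        # Assumption: "ar" or any tag with the script subtag "Arab" counts as Arabic script.
        if lang == "ar" or "Arab" in lang.split("-"):
            return name.text
    # Fallback: the original Latin-script name (first remaining <persName>).
    return names[0].text if names else None


with_arabic = ET.fromstring(
    '<person>'
    '<persName xml:lang="fr">Victor Barruland</persName>'
    '<persName xml:lang="fr-Arab-AR">فكتور بريلان</persName>'
    '</person>'
)
without_arabic = ET.fromstring(
    '<person><persName xml:lang="fr">Victor Barruland</persName></person>'
)

print(display_name(with_arabic))     # فكتور بريلان
print(display_name(without_arabic))  # Victor Barruland
```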
There are multiple issues with names in the authority file that will require fixing.
Some <persName> nodes contain multiple names separated by /. These need to be split into individual <persName> children of the containing <person> element.
DONE Quite a few <persName>s do not carry an @xml:lang attribute and have, therefore, not been translated into Arabic script. @xml:lang needs to be added first and then we re-run the translation script.
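The two clean-up steps above can be sketched as follows. This is an illustration with Python's standard xml.etree.ElementTree, not the scripts used in the repository; the function names split_names and missing_lang and the placeholder names in the sample are made up. The sketch assumes that / never occurs inside a single legitimate name and that a new <persName> should inherit every attribute except @xml:id, which has to stay unique.

```python
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def split_names(person, sep="/"):
    """Turn each <persName> holding several names into one element per name."""
    for name in list(person.findall("persName")):
        parts = [p.strip() for p in (name.text or "").split(sep) if p.strip()]
        if len(parts) < 2:
            continue
        position = list(person).index(name)
        name.text = parts[0]
        for offset, part in enumerate(parts[1:], start=1):
            # Copy attributes but drop xml:id so that IDs stay unique.
            attrs = {k: v for k, v in name.attrib.items() if k != XML_NS + "id"}
            extra = ET.Element("persName", attrs)
            extra.text = part
            extra.tail = name.tail
            person.insert(position + offset, extra)


def missing_lang(person):
    """List the <persName> children that carry no @xml:lang."""
    return [n for n in person.findall("persName") if not n.get(XML_NS + "lang")]


# Placeholder data, not taken from the authority file.
person = ET.fromstring(
    '<person>'
    '<persName xml:id="n1" xml:lang="ar-Latn-x-ijmes">Name One / Name Two</persName>'
    '<persName>Name Three</persName>'
    '<idno type="jaraid">0</idno>'
    '</person>'
)

split_names(person)
names = [n.text for n in person.findall("persName")]
untagged = [n.text for n in missing_lang(person)]
print(names)     # ['Name One', 'Name Two', 'Name Three']
print(untagged)  # ['Name Three']
```

Once the untagged names have received an @xml:lang by hand, the translation script can be re-run as described.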