Closed hcayless closed 6 years ago
@HolgerEssler this needs to be sorted as soon as we can manage. Do you want to have all P.Herc. publications collected under one standardised dclp-hybrid so that they resemble e.g. p.oxy;12;1234?
If the answer is yes then we have to change
<idno type="dclp-hybrid">P.Herc. 1120</idno>
into
<idno type="dclp-hybrid">p.herc;;1120</idno>
(that assumes no volume)
If the answer is no, then we replace these dclp-hybrid with the relevant "na;;23456" value, that is "na" (viz. no author) followed by the TM number.
Yes, please change
<idno type="dclp-hybrid">P.Herc. 1120</idno>
into
<idno type="dclp-hybrid">p.herc;;1120</idno>
.
I suppose
<idno type="dclp-hybrid">P.Herc. 1043 + 1045</idno>
should then become
<idno type="dclp-hybrid">p.herc;;1043;1045</idno>
and
<idno type="dclp-hybrid">P.Herc. 419, 697, 1634</idno>
should become
<idno type="dclp-hybrid">p.herc;;419;697;1634</idno>
.
Would that be ok?
I would recommend something like:
<idno type="dclp-hybrid">P.Herc. 1043 + 1045</idno>
-> <idno type="dclp-hybrid">p.herc;;1043+1045</idno>
and
<idno type="dclp-hybrid">P.Herc. 419, 697, 1634</idno>
-> <idno type="dclp-hybrid">p.herc;;419,697,1634</idno>
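For illustration, the recommended mapping could be sketched as below. This is only a minimal sketch, assuming every affected value starts with the literal prefix "P.Herc." and that the inventory numbers keep their "+" and "," separators with whitespace removed; the function name `pherc_to_hybrid` is hypothetical and not part of any DCLP tooling.

```python
import re


def pherc_to_hybrid(label: str) -> str:
    """Turn e.g. 'P.Herc. 1043 + 1045' into 'p.herc;;1043+1045' (no volume)."""
    match = re.fullmatch(r"P\.Herc\.\s*(.+)", label.strip())
    if match is None:
        raise ValueError(f"not a P.Herc. label: {label!r}")
    # Remove all whitespace so '1043 + 1045' becomes '1043+1045'
    # and '419, 697, 1634' becomes '419,697,1634'.
    numbers = re.sub(r"\s+", "", match.group(1))
    # The empty slot between the two semicolons is the (absent) volume.
    return f"p.herc;;{numbers}"


if __name__ == "__main__":
    for label in ("P.Herc. 1120", "P.Herc. 1043 + 1045", "P.Herc. 419, 697, 1634"):
        print(label, "->", pherc_to_hybrid(label))
```

Keeping "+" and "," inside the third field means the semicolon stays reserved as the field separator.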
@jcowey can this be done in Heidelberg?
Possibly this warrants a new ticket, but a number of DCLP records have other problems in the dclp-hybrid idno, namely characters that cause problems in processing. See the list below:
https://github.com/DCLP/idp.data/tree/master/DCLP/220/219977.xml: o.frangé;;438
https://github.com/DCLP/idp.data/tree/master/DCLP/220/219978.xml: o.frangé;;439
https://github.com/DCLP/idp.data/tree/master/DCLP/221/220283.xml: o.frangé;;745
https://github.com/DCLP/idp.data/tree/master/DCLP/51/50747.xml: o.wångstedt;;80
https://github.com/DCLP/idp.data/tree/master/DCLP/59/58962.xml: p.genève[horssérie];;1
https://github.com/DCLP/idp.data/tree/master/DCLP/60/59648.xml: p.genève[horssérie];;3
https://github.com/DCLP/idp.data/tree/master/DCLP/63/62158.xml: p.genève[horssérie];;6
https://github.com/DCLP/idp.data/tree/master/DCLP/63/62913.xml: p.genève[horssérie];;2
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63053.xml: p.murabba'ât;2;108
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63210.xml: p.murabba'ât;2;109
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63211.xml: p.murabba'ât;2;110
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63212.xml: p.murabba'ât;2;111
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63324.xml: p.murabba'ât;2;112
https://github.com/DCLP/idp.data/tree/master/DCLP/66/65799.xml: p.murabba'ât;2;122
https://github.com/DCLP/idp.data/tree/master/DCLP/70/69159.xml: p.demarée;;5
@jcowey ?
pretty sure I know how to fix this and will do so
o.frangé;;438 => o.frange;;438; as in ddbdp
o.wångstedt;;80 => o.wangstedt;;80; will have to be added to collection.rdf
p.genève[horssérie];;1 => p.geneve[horsserie];;1; will have to be added to collection.rdf
p.murabba'ât;2;108 => p.mur;2;108; as in ddbdp
p.demarée;;5 => p.demaree;;5; will have to be added to collection.rdf
is that analysis correct @hcayless ?
I'm not sure how the square brackets will play. We'll have to see.
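The proposed replacements mostly amount to dropping diacritics from the collection part of the value, with p.murabba'ât as a special case that maps to the existing ddbdp abbreviation p.mur. A minimal sketch of that idea follows; the helper names and the `OVERRIDES` table are hypothetical, and square brackets are deliberately left untouched because their handling is still an open question here.

```python
import unicodedata

# Hypothetical table for collections whose replacement is not just the
# accent-stripped form (the thread maps this one to the ddbdp name).
OVERRIDES = {"p.murabba'ât": "p.mur"}


def strip_diacritics(text: str) -> str:
    """Decompose accented letters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_hybrid(value: str) -> str:
    """Normalize only the collection part (text before the first ';')."""
    collection, sep, rest = value.partition(";")
    # Compare in NFC so precomposed and decomposed input match the table.
    collection = unicodedata.normalize("NFC", collection)
    collection = OVERRIDES.get(collection, strip_diacritics(collection))
    return collection + sep + rest


if __name__ == "__main__":
    for value in ("o.frangé;;438", "o.wångstedt;;80", "p.murabba'ât;2;108"):
        print(value, "=>", normalize_hybrid(value))
```

Any new collection names produced this way would still have to be added to collection.rdf, as noted above.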
So have now created a new issue #324, to keep these two separate.
@hcayless would you please check that
<idno type="dclp-hybrid">p.herc;;.+</idno>
is now fine.
There should now be no more
<idno type="dclp-hybrid">P.Herc.
left in
https://github.com/DCLP/idp.data/tree/master/DCLP
I have made a number of commits to make the required corrections.
files that still need repair
./63/62411.xml: P.Herc. 228, 403, 407, 1425, 1581
./63/62425.xml: P.Herc. 495
./63/62426.xml: P.Herc. 558
./63/62476.xml: P.Herc. 1471
change to…
./63/62411.xml: p.herc;;228,403,407,1425,1581
./63/62425.xml: p.herc;;495
./63/62426.xml: p.herc;;558
./63/62476.xml: p.herc;;1471
A case-sensitive search for P.Herc. in the XPath tei:idno[@type='dclp-hybrid'] did not turn up any further idnos of this kind.
https://github.com/DCLP/idp.data/tree/issue317
(in development and master)
Files can be viewed on github and will be picked up with the next sync.
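The case-sensitive check described above could be reproduced with a short script along these lines. It is a sketch only, assuming the DCLP files use the standard TEI namespace; the function name `unconverted_idnos` and the inline sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace; assumed to be the one used in the DCLP files.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def unconverted_idnos(xml_text: str) -> list[str]:
    """Return dclp-hybrid idno values that still start with 'P.Herc.'."""
    root = ET.fromstring(xml_text)
    hits = []
    for idno in root.iterfind(".//tei:idno[@type='dclp-hybrid']", TEI_NS):
        text = (idno.text or "").strip()
        # str.startswith is case-sensitive, so 'p.herc;;495' is not reported.
        if text.startswith("P.Herc."):
            hits.append(text)
    return hits


if __name__ == "__main__":
    sample = (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
        '<idno type="dclp-hybrid">P.Herc. 495</idno>'
        '<idno type="dclp-hybrid">p.herc;;558</idno>'
        "</TEI>"
    )
    print(unconverted_idnos(sample))
```

Running it over each file under DCLP/ and collecting non-empty results would list any records that still need repair.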
There are 233 DCLP documents that have broken dclp-hybrid <idno> values. A full list can be found at https://gist.github.com/hcayless/0c99cb6af2b27239f397ca854e52e677. They all seem to be P.Herc. docs. This error prevents correct indexing of the documents for search.