DCLP / dclpxsltbox

Sandbox for development, testing, and review of XSLT for DCLP
http://dclp.github.io/dclpxsltbox/

Malformed dclp-hybrid values: more #324

Closed jcowey closed 6 years ago

jcowey commented 6 years ago

This is a new ticket: a number of DCLP records have other problems in the dclp-hybrid idno, namely characters that cause problems in processing. See the list below:

https://github.com/DCLP/idp.data/tree/master/DCLP/220/219977.xml: o.frangé;;438
https://github.com/DCLP/idp.data/tree/master/DCLP/220/219978.xml: o.frangé;;439
https://github.com/DCLP/idp.data/tree/master/DCLP/221/220283.xml: o.frangé;;745
https://github.com/DCLP/idp.data/tree/master/DCLP/51/50747.xml: o.wångstedt;;80
https://github.com/DCLP/idp.data/tree/master/DCLP/59/58962.xml: p.genève[horssérie];;1
https://github.com/DCLP/idp.data/tree/master/DCLP/60/59648.xml: p.genève[horssérie];;3
https://github.com/DCLP/idp.data/tree/master/DCLP/63/62158.xml: p.genève[horssérie];;6
https://github.com/DCLP/idp.data/tree/master/DCLP/63/62913.xml: p.genève[horssérie];;2
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63053.xml: p.murabba'ât;2;108
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63210.xml: p.murabba'ât;2;109
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63211.xml: p.murabba'ât;2;110
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63212.xml: p.murabba'ât;2;111
https://github.com/DCLP/idp.data/tree/master/DCLP/64/63324.xml: p.murabba'ât;2;112
https://github.com/DCLP/idp.data/tree/master/DCLP/66/65799.xml: p.murabba'ât;2;122
https://github.com/DCLP/idp.data/tree/master/DCLP/70/69159.xml: p.demarée;;5

jcowey commented 6 years ago

I'm pretty sure I know how to fix this and will do so.

o.frangé;;438 => o.frange;;438; as in ddbdp
o.wångstedt;;80 => o.wangstedt;;80; will have to be added to collection.rdf
p.genève[horssérie];;1 => p.geneve[horsserie];;1; will have to be added to collection.rdf
p.murabba'ât;2;108 => p.mur;2;108; as in ddbdp
p.demarée;;5 => p.demaree;;5; will have to be added to collection.rdf

Is that analysis correct, @hcayless?

Hugh wrote: "I'm not sure how the square brackets will play. We'll have to see."
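For anyone wanting to apply the mappings proposed above in bulk, here is a minimal Python sketch. It is not the procedure actually used for the commits: the function name `normalize_hybrid` and the `COLLECTION_OVERRIDES` table are hypothetical, and the approach (fold diacritics in the collection part, with an explicit override where the ddbdp abbreviation differs) is only an assumption drawn from the five examples in this comment. Square brackets are left untouched, since it is still open how they will behave.

```python
import unicodedata

# Hypothetical override table: collections whose ASCII form is an
# established abbreviation rather than a letter-by-letter folding.
# The single entry is taken from the mapping proposed in this thread.
COLLECTION_OVERRIDES = {"p.murabba'ât": "p.mur"}


def normalize_hybrid(idno: str) -> str:
    """Return an ASCII-folded form of a 'collection;volume;document' idno."""
    # Only the collection part (before the first ';') is rewritten.
    collection, sep, rest = idno.partition(";")
    # Compose first so the override lookup works for decomposed input too.
    collection = unicodedata.normalize("NFC", collection)
    if collection in COLLECTION_OVERRIDES:
        collection = COLLECTION_OVERRIDES[collection]
    else:
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", collection)
        collection = "".join(
            ch for ch in decomposed if not unicodedata.combining(ch)
        )
    return collection + sep + rest


if __name__ == "__main__":
    for sample in ("o.frangé;;438", "p.murabba'ât;2;108"):
        print(sample, "=>", normalize_hybrid(sample))
```

Any new collection names produced this way would still have to be added to collection.rdf by hand, as noted above.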

jcowey commented 6 years ago

@hcayless would you please check that the above-mentioned idnos are no longer troublesome? None of them should now be left in https://github.com/DCLP/idp.data/tree/master/DCLP

I have made a number of commits to make the required corrections.

Edelweiss commented 6 years ago

The following DCLP idnos also contain special characters (commas, an underscore, plus signs):

p.herc;;19,698
p.herc;;220,1078,1080,1669,1693
p.herc;;221,245,463,1423
p.herc;;222,223,1082,1089,1643,1675
p.herc;;224,1007,1114
p.herc;;225,411,1094,1497,1572,1575,1578
p.herc;;228,403,407,1425,1581
p.herc;;229,242,243,247,248,433,437,1077,1088,1098,1428,1609,1610,1648,1788
p.herc;;233,860
p.herc;;239,310,1787
p.herc;;240,435,455,467,468,1095,1096,1099,1101,1426,1633,1646
p.herc;;250,398,426,1427,1601,1619
p.herc;;253,465,1090,1613
p.herc;;253,1025
p.herc;;255,418,1091,1112
p.herc;;336,1150
p.herc;;397,399,817
p.herc;;408,409,1117,1573,1672
p.herc;;419,697,1634
p.herc;;425,1079,1086,1580,1674
p.herc;;444,460,466,1073,1074,1081
p.herc;;832,1015
p.herc;;0908,1390
p.herc;;993,1149
p.herc;;994,1419,1676,1677
p.herc;;1056,1420
p.herc;;1479,1417
p.herc;;1577,1579

p.herc;;21_7

p.herc;;1043+1045
p.herc;;1258+1822
p.herc;;1413+1416
p.herc;;1605+1606
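A check along the following lines could be used to find idnos like these across the data. This is only a sketch: the thread never states which characters are safe, so the allowed set here (lowercase ASCII letters, digits, `.` and the `;` separator) is an assumption, and the function name `suspect_characters` is made up for illustration.

```python
import re

# Assumption (not confirmed in this thread): only lowercase ASCII letters,
# digits, '.' and the ';' separator are unproblematic in a dclp-hybrid idno.
SAFE_CHAR = re.compile(r"[a-z0-9.;]")


def suspect_characters(idno: str) -> list[str]:
    """Return the sorted, de-duplicated characters outside the assumed safe set."""
    return sorted({ch for ch in idno if not SAFE_CHAR.fullmatch(ch)})


if __name__ == "__main__":
    for sample in ("p.herc;;19,698", "p.herc;;21_7", "p.herc;;1043+1045"):
        print(sample, suspect_characters(sample))
```

An empty list means the idno only uses characters from the assumed safe set; anything else is worth a manual look.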

Edelweiss commented 6 years ago

https://github.com/DCLP/idp.data/commit/5f24f325a18bb1df34a71af52e650ee013a52804
https://github.com/DCLP/idp.data/commit/9ca1a605c617000ee999683fcff5c62de3c14f70
https://github.com/DCLP/idp.data/commit/e82a65aab5531692a55e736a42d1a37c438a177e
https://github.com/DCLP/idp.data/commit/1a1a3a7b7e7661248c8b926b49068c8b3df29ce8
https://github.com/DCLP/idp.data/commit/83f213a53292e7fe42d73eb895ec95a7374ff616