Edelweiss / hgv

Heidelberger Gesamtverzeichnis der griechischen Papyrusurkunden Ägyptens
MIT License

MP3 in DCLP #189

Open samosafuz opened 1 year ago

samosafuz commented 1 year ago

Thanks to the generous co-operation of Gabriel Nocchi Macedo in Liège, I've assembled a comprehensive spreadsheet of TM numbers and their corresponding MP3 numbers. Please script the import of the latter into DCLP: they should be entered as the value for <idno type="MP3">.

https://docs.google.com/spreadsheets/d/1nilykad8usfuYMSmz9Yk_1ESo6qDyUUNrANSv6SqSiM/edit?usp=sharing

A few notes:

  1. Please import MP3 numbers in the format 00000.000, as in the spreadsheet. In the future, we will generate hyperlinks to the MP3 catalogue from the value recorded in the XML, and those links will require the numbers in that specific format. The spreadsheet kept trying to change the number format on me, and I had to force it to treat the number as text.
  2. Some 889 DCLP files already have MP3 idnos; please avoid duplicating them.
  3. Where the TM value is NULL, skip the row (as far as I can tell, these are deprecated MP3 numbers and no one will miss them).
  4. Newer publications (i.e., TM numbers greater than 700000) probably do not yet exist in DCLP. The script should confirm that a file for a given TM number exists before proceeding.
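
The four notes above can be sketched roughly as follows. This is only an illustration of the checks, not the script that was eventually run: the names `normalize_mp3` and `plan_import` are hypothetical, the spreadsheet is assumed to have been read into `(tm, mp3)` string pairs, and `existing` is an assumed mapping from the TM number of each DCLP file to the MP3 idnos already recorded in it.

```python
import re

# The 00000.000 format requested in note 1.
MP3_FORMAT = re.compile(r"\d{5}\.\d{3}")


def normalize_mp3(raw):
    """Return an MP3 number as 00000.000, treating the value as text throughout."""
    whole, _, frac = raw.strip().partition(".")
    value = f"{whole.zfill(5)}.{frac.ljust(3, '0')}"
    if not MP3_FORMAT.fullmatch(value):
        raise ValueError(f"cannot normalize MP3 number: {raw!r}")
    return value


def plan_import(rows, existing):
    """Split spreadsheet rows into idnos to add and TM numbers without a file.

    rows:     iterable of (tm, mp3) strings (assumed spreadsheet export)
    existing: dict mapping TM number -> set of MP3 idnos already in that file
    """
    to_add, unmatched = [], []
    for tm, mp3 in rows:
        # Note 3: skip rows whose TM value is NULL (or empty).
        if tm is None or tm.strip().upper() in ("", "NULL"):
            continue
        tm = tm.strip()
        mp3 = normalize_mp3(mp3)  # note 1
        if tm not in existing:
            # Note 4: no DCLP file for this TM number yet.
            unmatched.append((tm, mp3))
        elif mp3 not in existing[tm]:
            # Note 2: only add idnos that are not already present.
            to_add.append((tm, mp3))
    return to_add, unmatched
```

Working on strings rather than floats is deliberate: it avoids the reformatting problem described in note 1, where a spreadsheet turns a value such as 01411.010 into 1411.01.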

samosafuz commented 6 months ago

Ultimately, it will be possible to generate hyperlinks from within papyri.info by concatenating the value of <idno type="MP3"/> and the web address via XSLT. The new MP3 catalogue, thankfully, constructs its URIs as follows:

http://www.cedopalmp3.uliege.be/cdp_MP3_display.aspx?numNot=01411.000
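
The concatenation itself is trivial. As a minimal sketch in Python (the production links are to be generated via XSLT, so this only illustrates the idea; the function name `mp3_url` is hypothetical):

```python
import re

# Base address taken from the example URI above.
MP3_BASE = "http://www.cedopalmp3.uliege.be/cdp_MP3_display.aspx?numNot="


def mp3_url(mp3):
    """Build a catalogue link from an idno value in the 00000.000 format."""
    if not re.fullmatch(r"\d{5}\.\d{3}", mp3):
        raise ValueError(f"not in 00000.000 format: {mp3!r}")
    return MP3_BASE + mp3
```

Rejecting values that are not in the 00000.000 format makes the dependency on that format explicit.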

jcowey commented 6 months ago

https://github.com/papyri/idp.data/blob/master/DCLP/60/59087.xml#L16

is an example of where the information is already stored.

samosafuz commented 5 months ago

XSL also updated and now awaiting review by Hugh: https://github.com/papyri/navigator/pull/161

jcowey commented 3 months ago

https://github.com/papyri/idp.data/pull/436

merged with: https://github.com/papyri/idp.data/pull/436/commits/9272560b4bbfdabe6b8338524bf426e786441edb

Edelweiss commented 3 months ago

TM numbers that couldn’t be matched to existing DCLP files (tm,mp3) UPDATED June 2nd

1282,00035.000 3747,01355.210 3969,02001.010 11849,02700.280 13359,02749.020 18713,00399.000 19667,02771.100 21292,02407.000 21481,02767.000 21573,01880.000 25024,02559.200 26556,02445.200 28337,02491.000 36082,02765.000 36085,02766.000 36130,02665.300 36260,00593.200|01735.200 36765,02406.010 38445,02000.010 50187,02001.110 60158,00488.000 60392,01199.000 60393,01196.000 60394,01191.000 60483,01165.200 61568,01322.010 63249,02810.100 63564,02485.000 63616,01984.100 63666,02362.510 63872,02691.330 63891,00252.100 64036,02551.000 64354,02764.500 64552,02979.310 64644,02770.000 65497,02309.300 65654,01984.000 66050 + 66051,02023.230 69015 + 69509,01258.050 70002,03025.100 73986,02043.180 78959,02667.000 79348,02423.501 88810,02850.010 97967,00560.104 112373,01211.001 113384,00579.002 113393,00579.003 128622,02001.120 131381,02700.230 131382,02700.240 131385,02700.270 131397,02916.930 131399,02001.130 131400,02916.950 131401,02916.960 131402,02916.970 131567,02667.450 140502,01461.010 144551,02862.001 144647,02431.010 154372,02043.190 220285,02684.004 321192,02336.210 372036,01426.001 372050,00432.001 372051,01724.120 372052,01632.010 372053,01743.010 372054,01743.020 372055,02257.020 372056,02436.030 372057,01984.201 372058,01972.920 372059,02188.040 372060,02188.050 372061,01465.110 372063,01303.701 372064,01303.910 372065,01305.201 372066,01351.300 372069,02276.001 372154,02433.310 372366,02704.054 372367,02704.055 372368,02916.980 372369,02916.990 372444,02433.320 372458,02704.057 372462,02704.058 381887,00456.020 382577,02110.020 388527,00543.410 388528,00539.220 388529,00537.010 388530,00542.020 388531,00345.110 388532,00346.010 388533,00346.020 388534,00456.001 388535,00456.002 388536,00456.230 388537,00461.010 388538,00537.020 388539,02362.530 388540,02360.210 388541,02360.220 388542,02360.230 388543,02360.240 388544,02340.030 388545,02340.040 388546,02377.010 388547,02342.010 388548,02410.115 388549,02409.010 388551,02410.010 388552,02410.117 388553,02410.118 
388554,02410.119 388555,02410.101 388557,02410.103 397837,02357.160 412056,00631.002 412057,00813.020 412058,00848.001 412059,00987.103 489700,02902.001 642004,00912.010 642455,02391.410 642647,02921.001 692573,00892.006 697524,00970.110 697525,01075.010 697527,00411.001 697529,00385.020 697530,00136.430 697531,02463.530 700588,02357.161 702326,02751.120 702420,02640.320 702421,01478.010 702422,01495.111 702423,01495.112 702424,01496.010 702425,00366.001 702426,01432.030 702427,01429.010|02859.030 702440,02410.104 702448,02916.510 702583,02400.011 702596,02528.300 702958,01432.040 702960,00255.005 703232,02750.680 703252,02398.101 703253,02398.102 704364,02951.002 704627,01459.120 704628,02619.120 704629,02257.030 704630,01498.020 704631,01470.010 704632,01433.010 704635,01431.001 704636,01431.140 704637,00364.102 704638,00364.103 704639,00364.104 704640,01284.330 704641,01329.020 704642,01330.010|01330.020 704643,01331.010 704644,00461.120 704645,01486.510 704646,01222.010 704647,00454.101 704651,00454.102 754956,01110.010 827752,02118.001 827766,00965.001 827767,00855.210 827768,02797.701 832132,02436.040 832138,02405.010 832177,02704.001 832190,01485.210 832191,01687.060|02408.110 832192,01823.010 832193,01823.020 832194,01225.120 832195,00100.110 832196,00104.001 832197,00104.110 832198,00104.120 832199,00106.010 832200,00111.002 832201,00112.010 832202,00112.020 832203,00112.110 832204,00112.120 832205,00112.130 832206,02916.540 832312,01180.101|01200.010 851595,02336.200 873045,02423.640 874402,02043.210 901286,00148.020 901289,02845.474 901291,01988.010 901292,01190.210 901293,01226.010 901294,02753.120 901298,02916.801 901299,02916.802 901300,02916.803 942990,00331.010 942991,01570.010 942992,01556.220 942993,02373.020 957489,02641.300 971723,02876.100
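
For anyone who wants to process the list above, here is a small parsing sketch. It assumes, from the shape of the entries only, that `|` separates several MP3 numbers attached to one TM number and that ` + ` joins several TM numbers sharing one MP3 number; the function name `parse_unmatched` is hypothetical.

```python
import re

# One entry: TM number(s) joined by " + ", a comma, MP3 number(s) joined by "|".
ENTRY = re.compile(r"(\d+(?: \+ \d+)*),(\d{5}\.\d{3}(?:\|\d{5}\.\d{3})*)")


def parse_unmatched(text):
    """Return a list of (tm_numbers, mp3_numbers) pairs, each a list of strings."""
    return [
        (tms.split(" + "), mp3s.split("|"))
        for tms, mp3s in ENTRY.findall(text)
    ]


sample = "1282,00035.000 36260,00593.200|01735.200 66050 + 66051,02023.230"
for tms, mp3s in parse_unmatched(sample):
    print(tms, mp3s)
```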

jcowey commented 3 months ago

Super. Thanks so much for the list of unmatched ones.

samosafuz commented 3 months ago

I'm not sure this issue can be closed just yet. Following the merge of https://github.com/papyri/idp.data/pull/436, I now count 1505 occurrences of <idno type="MP3"> across 1472 files in DCLP, but the spreadsheet provided by CEDOPAL lists 7882 unique MP3 numbers. Even though 182 of those are unmatched, 283 are NULL, and 443 have no associated TM number, that still leaves a great many unaccounted for.
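
As a rough check of the size of the gap, the figures above can be combined as follows. This assumes the three excluded groups (unmatched, NULL, no associated TM number) do not overlap, and it compares unique MP3 numbers with occurrences of the element, so the result is an estimate only; the function name `unaccounted` is hypothetical.

```python
def unaccounted(total, unmatched, null_tm, no_tm, present):
    """Return (expected idnos, shortfall) under the no-overlap assumption."""
    expected = total - unmatched - null_tm - no_tm
    return expected, expected - present


expected, missing = unaccounted(
    total=7882, unmatched=182, null_tm=283, no_tm=443, present=1505
)
print(expected, missing)  # 6974 5469
```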