Open tuurma opened 4 years ago
Ordering the abbreviations by number of references there are:
I'd suggest concentrating on the most common abbreviations to figure out what the predominant patterns are.
Initial results for IGLS show that the majority of entries match the , (\d)+$ pattern (ending with , number): a bit below 3k cases out of ~9k total IGLS references could be converted automatically.
ending with , number: ~4k total, single comma ~3k
ending with , number: ~6.3k total, single comma ~2.9k
ending with , number: ~2k total, with comma only about 300 but much more variation; may require some manual checks first
ending with , number: ~2k total, not many with comma; check the dot in entries like CIIP I (2) 842.15 Αβιδελλα
ChLA: very few with commas
IG: the majority of entries with commas are simple to convert (650 match the single-comma pattern); some have dots, some have no.
Meimaris (all): the majority have no., e.g. Meimaris, +Chronological +Systems p. 189 no. 103 Ιδδος
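To make the "single comma" pattern concrete, here is a minimal sketch in Python (not the XQuery used in the project). The function names `split_line` and `tally` are hypothetical, and treating the first whitespace-separated token as the abbreviation is an assumption made only for this illustration; the real data stores abbreviations separately.

```python
import re
from collections import Counter

# A reference qualifies only if it contains exactly one comma and ends in ", number".
SINGLE_COMMA = re.compile(r"^(?P<ref>[^,]+), (?P<line>\d+)$")


def split_line(reference):
    """Return (ref, line) for a single-comma match, else None."""
    m = SINGLE_COMMA.match(reference.strip())
    return (m.group("ref"), m.group("line")) if m else None


def tally(references):
    """Count convertible vs. manual references per abbreviation.

    Assumption: the abbreviation is the first whitespace-separated token.
    """
    counts = Counter()
    for reference in references:
        abbreviation = reference.split()[0]
        status = "convertible" if split_line(reference) else "manual"
        counts[(abbreviation, status)] += 1
    return counts


# Sample references taken from this thread.
samples = [
    "IGLS vi 2751, 3",       # single comma, final number
    "IG XI (4) 772, 3, 15",  # two commas: left for a later pass
    "CIIP I (2) 842.15",     # dot instead of comma: needs checking
]
```

Only the first sample would be converted automatically; the other two stay on the list for manual checks or a later pattern.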
As a preparatory step I extended our XML template to store the line number explicitly:
(: add an empty line-number note after the ref of every non-volume bibl that does not have one yet :)
declare namespace tei="http://www.tei-c.org/ns/1.0";
for $bibl in collection('/db/apps/lgpn-data/data/persons')//tei:bibl[not(@type='volume')][not(tei:note[@type='line'])]
let $add := <note xmlns="http://www.tei-c.org/ns/1.0" type="line"/>
return
    update insert $add following $bibl/tei:ref
I also adjusted the input form accordingly; please note that the Linking field has been moved up and is now placed in the same row as Line.
Many thanks, Magdalena, the three lists look ok to me. Should I be able to see anything by clicking on the links at right? Right now I see only this error:
[image: screenshot of the error]
On Apr 22, 2020, at 12:13 PM, Magdalena Turska wrote:
@michaelzellmann I have prepared a conversion list, in the first instance tackling just the most popular abbreviations and the simple cases that end with the , number pattern. Could you have a glance at the conversion suggestions below and let me know if they look reasonable?
IGLS: http://clas-lgpn4.classics.ox.ac.uk:8080/exist/apps/lgpn-editor/modules/tools/biblLines.xq?bibl=IGLS
SEG: http://clas-lgpn4.classics.ox.ac.uk:8080/exist/apps/lgpn-editor/modules/tools/biblLines.xq?bibl=SEG
IG: http://clas-lgpn4.classics.ox.ac.uk:8080/exist/apps/lgpn-editor/modules/tools/biblLines.xq?bibl=IG
Thanks, I've fixed the link so it leads to the person input form.
I will run the conversion now for IGLS, SEG and IG and attach the logs here.
After running the conversion, other cases remain that contain a comma but do not match the final comma-and-number pattern.
Could you please confirm whether the following handling is appropriate?
[1]
8, 75
LV 1053 A
l. 9
and LV 1053 B
l. 15
IG: very few remaining cases, like IG XI (4) 772, 3, 15 (same as SEG case 2); the rest could be handled manually.
Please see below for answers between lines
On Apr 22, 2020, at 1:58 PM, Magdalena Turska wrote:
After running the conversion, other cases remain that contain a comma but do not match the final comma-and-number pattern:
SEG: http://clas-lgpn4.classics.ox.ac.uk:8080/exist/apps/lgpn-editor/modules/bibl-lines.xq?bibl=SEG
Could you please confirm whether the following handling is appropriate?
Correct
Correct
Correct
Correct, B and D are part of the “details” and not the line number
As we're slowly converting database entries, I'm now working on the LaTeX generating scripts.
Here's a test case: for Γέμελλα in Heliopolis we should have
(2) IGLS vi 2751, 3 (3) ib. l.4
Original bibl. entry for (3) is IGLS vi 2751, 4
Correct, thanks. I am still working through your list of the Yes / Maybe / No entries.
Yes, I saw you were working in the Google doc, many thanks!
Meanwhile I have made some progress with presenting ib. with lines, but I need to test that there are no regressions in other cases.
Might be worth checking with Richard, but I believe there should be a space after l., i.e. here “ib. l. 4”.
Thanks, fixed
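For reference, the expected "ib." behaviour from the Γέμελλα test case can be sketched as below. This is a Python illustration rather than the actual LaTeX generating script; `render_refs` is a hypothetical name, and printing a bare "ib." when a repeated reference has no line number is an assumption that is not confirmed anywhere in this thread.

```python
def render_refs(entries):
    """Render (ref, line) pairs in print order, abbreviating repeats with 'ib.'.

    A reference identical to the one printed just before it is replaced by
    'ib.', followed by 'l. N' (with a space after 'l.') when a line is stored.
    Assumption: a repeated reference without a line is printed as bare 'ib.'.
    """
    rendered = []
    previous_ref = None
    for ref, line in entries:
        if ref == previous_ref:
            rendered.append(f"ib. l. {line}" if line else "ib.")
        else:
            rendered.append(f"{ref}, {line}" if line else ref)
        previous_ref = ref
    return rendered


print(render_refs([("IGLS vi 2751", "3"), ("IGLS vi 2751", "4")]))
# ['IGLS vi 2751, 3', 'ib. l. 4']
```

Keeping the line number in its own field is what makes the comparison on `ref` possible; with the old combined string the two entries would never compare as equal.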
Thanks to Michael's list I could convert further entries matching the final comma-number pattern for the following abbreviations (log file attached):
"IPalTertia", "ISyrie", "AAES", "ITyr", "IGerasa", "MUSJ", "ZDPV", "IWadi_Haggag", "YCS", "Nessana", "IJO", "Hajjar", "IPalTertia_west", "Dussaud_Macler_Mission", "IMSoueida", "SEMA", "INegev", "Lörincz", "PEQ", "DainIGLouvre", "MFO", "Mouterde_Limes", "BCH", "ILS", "IIasos", "CIJ", "IDR", "Ovadiah_MPI", "Resafa", "FroehnerInscrLouvre", "SBF", "PMasada", "Topoi", "PferdehirtMilitärdiplome", "IGR", "KayserRecueil", "Mittmann_Beiträge", "ISmyrna", "RMD", "Clermont_Ganneau_RAO", "DOP", "IAntMaroc", "BAAL", "IAquil", "RA", "JIWE", "Pall", "Brünnow_Domaszewski_PA", "IEJ", "MendelCat", "CrowfootObjectsfromSamaria", "Old_Syriac_Inscriptions"
Here are the counts of entries for each abbreviation that currently have the line field filled: singlecomma-Michaelslist-log.html.zip
After converting the single comma-number pattern matches for selected abbreviations yesterday, today I've prepared the conversion for patterns where there are multiple comma-separated numbers at the end and/or some numbers are in brackets (cases 1 and 2 as discussed here: https://github.com/eXistSolutions/LGPN/issues/284#issuecomment-617763284).
I've run the would-be conversion (generating new values but without applying them) for a handful of the most common abbreviations: biblLines.pdf (https://github.com/eXistSolutions/LGPN/files/4551310/biblLines.pdf)
Looking at these results, I'd suggest to
1. "IGLS", "SEG", "CIIP", "IG", "TEAD", "ISyrie", "IMnBeyrouth", "AAES"
2. "PDura", "PNess", "J"
There are no matches for other most common abbreviations: "ChLA", "RE", "Meimaris_Chronological_Systems", "FRA", "SchiefferACOIndexProsopogr", "DCB", "IPalTertia", "PLRE", "Justi", "IMoab", "PIR2"
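A would-be conversion of this kind (compute the new values, apply nothing) could look roughly like the Python sketch below. It is not the XQuery that was actually run. The names `propose` and `dry_run` are hypothetical, and the rule that every trailing comma-separated number belongs in the line field is an assumption for illustration only; the handling agreed for cases 1 and 2 may differ, and numbers in brackets are not covered here.

```python
import re

# Assumption: all trailing ", number" groups are line numbers.
TRAILING_NUMBERS = re.compile(r"^(?P<ref>.*?\S)(?P<nums>(?:, \d+)+)$")


def propose(reference):
    """Return the proposed (ref, line) split, or None if nothing matches."""
    m = TRAILING_NUMBERS.match(reference)
    if not m:
        return None
    line = m.group("nums")[2:]  # drop the leading ", "
    return m.group("ref"), line


def dry_run(references):
    """Collect proposals and non-matches without modifying any data."""
    proposals, unmatched = [], []
    for reference in references:
        result = propose(reference)
        if result:
            proposals.append((reference, result[0], result[1]))
        else:
            unmatched.append(reference)
    return proposals, unmatched


proposals, unmatched = dry_run([
    "IG XI (4) 772, 3, 15",
    "IGLS vi 2751, 4",
    "CIIP I (2) 842.15",
])
for original, ref, line in proposals:
    print(f"{original} -> ref: {ref} | line: {line}")
print("left for manual checks:", unmatched)
```

Reporting the non-matches separately gives a ready list of entries to check by hand before anything is written back.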
Thanks, this looks ok for 1. Definitely not “J” in 2. as that is a literary text, it has no line numbers. PDura and PNess will be mostly long strings with many line numbers separated by commas, which can be done manually if not automated.
Thanks for the super-fast response, I will run it in the evening then (after 6pm in Oxford and after triggering a backup, as usual).
I've just run the conversion for "IGLS", "SEG", "CIIP", "IG", "TEAD", "ISyrie", "IMnBeyrouth", "AAES"; logs are attached.
Current numbers for entries with line field filled
IGLS XXI.5 29, 1 -> IGLS XXI.5 29, 1
IGLS XXI.5 29, 2 -> ib. l. 2