eXistSolutions / LGPN


[typesetting] gathered remarks on V6 #292

Closed tuurma closed 3 years ago

tuurma commented 3 years ago

LGPN VI : TYPESETTING ISSUES 22/10/20

tuurma commented 3 years ago

Place order is normally fine. But there are some inexplicable anomalies: Σώσιππος 3 (Apameia), 4 (Emesa), 5 (Apameia).

Anomaly due to out-of-date nested places register, fixed after rebuilding via Admin -> Regenerate list of places

tuurma commented 3 years ago

UP (=unpublished) should appear on the printed page as Unp. It might simplify matters if it was input from the beginning as Unp. and all UP refs converted.

Followed the suggestion and converted all UP into Unp.

tuurma commented 3 years ago

Generation numbers are still a problem. Sometimes omitted altogether, e.g. Σαιανος 1 and 2. Sometimes the numbers are wrong, e.g. Σαπώρης 6 and 8. Sometimes a generation number is added to the wrong person, e.g. Σευῆρος 134, Σιλουανός 121.

I think this is fixed now: there was a combination of reasons - in some cases out-of-date nested persons register, sometimes duplicate counts due to redundancy in paternal/maternal ancestry lines. I have refreshed the register (via Admin -> Regenerate generations index) and covered the edge cases for Σαπώρης and Σιλουανός

NB I can't find any Σευῆρος entries

tuurma commented 3 years ago

Adoptive relationships appear as e.g. s. foster rather than s. (ad.) etc.

Fixed but not sure if this is the expected output? Διογένης 73

image

tuurma commented 3 years ago

For some literary refs the stop and comma following the author abbreviation get omitted – notably for A., Pers., Phot., Bibl. and Procop., Arc. This happens where the same abbreviation is authorized with and without the comma. How do we get around this?

I have removed double entries for:

Some duplicates I couldn't decide what to do with:

| ID | Abbr | Full title |
|---|---|---|
| SymMet | Sym. Met., | Symeon Metanoites |
| SymMet | +Sym +Met. | +Symeon +Metaphrastes |

There are some entries which do not share exact identifiers but are clearly or possibly duplicates, e.g.:

| ID | Abbr | Full title |
|---|---|---|
| Ag | +Ag. | +The +Athenian +Agora. +Results +of +Excavations +conducted +by +the +American +School +of +Classical +Studies +at +Athens, 1-- (Princeton, 1953-- ) |
| Ag. | Ag. | +The +Athenian +Agora. +Results +of +Excavations +conducted +by +the +American +School +of +Classical +Studies +at +Athens, (Princeton, 1953--) |
| BachmanAnec | Bachman, +Anec. | |
| BachmannAnec | Bachmann, +Anec. | +Anecdota +Graeca, ed. L. Bachmann (Leipzig, 1828--9) |
| ScholAR | Schol. A.R. | ed. C. Wendel, +Scholia +in +A.R. +vetera (Berlin, 1935) |
| ScholAr | Schol. Ar., | |

Full list can be viewed at: http://clas-lgpn4.classics.ox.ac.uk:8080/exist/apps/lgpn-editor/modules/tools/references.xq
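
For spotting further candidates of this kind, here is a rough sketch in Python (not the editor's actual code; the function names and the 0.9 similarity threshold are my own assumptions). It pairs identifiers that are equal once case and punctuation are ignored, or that are nearly equal. The output is only a list of candidates for manual review: ScholAR and ScholAr, for instance, may well be distinct works.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalise_id(ref_id):
    """Lower-case the identifier and drop everything except letters and digits."""
    return re.sub(r"[^0-9a-z]", "", ref_id.casefold())


def possible_duplicates(ref_ids, threshold=0.9):
    """Return pairs of identifiers that look like duplicates of one another.

    A pair qualifies when the normalised forms are identical (Ag / Ag.)
    or very similar (BachmanAnec / BachmannAnec). The threshold is an
    arbitrary starting point, not a tested value.
    """
    pairs = []
    for a, b in combinations(sorted(set(ref_ids)), 2):
        na, nb = normalise_id(a), normalise_id(b)
        if na == nb or SequenceMatcher(None, na, nb).ratio() >= threshold:
            pairs.append((a, b))
    return pairs


ids = ["Ag", "Ag.", "BachmanAnec", "BachmannAnec", "ScholAR", "ScholAr", "SEG"]
for pair in possible_duplicates(ids):
    print(pair)
```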

tuurma commented 3 years ago

I also have wondered if it would be possible to convert the many date entries which span two successive years based on an ancient dating formula and appear as e.g. 241-242 AD to a rather more compact 241/2 AD, which is the way such dates would normally be represented. I need to consult with Michael and J-S about this but would like to know if it is feasible.

Yes, possible and should not be too complicated.

tuurma commented 3 years ago

The system does not distinguish names spelled the same but differently accented, e.g. Σῶσις and Σωσίς. This is something of fundamental importance.

Fixed now. I'd appreciate checking if alphabetical ordering is now correct - attaching the recent Σ file. Σ (5).pdf
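
For reference, the idea can be sketched in Python (an illustration only, not the code used in the application; `base_letters` and `sort_key` are hypothetical names). Headings are ordered by their bare letters first, and the accented form is kept as a secondary key so that homographs such as Σῶσις and Σωσίς remain separate headings. Which of two homographs should come first is an editorial decision; the tie-break below is just the code-point order of the decomposed form.

```python
import unicodedata


def base_letters(name):
    """Strip accents and breathings, then fold case (final sigma included)."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def sort_key(name):
    # Primary key: bare letters. Secondary key: the fully accented form,
    # so differently accented homographs are never merged into one heading.
    return (base_letters(name), unicodedata.normalize("NFD", name))


names = ["Σωσίς", "Σῶσις", "Σώσιππος", "Σῶσις"]
headings = sorted(set(names), key=sort_key)
print(headings)
```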

image

tuurma commented 3 years ago

Sometimes something triggers a repeat of a series of entries. I cannot detect any clear pattern but it occurs where a place heading is followed by one very similar – e.g. Antiocheia-Theoupolis? (territ.) and Antiocheia-Theoupolis?, see Στέφανος 271-278.

I can confirm that the issue persists:

image

Entries with both ? and territ. markers are repeated, whereas they should be captured only in the antiocheia–theoupolis? (territ.) section


Current state: almost good, just superfluous query marker in ?? (territ.)

image

tuurma commented 3 years ago

n.pr. should appear in brackets. I wonder if it would work better if it was added as text in the Final Bracket field, rather than be indicated in the Type box in the Name field.

I could add brackets around, just for cases where FB doesn't contain anything else I'd need to strip them again. Is it a valid use case though? Just to have n.pr.? See last example below, Σαβαυν.

Current state:

image

image

image

tuurma commented 3 years ago

Multiple indications of profession, office, religion etc. need clearer structure. Sometimes they are separated by a : (e.g. saint: abbot), sometimes no punctuation (e.g. Jew priest). In previous vols. I am fairly sure that a comma has been used to separate multiple professions/offices (e.g. deacon, presbyt.) and a slash for office and religion (e.g. priest/Jew). I need to check. In general, a clear structure for the order of the entire contents of the FB is needed, some of which is our responsibility.

We have the system in place for Office/Religion/Profession etc where one assigns a main category and then a subcategory, if applicable. Values for category/subcategory pairs are perhaps not perfectly consistent, e.g. in Religion we have the main category priest with subcategories civic/private/federal and another main category Jew with subcat. presbyter/priest/deacon.... Basic rule for linking category and subcategory is with just a space, no punctuation, so Jew priest and : between multiple entries, so potentially Jew priest: archon (sorry, idiotic example).

I'm happy to streamline this, but we'd need a revised category/subcategory system (and convert the entries which thus require changes).

image

tuurma commented 3 years ago

Information given in the final bracket does not appear when a person figures as a relative. So people with a double name connected by some formula (e.g. ὁ καὶ) lose that information as a relative. This can result in erroneous accentuation appearing in the FB (e.g. Σεανιος 1 where the son appears as Ἰταλός Ταμαλατος instead of Ἰταλὸς ὁ Ταμαλλατος). Even more serious is the loss of the Roman praenomen and nomen which only appear in the FB and the section for Roman names. In all previous vols. the Roman elements of the name are given when the person occurs as a relative in the FB. This may be a problem connected with the way we structured the input form, but could you see if there is any way of getting around it?

If I understand right, the relative name should be using its own FB formulation (from the manual input), if available?

image

I can do this; I'm just concerned that the FB sometimes contains additional info which may look odd. We could potentially divide the FB field into two, one for the name part and one for "the rest", to avoid this issue. Also please note that your example is entered in the FB as Ἰτ(α)λὸς Ταμα[λα]τος, without the connecting ὁ, and the connection is not noted in the 'linking' field of either name. We're dealing with a combination of issues, some in my algorithms to determine the name, but some in the data entry.

tuurma commented 3 years ago

(biling.) does not always output on page.

Bit difficult to find such case. Do you happen to have it noted somewhere? I see around 100 cases where the output includes biling. for Σ but harder to find the needle which is not in the haystack ;-)

tuurma commented 3 years ago

occasionally successive refs to the same work repeat the title where it should be dropped. E.g. Σαγιος 10 SEG VII 1232, 5 + SEG XIX 896 which should reduce to SEG VII 1232, 5 + XIX 896. I cannot see any underlying pattern here.

Fixed
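
The basic rule, as a minimal sketch in Python (the data shape, a list of (work, locus) pairs, and the name `join_refs` are assumptions for illustration, not the application's actual model): the work title is printed only when it differs from that of the preceding ref. The sketch deliberately ignores the literary-title and bracketed-ref cases raised elsewhere in this thread.

```python
def join_refs(refs, sep="; "):
    """Join (work, locus) pairs, dropping a work title that repeats the previous one."""
    parts = []
    previous_work = None
    for work, locus in refs:
        if work == previous_work:
            parts.append(locus)              # same work as the ref before: title dropped
        else:
            parts.append(f"{work} {locus}")  # new work: title printed in full
        previous_work = work
    return sep.join(parts)


print(join_refs([("SEG", "VII 1232, 5"), ("SEG", "XIX 896")], sep=" + "))
# SEG VII 1232, 5 + XIX 896
```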

image

tuurma commented 3 years ago

there are still problems but now unpredictable where in a string of refs from the same source but interrupted by bracketed refs the primary ref is not repeated. E.g. Σαβῖνος 42 IGLS XIII (1) 9084 (Hackl, Nabataer p. 185 no. F.007.04); 9104; instead of IGLS XIII (1) 9084 (Hackl, Nabataer p. 185 no. F.007.04); IGLS XIII (1) 9104; I cannot see any underlying pattern here.

image

tuurma commented 3 years ago

The list of refs sometimes lacks the ; separator, e.g. Σέργιος 20 – ib. LVI 1903, 6 (mosaic)SEG LXIV 1801, 7

image

tuurma commented 3 years ago

current state

Σ (6).pdf

tuurma commented 3 years ago

Ideally, in relationships the father should precede the mother, complete name should precede fragmentary names.

I couldn't find this exact situation but found a similar one with brothers and sisters mixed, so introduced sorting by gender here

before:

image

after:

image
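
The ordering rule can be sketched roughly as follows in Python. Everything here is hypothetical: the `Relative` record, the explicit `fragmentary` flag and the ranking table are illustrations, and I am assuming that the relation takes priority over completeness.

```python
from dataclasses import dataclass


@dataclass
class Relative:
    relation: str              # e.g. "f.", "m.", "s.", "d."
    name: str
    fragmentary: bool = False  # assumed to be recorded explicitly at data entry


# Assumed ranking: the male relation precedes its female counterpart.
RELATION_RANK = {"f.": 0, "m.": 1, "s.": 2, "d.": 3}


def relative_key(rel):
    # Unknown relations sort last; within one relation, complete names come first.
    return (RELATION_RANK.get(rel.relation, len(RELATION_RANK)), rel.fragmentary)


relatives = [
    Relative("m.", "MotherName"),
    Relative("f.", "Fragm--", fragmentary=True),
    Relative("f.", "FatherName"),
]
ordered = sorted(relatives, key=relative_key)  # sorted() is stable, so ties keep input order
print([f"{r.relation} {r.name}" for r in ordered])
```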

tuurma commented 3 years ago

In literary refs where the author is followed by a book number, the number should be in lower-case Roman numerals, not small caps. E.g. Hdt. viii 57, not Hdt. VIII 57 (small caps).

I have assumed that references registered with SMALLROMAN pattern should get this treatment.

image

tuurma commented 3 years ago

Literary refs, book titles by same author – never reduce the book title. This applies especially to Cyr. S. where the titles all begin with V. (for Vita) and the system strips it off where there is a string of refs, e.g. Cyr. S., V. Euthym. 10; 35; V. Sab. 10 which is reduced to Cyr. S., V. Euthym. 10; 35; Sab. 10. See Σαλλούστιος 2.

Can you help me to figure out which references should get this treatment? Cyr. S. is registered with ITALARAB BP1 pattern, is one of these a clue?

RichardLGPN commented 3 years ago

NB I can't find any Σευῆρος entries

I should have written Σεουῆρος under which all spelling variants are listed

RichardLGPN commented 3 years ago

Adoptive relationships appear as e.g. s. foster rather than s. (ad.) etc.

Fixed but not sure if this is the expected output? Διογένης 73

s. (nat.) / d. (nat.) / f. (nat.) / m. (nat.) should come before s. (ad.) / d. (ad.) / f. (ad.) / m. (ad.)

RichardLGPN commented 3 years ago

For some literary refs the stop and comma following the author abbreviation get omitted – notably for A., Pers., Phot., Bibl. and Procop., Arc. This happens where the same abbreviation is authorized with and without the comma. How do we get around this?

I have removed double entries for:

I am not sure if this is going to be a problem. For some authors (e.g. Procopius) there are refs to works which lack a title (so Procop. ii 55. 3) and others to a named work (e.g. Procop., +Arc. 14). However, at some point a thorough cleansing of our bibliographical abbreviations is needed.

RichardLGPN commented 3 years ago

The system does not distinguish names spelled the same but differently accented, e.g. Σῶσις and Σωσίς. This is something of fundamental importance.

Fixed now. I'd appreciate checking if alphabetical ordering is now correct - attaching the recent Σ file.

Checked and correct.

RichardLGPN commented 3 years ago

n.pr. should appear in brackets. I wonder if it would work better if it was added as text in the Final Bracket field, rather than be indicated in the Type box in the Name field.

I could add brackets around, just for cases where FB doesn't contain anything else I'd need to strip them again. Is it a valid use case though? Just to have n.pr.? See last example below, Σαβαυν.

Current state:

more information available in the FB section (e.g. Συδδηνος, Σαθηφελας):

The three examples you show look correct to me.

RichardLGPN commented 3 years ago

Multiple indications of profession, office, religion etc. need clearer structure. Sometimes they are separated by a : (e.g. saint: abbot), sometimes no punctuation (e.g. Jew priest). In previous vols. I am fairly sure that a comma has been used to separate multiple professions/offices (e.g. deacon, presbyt.) and a slash for office and religion (e.g. priest/Jew). I need to check. In general, a clear structure for the order of the entire contents of the FB is needed, some of which is our responsibility.

We have the system in place for Office/Religion/Profession etc where one assigns a main category and then a subcategory, if applicable. Values for category/subcategory pairs are perhaps not perfectly consistent, e.g. in Religion we have the main category priest with subcategories civic/private/federal and another main category Jew with subcat. presbyter/priest/deacon.... Basic rule for linking category and subcategory is with just a space, no punctuation, so Jew priest and : between multiple entries, so potentially Jew priest: archon (sorry, idiotic example).

I'm happy to streamline this, but we'd need a revised category/subcategory system (and convert the entries which thus require changes).

I have checked in the published volumes, and find that all multiple professions, statuses, etc. are separated by a slash (/), so Jew/presbyt. martyr/saint and so on.

RichardLGPN commented 3 years ago

(biling.) does not always output on page.

Bit difficult to find such case. Do you happen to have it noted somewhere? I see around 100 cases where the output includes biling. for Σ but harder to find the needle which is not in the haystack ;-)

In most cases the absence of (biling.) is an omission on our part. However, where it is attached to the second of two linked references, it gets dropped -- e.g. Ναιμιας 1.

RichardLGPN commented 3 years ago

In literary refs where the author is followed by a book number, the number should be in lower-case Roman numerals, not small caps. E.g. Hdt. viii 57, not Hdt. VIII 57 (small caps).

I have assumed that references registered with SMALLROMAN pattern should get this treatment.

Correct.

tuurma commented 3 years ago

Διογένης 73

s. (nat.) / d. (nat.) / f. (nat.) / m. (nat.) should come before s. (ad.) / d. (ad.) / f. (ad.) / m. (ad.)

image

tuurma commented 3 years ago

In most cases the absence of (biling.) is an omission on our part. However, where it is attached to the second of two linked references, it gets dropped -- e.g. Ναιμιας 1.

Fixed

image

tuurma commented 3 years ago

Subsequent dates:

image

RichardLGPN commented 3 years ago

Dates sample looks fine.


From: Magdalena Turska notifications@github.com Sent: 28 January 2021 20:19 To: eXistSolutions/LGPN LGPN@noreply.github.com Cc: Richard Catling richard.catling@classics.ox.ac.uk; Comment comment@noreply.github.com Subject: Re: [eXistSolutions/LGPN] [typesetting] gathered remarks on V6 (#292)

Subsequent dates:

[image]https://user-images.githubusercontent.com/449468/106193919-6f090d00-61ae-11eb-9289-e9d2f0d1ea52.png


RichardLGPN commented 3 years ago

Is there a way of tracking entries where no date outputs because an exact date is categorized as a period date or a period date as an exact date or any other such mismatch?

Richard


tuurma commented 3 years ago

extracted remaining notes into separate issues, closing this one

RichardLGPN commented 3 years ago

I have noted some problems with the reduction of dates we requested from the pattern e.g. 241-242 to 241/2.

Where a date crosses a decade it produces an erroneous result. So 249-250 is reduced to 249/0, when the desired result is of course 249/50. The same principle would of course apply when the date straddles a century, e.g. 299-300 reduces to 299/0 instead of 299/300.

Can these cases be managed by your script?
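
One possible rule that covers both cases, sketched here in Python rather than in the project's own code (the function name `compact_year_span` is hypothetical): keep the first year in full and print the second year from its first differing digit onwards. Spans that are not two successive years are left unchanged, which is my assumption about the intended scope.

```python
def compact_year_span(start, end):
    """Compact two successive years, e.g. 241 and 242, into '241/2'."""
    first, second = str(start), str(end)
    if abs(end - start) != 1:
        return f"{first}-{second}"      # not successive years: leave as a range
    if len(first) != len(second):
        return f"{first}/{second}"      # e.g. 99/100: nothing can be dropped
    i = 0
    while first[i] == second[i]:        # the years differ, so this always stops
        i += 1
    return f"{first}/{second[i:]}"


for pair in [(241, 242), (249, 250), (299, 300)]:
    print(compact_year_span(*pair))
# 241/2
# 249/50
# 299/300
```

The same rule works for descending BC dates, e.g. 242-241 gives 242/1.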


From: Magdalena Turska notifications@github.com Sent: 02 February 2021 09:54 To: eXistSolutions/LGPN LGPN@noreply.github.com Cc: Richard Catling richard.catling@classics.ox.ac.uk; Comment comment@noreply.github.com Subject: Re: [eXistSolutions/LGPN] [typesetting] gathered remarks on V6 (#292)

extracted remaining notes into separate issues, closing this one
