HUWA import: manuscripts to JSON

ErwinKomen commented 2 years ago

Part of issue #530

Convert the manuscripts in the HUWA database to JSON

ErwinKomen commented 2 years ago

Below is an example of Huwa inhalt.

Questions:

There is no handschrift with id 0, but the first section has 16 items linked to it. What should be done with the first section?
How do the fields von_bis and bis_f need to be interpreted?
1. It looks like the number before the period is the folio number / page number?
2. But then for Handschrift 3, there are numbers after the period - what does that point to?
There are quite a few records in inhalt that have an opera id 0. That means these are not connected with a sermon (manifestation) within the Huwa database. What should be done with these entries?

ErwinKomen commented 2 years ago

Implementation

Make use of the existing EqualGoldHuwaToJson
1. Read the library info from huwa_passim_library.json
2. Add in the output a section manuscripts
Use ManuscriptHuwaToJson, which uses EqualGoldHuwaToJson, but with import_type set to manu
1. Edit the code in EqualGoldHuwaToJson, in the get_data() method

MennaRempt commented 2 years ago

Questions:

There is no handschrift with id 0, but the first section has 16 items linked to it. What should be done with the first section?

How do the fields von_bis and bis_f need to be interpreted?

'von_bis' should be the page/folio number where the manifestation starts and 'bis_f' the page/folio number where the manifestation ends (although there are some very strange numbers here, like negative numbers, so we have to ask CW for additional explanation). NB. for the locus, the fields 'von_bis' + 'von_rv' have to be combined to form the start folio/page number, and the fields 'bis_f' + 'bis_rv' have to be combined to form the end page/folio number.
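A minimal sketch of how the four fields could be combined into a locus string. The helper name build_locus and the output format (e.g. 14r-25v) are assumptions for illustration only; the project's own code may do this differently, and the strange values (negative numbers, decimals) are not handled here.

```python
def build_locus(von_bis, von_rv, bis_f, bis_rv):
    """Combine start and end folio/page fields into one locus string.

    Hypothetical helper: the field names come from the HUWA table inhalt,
    the output format is an assumption.
    """
    def part(number, side):
        # An empty number yields nothing; the r/v side is optional
        if number in (None, ""):
            return ""
        return "{}{}".format(number, side or "")

    start = part(von_bis, von_rv)
    end = part(bis_f, bis_rv)
    # Only show a range when start and end really differ
    if start and end and start != end:
        return "{}-{}".format(start, end)
    return start or end


print(build_locus(14, "r", 25, "v"))
```

If start and end are identical, or one of them is missing, the sketch falls back to a single folio/page.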

It looks like the number before the period is the folio number / page number?

But then for Handschrift 3, there are numbers after the period - what does that point to?

The numbers after the period could potentially indicate the line number at which the manifestation starts, but this does not seem to be compatible with what I see when I scroll down the list in the database... We have to ask for additional explanation here, too.

Handschrift no. 3 is strange anyway, since it has no shelfmark and links to an empty library record (no name), which links to an empty location record (no name, not linked to a country). We'll ask CW what to do with this record, exactly.

There are quite a few records in inhalt that have an opera id 0. That means these are not connected with a sermon (manifestation) within the Huwa database. What should be done with these entries?

These should be imported as sermon manifestations, but not linked to AFs.

ErwinKomen commented 2 years ago

There are quite a few records in inhalt that have an opera id 0. That means these are not connected with a sermon (manifestation) within the Huwa database. What should be done with these entries?

These should be imported as sermon manifestations, but not linked to AFs.

EK: I'm not sure how that works. If opera id is 0, then there is no matching opera record, hence no sermon manifestation specification. So would you like me to just add 'empty' sermon manifestations for those situations?

MennaRempt commented 2 years ago

There are quite a few records in inhalt that have an opera id 0. That means these are not connected with a sermon (manifestation) within the Huwa database. What should be done with these entries?

These should be imported as sermon manifestations, but not linked to AFs.

EK: I'm not sure how that works. If opera id is 0, then there is no matching opera record, hence no sermon manifestation specification. So would you like me to just add 'empty' sermon manifestations for those situations?

No, as sermon manifestations with the information found in the table inhalt (i.e. locus) and in the other tables linked to inhalt:

The document "HUWA-mapping_01.xlsx" (in the wrkgrp folder, data > HUWA database) gives the corresponding PASSIM fields for the HUWA ones.

ErwinKomen commented 2 years ago

(responding to the above): I see now. Sorry for the confusion! Importing manuscripts works a bit differently...

ErwinKomen commented 2 years ago

Linking Sermons to SSG/AF

Every SSG/AF derived from HUWA gets a record in table EqualGoldExternal. This record holds the opera id (as field externalid) as well as the SSG/AF id (as field equal.id).
During the import of a sermon manifestation, the inhalt record might e.g. contain a reference to opera id 143. We should then check whether EqualGoldExternal contains a link to an SSG/AF for this externalid of 143. And then we should make a link from the newly imported sermon manifestation to the corresponding SSG/AF (via table SermonDescrEqual).
Taking opera id 143 as an example, that would mean we are talking about appr. 161 links (from the HUWA-imported sermon manifestation to the SSG/AF belonging to opera id 143)
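The lookup described above could look roughly like the sketch below. Plain lists of dicts stand in for the tables EqualGoldExternal and SermonDescrEqual; the key names equal_id, sermon and super, and both function names, are assumptions, not the real model fields.

```python
def find_ssg_id(opera_id, external_links):
    """Return the SSG/AF id linked to a HUWA opera id, or None.

    external_links stands in for table EqualGoldExternal (assumed layout).
    """
    if not opera_id:
        # opera id 0 has no opera record, so there is nothing to link to
        return None
    for link in external_links:
        if link["externalid"] == opera_id:
            return link["equal_id"]
    return None


def link_sermon(sermon_id, opera_id, external_links, sermon_equal_links):
    """Add a sermon-to-SSG link (stand-in for SermonDescrEqual) if one exists."""
    ssg_id = find_ssg_id(opera_id, external_links)
    if ssg_id is not None:
        sermon_equal_links.append({"sermon": sermon_id, "super": ssg_id})
    return ssg_id


externals = [{"externalid": 143, "equal_id": 9001}]
links = []
link_sermon(1, 143, externals, links)
link_sermon(2, 0, externals, links)
print(links)
```

A sermon with opera id 0 is still imported, it just does not receive a link.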

ErwinKomen commented 2 years ago

Okay, picking this up again...

Added interface to activate downloading HUWA manuscript as JSON (Manuscript list view) (19/sep/2022)
Added HUWA (german) country to Passim Location table country conversion (19/sep/2022)
Added get_locus() to convert the von/bis fields into a Passim locus string (21/sep/2022)
Added getting the author from the inhalt: (21/sep/2022)
1. Look in table inms for the field inhalt to see if it matches there
2. Then take the id field of the found record in inms and look it up in autor_inms to find the autor number
3. The autor number is the id of the 'gold' author table autor
4. One problem: the table huwa_passim_author.json contains some entries where there are two HUWA author id's for one Passim id. But note that there is little to no ambiguity...
Extract information from handschrift about the Codex: (21/sep/2022)
1. Add this information to the Manuscript object, which then assumes that each manuscript only has one codicological unit
2. Codico field support: use field material
3. Codico field extent: use fields fol_pag, folbl, vors_vorne, vors_hinten, col, col_breite, zeilen
4. Codico field format: use fields format, hs_breite, schrift_hoehe, schrift_breite
Sermons get into the list of MsItem (21/sep/2022)
The MsItem list gets into the manuscript entry (21/sep/2022)
One thing to consider: (22/sep/2022)
1. Right now I've been adding ssglinks: [], filling in the actual Passim SSG link id's that can be taken up. But since I'm doing it right now, it means that it will only include those SSGs, that have so far been determined (i.e. from HUWA). There still is a backlog of SSGs to be extracted from the HUWA data.
2. And I think we did it differently for SERMONES. We didn't fill in the ssglinks parameter, but instead filled in information at the signaturesA: those are the signatures of the SGs that point to SSGs that need to be connected with a particular sermon.
  1. See ManuscriptUploadJson, where Manuscript.custom_add() is being called with these data, and then SermonDescr.custom_add() in turn, which then calls custom_set() for the signaturesA part.
  2. But this code does not actually add a SG.
  3. What it does do is: create a link between S and SSG, if an SG (with link to SSG) is already existing
3. Okay, I simply put the opera-to-signature code into a function get_opera_signatures(), and I've now added signaturesA to the output per sermon
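
The author lookup from the list above (inhalt, then inms, then autor_inms) can be sketched as follows. The column names inms and autor inside autor_inms are assumptions, as is the function name; the tables are represented as lists of dicts.

```python
def get_author_id(inhalt_id, inms_rows, autor_inms_rows):
    """Follow inhalt -> inms -> autor_inms and return the HUWA autor id.

    Returns None when no author can be found. Column names are assumed.
    """
    # Step 1: find the inms record whose field 'inhalt' matches
    inms_id = None
    for row in inms_rows:
        if row["inhalt"] == inhalt_id:
            inms_id = row["id"]
            break
    if inms_id is None:
        return None

    # Step 2: look the inms id up in autor_inms to get the autor number
    for row in autor_inms_rows:
        if row["inms"] == inms_id:
            # Step 3: this number is the id in table autor
            return row["autor"]
    return None


inms = [{"id": 10, "inhalt": 500}]
autor_inms = [{"inms": 10, "autor": 7}]
print(get_author_id(500, inms, autor_inms))
```

Mapping the resulting HUWA autor id to a Passim author would then go through huwa_passim_author.json, which is not shown here.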

Results

Statistics of transforming the HUWA manuscripts (with sermons) into JSON:

Item	Count
Manuscripts	8444
Sermons	57961

ErwinKomen commented 2 years ago

So, in principle, all of the above is working. Closing this issue now...

ErwinKomen commented 2 years ago

Okay, from issue #534, there is one thing that should be done at the level of making a Manuscript JSON:

Opera with only one link to inhalt, should not be linked to an AF

That is to say, when there is:

If record from opera is referred to only once from inhalt (e.g. 11670) - this means that there is only one handschrift with this opera
Then the JSON output for this sermon should indicate that no SG nor link to SSG may be created

Resolution

Add field in the S "manu_count": nn
That way the issue #534 can take appropriate actions, depending on manu_count being 1 or higher than 1.
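
A small sketch of how manu_count could be computed, assuming each inhalt row carries the opera id in a key called opera and each sermon dict carries it as opera_id (both key names are assumptions). It counts references from inhalt per opera, as described above.

```python
from collections import Counter


def add_manu_count(sermons, lst_inhalt):
    """Add 'manu_count' to every sermon: the number of inhalt rows
    that refer to the same opera id (0 when there is no opera)."""
    counts = Counter(row["opera"] for row in lst_inhalt if row["opera"] != 0)
    for sermon in sermons:
        sermon["manu_count"] = counts.get(sermon.get("opera_id"), 0)
    return sermons


inhalt = [{"opera": 11670}, {"opera": 143}, {"opera": 143}, {"opera": 0}]
sermons = [{"opera_id": 11670}, {"opera_id": 143}, {"opera_id": 0}]
print(add_manu_count(sermons, inhalt))
```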

MennaRempt commented 2 years ago

Answers from CW to the questions above.

The fields for the folio numbers (von_bis, bis_f, von_rv and bis_rv) sometimes have strange contents, such as negative numbers and decimals. How should this be interpreted?

decimals (to be exact: the thousandths) indicate the relative position of a work in a range, e.g. 14r-25r serm. 1,2,3,4 is represented by 14.001 r 25 r sermo 1, 14.002 r 25 r sermo 2, ...
negative numbers refer to the VD ("Vorderdeckel": nos. -115 to -111; please treat them as identical, I do not know the difference any more); -110 to the first folia that are numbered separately (I); thus, -110 refers to fol. I, -109 to fol. II, ...
I had used negative numbers to make sure that these entries are listed before the normal content (on ff. 1r, ...);
similarly, folio numbers like 10000 refer to the last pages that are numbered separately (I...)
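
To illustrate the decimal convention: a value such as 14.001 can be split into the folio number (14) and the relative position within the range (1). The sketch below is an assumption about how this could be done, with a hypothetical helper name; it uses Decimal to avoid floating point rounding surprises.

```python
from decimal import Decimal


def split_folio(value):
    """Split a von_bis / bis_f value into (folio, position).

    The thousandths part is the relative position of the work in a range.
    """
    number = Decimal(str(value))
    folio = int(number)  # truncates toward zero
    position = int((abs(number) - abs(folio)) * 1000)
    return folio, position


print(split_folio(14.001))
print(split_folio("14.002"))
```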

There is no handschrift with ID 0 in the list of manuscripts, but it does exist in the table inhalt, with 16 linked items. Do you know what manuscript this can be, or is this a HUWA-internal test?

these data seem to be errors, doublettes without indication of a manuscript => delete them

Handschrift ID 3 has no shelfmark and links to an empty library record (no name), which links to an empty location record (no name, not linked to a country). It has a note "nicht löschen". What should we do with this during the import?

this empty manuscript was created to have a full set for the request (otherwise MySQL raised an error because the WHERE clause used a non-existent entry); if you do not need it (which is what I think), please delete table entries like this (there should be one in every table)

ErwinKomen commented 2 years ago

Processing the CW responses from Sep/29. Note that this is in reader/views.py EqualGoldHuwaToJson method get_data()

Implementation

Ignore (do not import) handschrift with id 0
1. Added if handschrift_id == 0: continue line in the Handschrift loop
Sermon ordering within inhalt
1. Use the thousandths after the decimal point for the order of the sermon within the manuscript
  1. That means I need to order lst_inhalt in python on: von_bis
  2. Added: lst_inhalt = sorted(lst_inhalt, key=lambda x: x['von_bis'])
2. Negative number treatment (in von_bis and in bis_f):
  1. Convert -115 ... -111 into -110
  2. Sort according to the numerical content
  3. Next, change -110, -109, -... etc into: Roman Numeral I, II, III, IV etc. (those refer to the vorderdeckel)
3. Numbers like 10000 and 9999: leave as they are - those need to be evaluated manually (it's only 9 sermons)
Action if a handschrift has bibliothek = 2 and/or it has signatur empty:
1. Right now: if both conditions hold (1 manuscript): skip it
2. Update (3/oct): if bibliothek == 2, don't read. Otherwise: do read (ignore presence/absence of signatur)
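
The steps above could be sketched as below. This is not the code from get_data(); the helper names and the key handschrift on inhalt rows are assumptions. Following the list, -115 ... -111 are folded into -110 before the conversion to Roman numerals.

```python
def to_roman(number):
    """Convert a positive integer to a Roman numeral."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    result = []
    for value, symbol in pairs:
        while number >= value:
            result.append(symbol)
            number -= value
    return "".join(result)


def folio_label(folio):
    """Turn a von_bis / bis_f number into a display label."""
    folio = int(folio)
    # Treat -115 ... -111 as -110
    if -115 <= folio <= -111:
        folio = -110
    # -110 -> I, -109 -> II, ...
    if -110 <= folio < 0:
        return to_roman(folio + 111)
    # Everything else (including 9999 / 10000) is left as it is
    return str(folio)


def order_inhalt(lst_inhalt):
    """Skip rows of handschrift 0 and order the rest on von_bis."""
    rows = [row for row in lst_inhalt if row["handschrift"] != 0]
    return sorted(rows, key=lambda x: x["von_bis"])


print(folio_label(-109), folio_label(14))
```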

ErwinKomen commented 2 years ago

Read it again (5/oct) with latest counts:

date read	manuscripts	sermons
19/sep/2022	8444	57961
5/oct/2022	8444	57928

ErwinKomen commented 2 years ago

Problem: The JSON produced is not completely valid: the last object has a comma after it, where it should not.
Resolution:
1. Double check the last JSON entry in the list and physically remove the string-final comma, if it has been placed there inadvertently

The above works well now.
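
As an aside, a different way to avoid trailing commas altogether (not what was done here) is to collect all objects in a Python list and let the json module serialize the list in one go. The function name and the top-level key manuscripts are assumptions for this sketch.

```python
import json


def manuscripts_to_json(manuscripts):
    """Serialize the whole list at once, so separators are always valid."""
    return json.dumps({"manuscripts": manuscripts}, indent=2, ensure_ascii=False)


text = manuscripts_to_json([{"id": 1}, {"id": 2}])
print(text)
```

The drawback is that the complete list has to be held in memory before it is written.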

ErwinKomen commented 2 years ago

Fix needed (see issue #534):

The datasets for sermons should be HUWA_sermons
The datasets for manuscripts should be HUWA_manuscripts

ErwinKomen commented 2 years ago

More information

Leafing through the HUWA db tables (also looking at HUWA-mapping_01.xlsx), I recognize that there are some more pieces of information that might be / should be put into JSON and then imported.

The title of a sermon is in table tit (which links to inhalt; there could be multiple titles per sermon)
The date of a manuscript/codico should be specified in the JSON at the level of the manuscript, where the field date should receive the contents of HUWA table annus (on the basis of handschrift). Note that any characters that are not numeric and not a hyphen should be removed from this field.
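
The cleaning rule for the date field (keep only digits and hyphens) can be expressed with one regular expression. The function name is hypothetical and the sample input is made up.

```python
import re


def clean_date(raw):
    """Remove every character that is not a digit or a hyphen."""
    return re.sub(r"[^0-9-]", "", raw or "")


print(clean_date("ca. 1150-1200 (?)"))
```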

Okay, need to be concise here. How is each HUWA field, intended for a Passim Manuscript (including Codico, SermonDescr and what have you) going to be processed?

These are the Passim tables involved:

Manuscript
1. ManuscriptExternal - external URL associated with this manuscript
2. ManuscriptKeyword - keywords associated with this manuscript
3. ProvenanceMan - provenance for a particular manuscript (better: see codico)
4. LitrefMan - literature reference for this particular manuscript (including page numbers)
5. CollectionMan - a collection of manuscripts. (probably not used in HUWA?)
6. Codico
  1. Daterange - start and finish year of a daterange for a codicological unit (incl ref and pages of that ref)
  2. CodicoKeyword - any keywords specific for a particular codicological unit
  3. ProvenanceCod - provenance for this particular codicological unit
  4. OriginCod - any number of origins that need to be associated with this codicological unit
  5. MsItem (just for order and hierarchy)
    1. SermonDescr
      1. invalid: Range - a range of bible references for a particular sermon
      2. BibRange - a range of bible references for a particular sermon (difference with previous?)
        1. BibVerse - one or more verses from the bibrange
      3. SermonDescrKeyword - any keywords belonging to a particular sermon
      4. CollectionSerm - collection(s) in which a sermon is.

ErwinKomen commented 2 years ago

Parent	Model	HUWA relevant tables	Status
none	`Manuscript`	`bibliothek`, `cla`, `col_bem`, `fasc`, `ff_bem`, `format_bem`, `handschrift`	partly processed
`Manuscript`	`ManuscriptExternal`	-	-
`Manuscript`	`ManuscriptKeyword`	-	-
`Manuscript`	`ProvenanceMan`	-	-
`Manuscript`	`LitrefMan`	-	-
`Manuscript`	`CollectionMan`	-	-
`Manuscript`	`Codico`	-	-
`Codico`	`Daterange`	start, finish: `annus`	ok
`Codico`	`CodicoKeyword`	-	-
`Codico`	`ProvenanceCod`	`herkunft_besitzer`	Added to JSON output
`Codico`	`OriginCod`	-	-
`Codico`	`MsItem`	-	-
`MsItem`	`SermonDescr`	-	-
`SermonDescr`	`BibRange`	-	-
`SermonDescr`	`SermonDescrKeyword`	-	-
`SermonDescr`	`CollectionSerm`	-	-
`BibRange`	`BibVerse`	-	-

ErwinKomen commented 2 years ago

Unclear what to do with this

Manuscript
1. library: Passim doesn't distinguish long/short name of library, Huwa does. Action?
2. Processing of cla: unclear what to do with it
3. Processing of col_bem: unclear what to do with it
4. Processing of fasc: unclear what to do with this
5. Processing of literatur: this must be done manually + needs input into Zotero
6. Processing of
SermonDescr (or EqualGold)
1. Processing of huwa: no clear instruction what to do with this
2. Processing of identifik: no clear instruction what to do with this
3. Processing of infine: no clear instructions
EqualGold:
1. Processing of nebenwerk

Action list

Manuscript and Codico
1. Add ff_bem.name as 'folia comment: ...' to Codico.extent - done
2. Add format_bem.name as 'format comment: ...' to Codico.format - done
3. Add provenance information from herkunft_besitzer into Codico.provenances - added to JSON output
4. Add hs_notiz information from fields text and bemerkungen to Manuscript.notes - done
5. Add schreiber information to new field Codico.scribe - done
6. Add schrift information to new field Codico.script - done

Action points for JSON-to-DB process

For the process of importing a JSON Manuscript into Passim, some of the matters above lead to action points. This is for issue #534

Process Manuscript.provenances into the appropriate Codico
Process Manuscript.scribeinfo into the appropriate Codico
Process Manuscript.scriptinfo into the appropriate Codico

ErwinKomen commented 2 years ago

Unprocessed HUWA tables

archiv
autor_editionen
bearbeiter
bhl
bhm
bloomfield
collectiones, collinhalt, col_bem - these could be processed separately?
datentraeger
edenda
editionen
fasc
faszikel
handschrift_archiv
hilfe
huwa
indiculum
infine
katalog_inhalt, katalog_name
links
literatur, literatur_archiv
loci
mat_bem
nebenwerk
personen, personen_publikationen, publikationen
reihe, reihealt
retr, rubriken
saec_bem
schoenberger, stegmueller
siglen, siglen_edd
thll
user
verfasser, verfasser_literatur
verkn
verlage
zeilen_bem
zweitsignatur

ErwinKomen commented 2 years ago

Okay, for correct processing one more step is needed:

Instead of simply adding opera_id, add this:
1. A list of external ids: [ { "externalid": (opera_id), "externaltype": "huwop" } ]

But hang on: that step had already been implemented. It's just that no JSON had been produced yet where this popped up. Well, a good thing! But now we have one: passim_huwa_manu_20221121.json
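
For reference, the external id list described above has this shape in Python. The key name externals on the sermon object and the function name are assumptions; only externalid and externaltype "huwop" come from the text.

```python
def add_externals(sermon, opera_id):
    """Attach the list of external ids to a sermon dict (assumed key name)."""
    if opera_id:
        sermon["externals"] = [
            {"externalid": opera_id, "externaltype": "huwop"}
        ]
    else:
        # No opera record: keep an empty list
        sermon["externals"] = []
    return sermon


print(add_externals({"locus": "14r-25v"}, 143))
```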

ErwinKomen commented 2 years ago

Some more processing is needed, taken over from issue #596

Processing of M (Manuscripts) and S (Sermon manifestations)

col_bem: separate from collectiones and collinhalt: gives remarks on the number of columns a manuscript has. Add in Cod. Unit: extent.
1. EK: already done
fasc, faszikel: these are both different ways to indicate numbered codicological units. If possible, for items in handschrift that are either connected to fasc or have an entry in the faszikel column, add one codicological unit for each connection, ordered as the number in fasc or faszikel indicated (if that is not possible, with as the cod. Unit name fasc_name or faszikel_name).
1. EK: okay, this is not entirely clear or straightforward (see below), but this will at least add one codicological unit name, and if there is no sermon under it, it will also cause a codicological unit to be created.
infine: manifestation: postscriptum (infine_text)
1. EK: okay, if it is there, it is now added to the JSON
mat_bem: notes on the material of the manuscript: add in codicological unit: support.
1. EK: added into the JSON with support for the manuscript
saec_bem: remarks on the date of manuscripts. Please add this in codicological unit: notes.
1. EK: the JSON now has at the manuscript level: codico_notes
siglen, siglen_edd: give information about which manuscripts were used for critical editions (siglen) and which older editions were used for critical editions (siglen_edd). This information should be added to the manuscripts in the form of a note “Used for [edition reference]”.
1. EK: the siglen information will now be added, it will be imported into Manuscript as part of the raw information. That means this needs to be interpreted correctly for issue #534
2. EK: I now added stuff from siglen_edd to the siglen information, using the editionen identifier as the crucial element to 'bind' them together.
zeilen_bem: remarks on the field ‘zeilen’ in the table handschrift; has to be added in codicological unit: extent following the information in zeilen.
1. EK: added ZeilenBem as parameter to function get_extent(). If there is any note on the 'zeilen', it is now added as (note: this is a note) after the information on 'zeilen' proper.
zweitsignatur: old shelfmarks. Add these in the manuscript details: notes as “Old shelfmark(s): [name]”.
1. EK: this is now processed and added under notes to the manuscript
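
Two of the formatting rules above, sketched with hypothetical helpers: the old shelfmark note and the zeilen remark. These are not the project's get_extent(); they only illustrate the resulting strings.

```python
def shelfmark_note(old_shelfmarks):
    """Build the note 'Old shelfmark(s): ...' from zweitsignatur names."""
    names = [name.strip() for name in old_shelfmarks if name and name.strip()]
    if not names:
        return ""
    return "Old shelfmark(s): {}".format(", ".join(names))


def zeilen_text(zeilen, zeilen_bem=None):
    """Return the zeilen information, followed by '(note: ...)' if present."""
    text = str(zeilen) if zeilen not in (None, "") else ""
    if zeilen_bem:
        text = "{} (note: {})".format(text, zeilen_bem).strip()
    return text


print(shelfmark_note(["A 12", "B 7"]))
print(zeilen_text(32, "varies"))
```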

ErwinKomen commented 2 years ago

Unclear how to process fasc, fasc_name as codicological order number in handschrift 195, since fasc_name just iterates between 0, 2 and 3.

Pragmatic solution: just add the strings fasc_name and/or faszikel_name to the codicological title. This adds element codico_name to the Manuscript JSON object

ErwinKomen commented 1 year ago

Using HUWA stuff inside Passim

I've now added the contents of tables [literatur] and [editionen] into tables in the reader app. The table Literatur (without 'e' at the end) also contains the contents of Bloomfield, Stegmueller, Schoenberger etc. This means that references to editions can now be made to the table reader.Edition

Where are we?

We are now working in reader/views.py, class EqualGoldHuwaToJson, method get_data(). Look for import_type == "manu" within that method. This class is the base class for ManuscriptHuwaToJson, with url name manuscript_huwajson, which is the download Huwa Manuscripts: json that is callable from the Manuscript Listview.

So, remember, once the JSON data has been made in-line with the HUWA data, and downloaded correctly, there is a next step...

And the next step is issue #534, where the Huwa data as added into JSON should be 'read' and processed into even better and more beautiful Manuscript, Codico, MsItem and SermonDescr objects!

Questions (for 2023)

Are the siglen now treated properly?
1. E.g. manuscript lat. 13376 has a whole series of edition id's and most of them have a siglen A<SUP>1</SUP>, but I have no idea whether this is okay or how to verify this

This manuscript has the following list of siglen as linked to editions (siglen at the end):

279: Morin, Germain, CC SL, 333-336, Corpus Christianorum Series Latina, 103 - A<SUP>1</SUP>
298: SChr, 310-324, -, 243 - A<SUP>1</SUP>
299: SChr, 310-324, -, 243 - A<SUP>1</SUP>
247: Morin, Germain, CC SL, 877-881, Corpus Christianorum Series Latina, 104 - A<SUP>1</SUP>
407: 1968, PLS, 397-400, Patrologiae Latinae supllementum, 4, Paris - A<SUP>1</SUP>
408: Verbraken, Pierre-Patrick, 1961, Rev. Bén., 13-17, Revue Bénedictine, 71 - A<SUP>1</SUP>
7457: Caillau, b12-b13, Sancti Aurelii Augustini Sermones inediti (Operum Suppl. 1-3), II - A{h2h}, vlim, maur, verbr
776: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
784: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
785: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
786: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
787: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
803: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
808: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
811: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
812: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
814: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}
295: SChr, 310-324, -, 243 - A{h1h}
825: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}, caes.
826: Goldbacher, Alois, 1911, CSEL, 380-387, Corpus Scriptorum Ecclesiasticorum Latinorum, 57, Wien - A{h1h}, caes. (p. 144-147)
7616: PL, 513-515, Patrologiae Latinae cursus completus, 102 - A{h2h}

Response (to the question above)

The siglen provide a sign for the edition (from table editionen). A siglen item links, on the one hand, to a particular manuscript (called handschrift) and on the other hand to a particular edition whose entry connects with a particular opera (i.e. a passim SermonDescr). So the siglen operate at the sermon rather than at the manuscript level

ErwinKomen commented 1 year ago

Siglen and Editions per sermon

Okay, so right now we've made available the siglen links (as well as siglen_edd) in the downloaded manuscript as a list of combinations between editionen id and the siglen text.

But actually, if we want to know the literature (including editions) for a particular sermon, we need to look at the Huwa table editionen (which is now reader.Edition in Passim) and find all entries for a particular opera number.

So: instead of taking up the siglen information into the manuscript json, it may be better to provide this idiosyncratic siglen information right into the reader.Edition system? That way, we can leave the Manuscript JSON idea 'cleaner'.

Actions

Adapted the code to download huwa_edilit.json and downloaded it with siglen and siglen_edd
Created models in the reader app for Siglen and SiglenEdd
Adapted the one-time adaptations to read siglen + siglen_edd from this huwa_edilit.json and add them into the appropriate places in tables Siglen and SiglenEdd
Adapted the code to download the HUWA manuscripts, so as to exclude the siglen/siglen_edd information (since it is now already at hand in the reader app, from the Edition table).

Implications

As far as issue #534 is concerned (importing the HUWA manuscript JSON into the Passim application), how will the correct editions be showable with this 'new' system, where the edition information is in Passim table Edition (and all linked to it)?

Every sermon (object SermonDescr) comes with an external connection to HUWA's original opera
Since a sermon is part of a Manuscript, it also has access to the external manuscript's connection to HUWA's original handschrift
The list of all editions (references) per sermon can be retrieved by Edition method get_opera_literature(opera_id, handschrift_id)
1. This makes use of table Edition, where the opera_id is used to find all the literature references for that particular sermon (irrespective of manuscript)
The siglen as well as the siglen_edd information is added for each literature reference
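
A rough sketch of the retrieval logic described above, with lists of dicts standing in for the tables Edition and Siglen. This is not the actual Edition method: the key names (opera, reference, edition, handschrift, text) are assumptions, and siglen_edd is left out for brevity.

```python
def get_opera_literature(opera_id, handschrift_id, editions, siglen_rows):
    """List all edition references for one opera, adding the siglen that
    belong to the given handschrift (assumed data layout)."""
    result = []
    for edition in editions:
        # The opera id selects the references, irrespective of manuscript
        if edition["opera"] != opera_id:
            continue
        # The handschrift id only selects which siglen are attached
        sigla = [row["text"] for row in siglen_rows
                 if row["edition"] == edition["id"]
                 and row["handschrift"] == handschrift_id]
        result.append({"edition": edition["id"],
                       "reference": edition["reference"],
                       "siglen": sigla})
    return result


editions = [{"id": 279, "opera": 143, "reference": "CC SL 103, 333-336"}]
siglen = [{"edition": 279, "handschrift": 42, "text": "A"}]
print(get_opera_literature(143, 42, editions, siglen))
```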

ErwinKomen commented 1 year ago

Okay, this then seems to finish issue #533, and we now turn back again to #534, to see if it all matches...

ErwinKomen commented 1 year ago

Well, ahum, when evaluating the findings, there is one little matter coming up: the libraries. E.g.

Laici is not recognized as such, because the Passim variant has an additional space
Bibliothèque Nationale is not recognized, because Huwa doesn't add "de France"
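
The first kind of mismatch (an extra space) could be caught by comparing normalized names, as in the sketch below. The function names are hypothetical. The second kind (a missing "de France") cannot be solved by normalization and needs an explicit HUWA-to-Passim mapping.

```python
import re
import unicodedata


def normalize_library(name):
    """Normalize a library name for comparison: Unicode form,
    collapsed whitespace, trimmed, case-insensitive."""
    name = unicodedata.normalize("NFC", name or "")
    return re.sub(r"\s+", " ", name).strip().casefold()


def match_library(huwa_name, passim_names):
    """Return the Passim name that equals the HUWA name after normalization."""
    wanted = normalize_library(huwa_name)
    for name in passim_names:
        if normalize_library(name) == wanted:
            return name
    return None


print(match_library("Laici", ["Laici "]))
```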

Didn't we have a library matching Excel? Well, we have lib_huwa_new_EK.xlsx (28/sep/2022), which contains some lines as "In Passim als...X", and lots of lines in quotation marks. We also have huwa_passim_library.json (19/jul/2022), containing sections:
1. huwaonly
2. huwapassim: which HUWA id belongs to which PASSIM library id
3. libraries: all of the libraries with at least passim ID, and where possible the applicable HUWA id

And e.g. for HUWA library 1 (the BNF), there is the corresponding passim id 4814. We actually have issue #567

And later we had a discussion to add some 'new' libraries from Huwa.

But there is one document that did not get the attention it should have, and that is the lib_huwa_new_EK.xlsx. This contains a full list of corrections by Menna for some 212 half-done Huwa libraries, translating them into Passim. I should now process them as part of issue #567...

ErwinKomen commented 1 year ago

Okay, the above is now working correctly (the library part).

Other left overs:

Remarks on a manuscript as a whole should be added, I suppose.
The date should be extracted from saeculum

Result: done

ErwinKomen commented 1 year ago

Problems

Authors are not appearing
1. There were two errors (semantic errors) in the code having to do with string/integer indices. I resolved them, and now the cycle works well.
Note too that somehow authors like Pseudo-Augustinus entered into the Author table, while these are not referenced anywhere, and they should not be, since their equivalent is already in the list of authors (Augustinus Hipponensus (pseudo)). I don't know, and cannot see, how this has come into being. I'm also not sure how to remedy this...

ErwinKomen commented 1 year ago

Add some additional fields from opera, translating them in corresponding SermonDescr ones:

opera.abk: this field doesn't have any specific treatment comments. However, it functions like a sermon title, so let's put it there in that field.
opera.bemerkungen: this field doesn't have any comments either. But this is really something that should appear in the Note field of a sermon
opera.opera_langname: this field is not processed either, but could be assigned to the subtitle field. In a couple of cases this is just the 'full' version of abk.
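
The field mapping above, written out as a small sketch. The target key names title, subtitle and note are assumptions about the sermon JSON; the HUWA field names come from the list.

```python
# HUWA opera field -> assumed key in the sermon JSON
OPERA_TO_SERMON = {
    "abk": "title",
    "opera_langname": "subtitle",
    "bemerkungen": "note",
}


def opera_fields(opera):
    """Copy the non-empty opera fields to their sermon counterparts."""
    sermon = {}
    for huwa_field, passim_field in OPERA_TO_SERMON.items():
        value = (opera.get(huwa_field) or "").strip()
        if value:
            sermon[passim_field] = value
    return sermon


print(opera_fields({"abk": "s. 1", "opera_langname": "Sermo 1", "bemerkungen": ""}))
```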

ErwinKomen commented 1 year ago

All issues above have been addressed. Perhaps this is enough...

ErwinKomen / RU-passim