Closed: elespdn closed this issue 1 year ago
Images are on the NAS SCANLETT (> Gustave Roud > E_scans > scansComplets)
An alternative would be the procedure followed to import the mapping: https://github.com/LaDHUL/oeuvres-roud/blob/master/mapping/import-mapping.rest
Decided: create XML files as described here https://docs.dasch.swiss/2022.06.02/DSP-TOOLS/dsp-tools-xmlupload/?h=bulk
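For orientation, here is a rough sketch of how one such XML resource could be generated. The element names follow the DSP-TOOLS xmlupload format as I understand it, and the shortcode 0112 is the one in the project's resource IRIs. Everything else is an assumption made for illustration: the ontology name "roud", the resource type ":Page", the property names ":hasSeqnum" and ":pageOf", and the function name. The sketch also leaves out the permissions definitions that a real upload file declares, so it is not meant to pass validation as is.

```python
import xml.etree.ElementTree as ET

NS = "https://dasch.swiss/schema"  # namespace reported in the validation messages


def page_xml(res_id, label, tif_path, seqnum, pub_iri):
    """Build a one-resource XML document for an image page (illustrative only)."""
    ET.register_namespace("", NS)
    # "roud" is a hypothetical ontology name
    root = ET.Element(f"{{{NS}}}knora",
                      {"shortcode": "0112", "default-ontology": "roud"})
    # ":Page" is a hypothetical resource type
    res = ET.SubElement(root, f"{{{NS}}}resource",
                        {"id": res_id, "label": label, "restype": ":Page"})
    ET.SubElement(res, f"{{{NS}}}bitstream").text = tif_path
    # ":hasSeqnum" is a hypothetical property; int() rejects values such as
    # '3a' before the file ever reaches schema validation
    seq = ET.SubElement(res, f"{{{NS}}}integer-prop", {"name": ":hasSeqnum"})
    ET.SubElement(seq, f"{{{NS}}}integer").text = str(int(seqnum))
    # ":pageOf" is a hypothetical link property to the existing publication
    link = ET.SubElement(res, f"{{{NS}}}resptr-prop", {"name": ":pageOf"})
    ET.SubElement(link, f"{{{NS}}}resptr").text = pub_iri
    return ET.tostring(root, encoding="unicode")
```

Converting the sequence number with int() early means a non-numeric value fails in the generating script, with the offending file name at hand, rather than at upload time.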
Fixed in https://github.com/LaDHUL/oeuvres-roud/pull/64/commits/aafd82116fdc13e31dbe9f8e84151c33b00f50e1
Fixed
Other errors, while importing articles.
These errors were due to a bug in parsing (split on '.'), now fixed.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': 'Beau-Site' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': 'Deux fragments d'un hommage à C' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': 'Fragment d une réponse à C' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': 'La Fondation C' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': 'Les Élégies romaines de Goethe traduites par J' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': 'Sur le Diégo de C' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': 'Joie' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': 'Expositions' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
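Several of the values in the messages above look like titles cut at their first '.' (for example 'La Fondation C'). A minimal sketch of that failure mode, assuming the page sequence number is read from the end of the file name. The file name and both function names are hypothetical and not taken from the project script.

```python
import os


def seqnum_naive(filename):
    # Bug: keeps only the text before the FIRST '.', so a title such as
    # "La Fondation C. F. Ramuz" truncates the whole file name
    stem = filename.split(".")[0]
    return stem.rsplit("_", 1)[-1]


def seqnum_safe(filename):
    # os.path.splitext removes only the final extension (".tif")
    stem, _ext = os.path.splitext(filename)
    return stem.rsplit("_", 1)[-1]


# Hypothetical file name following the naming pattern seen in this thread
name = "pub_Roud Gustave_La Fondation C. F. Ramuz_Revue_1950-01_p3_1.tif"
print(seqnum_naive(name))  # a piece of the title instead of a number
print(seqnum_safe(name))   # the trailing sequence number
```

With the naive version, a fragment of the title ends up where an integer is expected, which is the kind of value the validator rejects.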
Corrected manually: it was a copy of 3.
The input data file cannot be uploaded due to the following validation error(s):
Line 14: Element '{https://dasch.swiss/schema}integer': '3a' is not a valid value of the atomic type 'xs:integer'.
ERROR The input data file did not pass validation.
Fixed: it was an error in the link to the existing publication.
The input data file is syntactically correct and passed validation.
Uploaded file /mnt/scanlettMounted/GustaveRoud/E_Scan/Scans_complets/Publications/pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01/pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p2_1.tif
ERROR while trying to create resource 'pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p2_1' (pub_RoudGustave_Dunecertainepureté_La_Guilde_du_Livre_1940-01_p2_1.tif).
The mapping of internal IDs to IRIs was written to id2iri_importpub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p2_1_mapping_20220818-225841.json
Could not upload the following resources: ['pub_RoudGustave_Dunecertainepureté_La_Guilde_du_Livre_1940-01_p2_1.tif']
The input data file is syntactically correct and passed validation.
Uploaded file /mnt/scanlettMounted/GustaveRoud/E_Scan/Scans_complets/Publications/pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01/pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p3_2.tif
ERROR while trying to create resource 'pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p3_2' (pub_RoudGustave_Dunecertainepureté_La_Guilde_du_Livre_194001_p3_2.tif).
The mapping of internal IDs to IRIs was written to id2iri_importpub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p3_2_mapping_20220819-232649.json
Could not upload the following resources: ['pub_RoudGustave_Dunecertainepureté_La_Guilde_du_Livre_194001_p3_2.tif']
The input data file is syntactically correct and passed validation.
Uploaded file /mnt/scanlettMounted/GustaveRoud/E_Scan/Scans_complets/Publications/pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01/pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p4_3.tif
ERROR while trying to create resource 'pub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p4_3' (pub_RoudGustave_Dunecertainepureté_La_Guilde_du_Livre_194001_p4_3.tif).
The mapping of internal IDs to IRIs was written to id2iri_importpub_Roud Gustave_D'une certaine pureté_La_Guilde_du_Livre_1940-01_p4_3_mapping_20220819-233649.json
Could not upload the following resources: ['pub_RoudGustave_Dunecertainepureté_La_Guilde_du_Livre_194001_p4_3.tif']
More errors while importing articles
The script that creates the XML checks the correspondence between the label of the resource already in the db and the path of the file to be imported, which should include the label of the resource it belongs to. For example, from the path
pub_Roud Gustave_Présences à Port-des-Prés_La_Guilde_du_Livre_1943-09/pub_Roud Gustave_Présences à Port-des-Prés_La_Guilde_du_Livre_1943-09_p145_2.tif
it extracts the label
pub_Roud Gustave_Présences à Port-des-Prés_La_Guilde_du_Livre_1943-09
and looks for a resource in the db with the same label, here http://rdfh.ch/0112/c0gp0elQTdmw9c_cE26eJw
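The lookup described above can be sketched as follows. This is not the project script: it assumes the label is recovered by dropping the trailing page suffix (such as _p145_2) from the file name, and that the db export is available as a plain label-to-IRI dictionary. The function names and the dictionary are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Trailing "_p<page>_<sequence>" part of a scan file name
PAGE_SUFFIX = re.compile(r"_p\d+_\d+$")


def label_from_path(path):
    """Return the publication label encoded in a scan file path."""
    stem = PurePosixPath(path).stem  # file name without the final ".tif"
    return PAGE_SUFFIX.sub("", stem)


def find_resource(path, label_to_iri):
    """Return the IRI whose label matches the path, or 'MISSING LINK'."""
    return label_to_iri.get(label_from_path(path), "MISSING LINK")


# Hypothetical one-entry export of the db, using the example from this thread
label_to_iri = {
    "pub_Roud Gustave_Présences à Port-des-Prés_La_Guilde_du_Livre_1943-09":
        "http://rdfh.ch/0112/c0gp0elQTdmw9c_cE26eJw",
}
path = ("pub_Roud Gustave_Présences à Port-des-Prés_La_Guilde_du_Livre_1943-09/"
        "pub_Roud Gustave_Présences à Port-des-Prés_La_Guilde_du_Livre_1943-09_p145_2.tif")
print(find_resource(path, label_to_iri))
```

Because the comparison is an exact string match, any single-character difference between file name and label produces a missing link.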
The names of the files to be imported were created manually, by copy-pasting the label from the db while digitizing the document.
There are all sorts of cases in which no correspondence is found.
Typo -> fixed in https://github.com/LaDHUL/oeuvres-roud/pull/64/commits/e5371c0f2554251e34bc2a444bb77cd7e922348e
pub_Roud Gustave_Sur les Châteaux en enfance de Catherine Colomb_La_Guilde_du_Livre_1945-08_p129_2
pub_Roud Gustave_Sur les Châteaux en enfance de Catherine Collomb_La_Guilde_du_Livre_1945-08
MISSING LINK
Label and file names (apostrophes, trailing spaces, leading and trailing quotation marks, double white spaces, square brackets) -> fixed in b691ce6, 24c5fef, 0fe37fc, d31bafe, 1cb9cb0
pub_Roud Gustave_D'un cahier d'instants_1312 Organe de l’Association romande du personnel de la librairie et de l’édition_1947-12_p15_2
pub_Roud Gustave_D'un cahier d'instants_13 12_1947-12
MISSING LINK
pub_Roud Gustave_Hommage de Gustave Roud_13 12_1968-12_p7_2
pub_Roud Gustave_Hommage de Gustave Roud_13 12_1968-12
MISSING LINK
pub_Roud Gustave_[Ouvre les yeux ferme les yeux]_La_Guilde_du_Livre_1950-12_p264_1
pub_Roud Gustave_[Ouvre les yeux ferme les yeux]_La_Guilde_du_Livre_1950-12
MISSING LINK
pub_Roud Gustave_Le Secret des Compagnons, par Henri Pourrat _Suisse_romande_1938-02_p253_1
pub_Roud Gustave_Le Secret des Compagnons, par Henri Pourrat _Suisse_romande_1938-02
MISSING LINK
pub_Hölderlin Friedrich_Grèce Âges de la vie_Lettres françaises_1967-05_p13_5
pub_Hölderlin Friedrich_Grèce Âges de la vie_Lettres françaises_1967-05
MISSING LINK
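The kinds of differences listed for this group (apostrophes, trailing spaces, surrounding quotation marks, double white spaces, square brackets) can be absorbed by normalizing both the label and the file name before comparing them. The sketch below is one way to do that, not the fix applied in the commits above. The function name and the exact set of rules are assumptions.

```python
import re


def normalize_label(label):
    """Reduce a label or file-name stem to a comparable form (illustrative)."""
    label = label.replace("\u2019", "'")  # typographic apostrophe -> straight
    # Drop surrounding whitespace, then surrounding quotation marks
    label = label.strip().strip('"\u201c\u201d\u00ab\u00bb')
    label = label.replace("[", "").replace("]", "")  # square brackets
    label = re.sub(r"\s+", " ", label)  # collapse runs of whitespace
    label = re.sub(r" ?_ ?", "_", label)  # spaces next to '_' separators
    return label.strip()


print(normalize_label("pub_Roud Gustave_Le Secret des Compagnons, "
                      "par Henri Pourrat _Suisse_romande_1938-02"))
```

Applying the same function to both sides keeps the comparison symmetric. It does not help with real differences such as a different journal name, which still have to be corrected by hand.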
Book sections instead of articles, so no match found in the csv (export of db only for articles) -> fixed in https://github.com/LaDHUL/oeuvres-roud/pull/64/commits/9e2e230a8a6d0764af5c34f18955d775d76aef1d
pub_Roud Gustave_Mémoire_Inno-Reflets_1967_p9_2
pub_Roud Gustave_Mémoire_Inno-Reflets_1967
MISSING LINK
pub_Roud Gustave_Appel d'hiver_Poésie 1, La poésie française de Suisse_1973-05_p105_4
pub_Roud Gustave_Appel d'hiver_Poésie 1, La poésie française de Suisse_1973-05
MISSING LINK
? No match found even though there is a resource with the same label ('find' in file works) -> fixed manually https://github.com/LaDHUL/oeuvres-roud/pull/64/commits/5db1ed2b7cfe5ef41a4d8579ef644f281811b6fd
pub_Roud Gustave_Appel d'hiver_Poésie 1, La poésie française de Suisse_1973-05_p105_4
pub_Roud Gustave_Appel d'hiver_Poésie 1, La poésie française de Suisse_1973-05
MISSING LINK
pub_Huttinger Edouard_La Peinture hollandaise_La_Guilde_du_Livre_1957-1_p34_3
pub_Huttinger Edouard_La Peinture hollandaise_La_Guilde_du_Livre_1957-1
MISSING LINK
pub_Roud Gustave_Le Secret des Compagnons, par Henri Pourrat _Suisse_romande_1938-02_p253_1
pub_Roud Gustave_Le Secret des Compagnons, par Henri Pourrat _Suisse_romande_1938-02
MISSING LINK
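When two labels look identical but do not compare equal, printing the first differing code point can show why. The sketch below is a generic diagnostic, not something used in this thread. The cause of these particular mismatches is not recorded here; differing Unicode normalization of accented letters is only one candidate, and the second helper tests for it. Both function names are hypothetical.

```python
import unicodedata


def first_difference(a, b):
    """Return (index, code point in a, code point in b) at the first mismatch,
    or None if the two strings are identical."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i, f"U+{ord(ca):04X}", f"U+{ord(cb):04X}"
    if len(a) != len(b):
        # One string is a prefix of the other
        return min(len(a), len(b)), None, None
    return None


def same_after_nfc(a, b):
    """True if the strings are equal once both are normalized to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "Po\u00e9sie"     # 'é' as a single code point
decomposed = "Poe\u0301sie"  # 'e' followed by a combining acute accent
print(composed == decomposed)
print(first_difference(composed, decomposed))
print(same_after_nfc(composed, decomposed))
```

If same_after_nfc returns True for a failing pair, normalizing both labels before the lookup would avoid manual fixes for that case.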
Bulk import documentation: https://docs.dasch.swiss/2022.06.02/DSP-API/03-apis/api-v1/adding-resources/?h=bulk+import#bulk-import
The procedure described in the up-to-date doc is the same one that we followed before.
Images were imported from XML files at https://github.com/LaDHUL/oeuvres-roud/tree/master/bulkimport/OUTPUT_xml/import_images.
The XML files contain all the information needed to create the resources:
To import the XML, there are various possibilities from what I see: