Closed Loer0022 closed 1 year ago
Update for some issues found in tbl_biblio tbl_biblio update MAL-Ceramic revised.txt
Update to create and correct some connections made in tbl_datasets tbl_datasets update MAL-Ceramic.txt
Update for some issues found in tbl_biblio:
Update for some issues found in tbl_datasets:
Currently the effort has been made to organize and provide a consistent and correct output for the MAL dataset. All MAL bibliography currently contained in SEAD has now (unless specific connections between biblio and datasets were uncertain/unclear in the metadata) been connected to their respective dataset. A significant number of bibliographical records has however not been ingested into the dataset and it does not appear to be any pattern to why these haven’t been included in prior ingestion of data (see separate issue #82 for ingestion of the missing data).
The Ceramic dataset has also been made to reviewed in regard to tbl_biblio to ensure that the bibliography is present and its connections to tbl_dataset is correct. Some duplicate records were found in tbl_biblio where one source had four unique iterations (biblio_id 6555, 6651, 6672 and 6544), the record 6555 should be used in the database because it is the only completely correct record. The other records do not have any connections, use SQL to double check, to any tables and can safely be removed from the database.
All ceramics connections made between tbl_biblio and tbl_datasets are correct, however, there are many records that currently lack connections that are clearly defined in the metadata (readily provided by Mattias). I have added bibliography to a couple datasets but did not have time to go through the entire list. As mentioned, Mattias has the needed metadata to make these corrections and all that is needed to be done is use a query to retrieve data from tbl_datasets where biblio_id IS NULL and belongs to master_set_id 3 to crosscheck and correct it with the metadata.
The Dendro dataset has also been assured where some faults were found relating to an issue reading keys when importing the data. This resulted in that MAL bibliography was shown for VDL dataset, this was detected first when using the query below to see which biblio_id was connected where (this resulted in an inverted relationship where no MAL biblio_id should be connected anywhere whereas the Dendro biblio_id should, a false positive). Some random sites were controlled using SuperSEAD to get a quick overview of the dataset, MAL bibliography was detected here and first believed to be a minor mistake when creating connections. But on further inspection in tbl_datasets controlling both methods and dataset_masters_id it was concluded that a critical error had occurred which had mislabelled a large section of SEAD bibliography. This issue was however easily fixed, see issue #79