DDMAL / linkedmusic-datalake

To create mapping strategies for various music databases into our data lake
https://virtuoso.staging.simssa.ca

There is no archive for the manually reconciled entries for MusicBrainz #194

Closed candlecao closed 1 month ago

candlecao commented 2 months ago

This issue follows #186. In particular, the "sub-types" of the entities are mostly reconciled manually. For example, the types of area:

| type | type_uri |
| --- | --- |
| city | https://www.wikidata.org/wiki/Q515 |
| country | https://www.wikidata.org/wiki/Q6256 |
| county | https://www.wikidata.org/wiki/Q28575 |
| district | https://www.wikidata.org/wiki/Q149621 |
| Indigenous territory / reserve | |
| island | https://www.wikidata.org/wiki/Q23442 |
| mahakuma | https://www.wikidata.org/wiki/Q15637757 |
| military base | https://www.wikidata.org/wiki/Q245016 |
| municipality | https://www.wikidata.org/wiki/Q15284 |

candlecao commented 2 months ago

The archive should be put in linkedmusic-datalake/musicbrainz/data/reconciledEntries/archiveForManuallyReconciledEntries.xlsx

dchiller commented 1 month ago

Does OpenRefine not have an output for this?

candlecao commented 1 month ago

> Does OpenRefine not have an output for this?

I believe OpenRefine can output this. But what I'm suggesting is that we store those manually reconciled records. This way, in the future, when updating the database, we don't need to manually reconcile them again.
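As a rough illustration of the reuse I have in mind, here is a minimal sketch. It assumes the archive has been loaded into a plain `type -> type_uri` dictionary; the function name `apply_archive` and the sample records are hypothetical, not existing code in this repository.

```python
def apply_archive(records, archive):
    """Fill in type_uri for each record from the archived manual reconciliation.

    Returns the annotated records plus a sorted list of types that still
    need manual reconciliation (missing from the archive or archived empty).
    """
    reconciled = []
    unmatched = set()
    for record in records:
        uri = archive.get(record["type"], "")
        if not uri:
            unmatched.add(record["type"])
        reconciled.append({**record, "type_uri": uri})
    return reconciled, sorted(unmatched)


# Excerpt of the manually reconciled area types listed in this issue.
ARCHIVE = {
    "city": "https://www.wikidata.org/wiki/Q515",
    "country": "https://www.wikidata.org/wiki/Q6256",
    "island": "https://www.wikidata.org/wiki/Q23442",
    "Indigenous territory / reserve": "",  # no Wikidata match recorded yet
}

# Hypothetical rows from an updated dump; "subdivision" is a made-up new type.
updated_records = [
    {"id": "area-1", "type": "city"},
    {"id": "area-2", "type": "island"},
    {"id": "area-3", "type": "Indigenous territory / reserve"},
    {"id": "area-4", "type": "subdivision"},
]

reconciled, unmatched = apply_archive(updated_records, ARCHIVE)
for row in reconciled:
    print(row["id"], row["type"], row["type_uri"] or "(needs manual reconciliation)")
print("Still to reconcile:", unmatched)
```

With something like this, only the types reported in `unmatched` would need to go through manual reconciliation again after an update.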

dchiller commented 1 month ago

I don't think I totally understand the use case. Is the point that we have reconciled the various entities listed above (city, district, etc.) in a particular dataset with Wikidata, we have done this reconciliation manually, and we want to be able to repeat that reconciliation with updated data/new datasets, etc?

How do we currently support reconciliation of updated data?

candlecao commented 1 month ago

Please see this archive, which serves as the "particular dataset" you mentioned: https://github.com/DDMAL/linkedmusic-datalake/tree/main/ArchiveForReconciledEntries

> Is the point that we have reconciled the various entities listed above (city, district, etc.) in a particular dataset with Wikidata, we have done this reconciliation manually,

Yes.

> and we want to be able to repeat that reconciliation with updated data/new datasets

I don't quite understand this sentence. In any case, we keep this archive so that next time we won't need to reconcile manually again.

> How do we currently support reconciliation of updated data?

This is still to be solved and is under further discussion: https://github.com/DDMAL/linkedmusic-datalake/issues/144