DDMAL / linkedmusic-datalake

To create mapping strategies for various music databases into our data lake
https://virtuoso.staging.simssa.ca

The integrity of the downloaded MusicBrainz database needs to be inspected. #185

Closed candlecao closed 1 month ago

candlecao commented 2 months ago

...to ensure that all data is complete, accurate, and consistent. Compared with https://musicbrainz.org/doc/MusicBrainz_Database: (1) Some entities do not appear to be included in our downloaded CSV files, such as Mediums, CD Stubs... (2) Some properties (listed below) are not included for the corresponding entities:

Areas: aliases, type...
Artists: sort name, aliases, begin and end dates...
Events: aliases, begin and end dates...

For example, without the date information, the existing "time" property is meaningless.

Genres: aliases...
Instruments: aliases, disambiguation comment
Labels: aliases...
Places: aliases, code...
...
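
A check like the one described above could be partly automated. The following is only a minimal sketch: the `EXPECTED` mapping and its column names (`sort_name`, `begin_date`, and so on) are hypothetical placeholders, not the actual headers of our downloaded CSV files, and would need to be replaced with the real property names before use.

```python
import csv
import io

# Hypothetical expectations: entity -> column names we would hope to find.
# These names are illustrative only, not the real CSV headers.
EXPECTED = {
    "area": {"name", "aliases", "type"},
    "artist": {"name", "sort_name", "aliases", "begin_date", "end_date"},
}


def missing_columns(csv_file, expected):
    """Return the expected column names that are absent from the CSV header row."""
    # Read only the first row; an empty file yields an empty header.
    header = next(csv.reader(csv_file), [])
    present = {h.strip().lower() for h in header}
    return sorted(expected - present)


# Usage with an in-memory sample standing in for a downloaded file.
sample = io.StringIO("name,type\nMontreal,City\n")
print(missing_columns(sample, EXPECTED["area"]))  # ['aliases']
```

For a real file, the same function can be called with `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` sample.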

candlecao commented 2 months ago

@Yueqiao12Zhang Hi, Yueqiao, is there a reason why certain properties or entities were omitted? Or is the exclusion documented anywhere?

candlecao commented 1 month ago

I have inspected the whole download. It turns out that some attributes are missing, compared with those shown on the corresponding webpages. Maybe we can reach out to the MusicBrainz team for clarification.