derenrich / borked-bot

wikidata bot doing lots of disparate things
GNU General Public License v3.0

Add published in qualifier #1

Open lectrician1 opened 3 years ago

lectrician1 commented 3 years ago

I document discographies on Wikidata and was wondering if your bot could add "published in" (P1433) as a qualifier when adding data about YouTube videos.

It should do this because digital music tracks on YouTube Music are often republished as separate YouTube videos (YouTube Music songs) across many releases. Since there should be only 1 item per track, all of those videos (songs) end up under 1 track item.

Example. This track was published in 2 singles and an EP. Because YouTube makes a new video for every track on a release (even when it is the same track), the track item has multiple videos.

I added the "published in" property to help distinguish which videos were part of which releases and was wondering if your bot could do this.

It could do this by using the YouTube API to search for the playlists (YouTube Music releases) that contain this video ID and then checking whether Wikidata has any items with that playlist ID. I always add the YouTube playlist ID to releases, so it should be able to find it. Then, it should add that item as a qualifier on the video statement with "published in" as its property, just as I did.

derenrich commented 3 years ago

This seems reasonable assuming I can do this all through the YouTube APIs (I'm unwilling to scrape). Looking at the API I'm not immediately seeing what params I would pass to do what you're suggesting: https://developers.google.com/youtube/v3/docs/search/list

There are some issues, though: a video could be in lots of playlists, and I can't check Wikidata for all of them (and who knows what to do if there are multiple matches).

How about this: if the item appears to be music-related (based on P31), I look for related playlists and check the first few of those?

lectrician1 commented 3 years ago

> This seems reasonable assuming I can do this all through the YouTube APIs (I'm unwilling to scrape). Looking at the API I'm not immediately seeing what params I would pass to do what you're suggesting: https://developers.google.com/youtube/v3/docs/search/list

I guessed you could do it through the API, but it seems not!

What you could do is check whether the albums the track is published in (P1433) have a YouTube playlist ID (P4300), and then check whether that playlist contains the video ID from the original item. In other words: take each "published in" (P1433) statement on the track item, check whether the linked item has a YouTube playlist ID, get the PlaylistItems for each playlist, and see whether any of the playlists contains a video with the same ID.
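The matching step described here could be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: `match_videos_to_releases`, its parameters, and the sample IDs are all hypothetical, and the playlist lookup is passed in as a function (in practice it would presumably wrap the YouTube Data API `playlistItems.list` endpoint, which is not called here).

```python
from typing import Callable, Dict, Iterable, List, Set


def match_videos_to_releases(
    video_ids: Iterable[str],
    release_playlists: Dict[str, List[str]],
    fetch_playlist_video_ids: Callable[[str], Set[str]],
) -> Dict[str, List[str]]:
    """Map each video ID to the release items whose playlists contain it.

    video_ids: YouTube video IDs found on the track item.
    release_playlists: release item ID (from P1433) -> its playlist IDs (P4300).
    fetch_playlist_video_ids: hypothetical helper returning the set of video
        IDs in a playlist (assumed to be backed by the PlaylistItems API).
    """
    matches: Dict[str, List[str]] = {vid: [] for vid in video_ids}
    for release_qid, playlist_ids in release_playlists.items():
        for playlist_id in playlist_ids:
            contents = fetch_playlist_video_ids(playlist_id)
            for vid in matches:
                if vid in contents and release_qid not in matches[vid]:
                    matches[vid].append(release_qid)
    return matches


if __name__ == "__main__":
    # Made-up data standing in for Wikidata statements and API responses.
    fake_playlists = {"PL_single": {"vidA", "vidX"}, "PL_ep": {"vidB"}}
    result = match_videos_to_releases(
        ["vidA", "vidB", "vidC"],
        {"Q1": ["PL_single"], "Q2": ["PL_ep"]},
        lambda playlist_id: fake_playlists[playlist_id],
    )
    print(result)
```

A video that matches exactly one release would get that release as its "published in" qualifier; videos with zero or several matches are left for the caller to decide on.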

derenrich commented 3 years ago

> What you could do is check if there's a YouTube playlist ID (P4300) in the albums the track is published in (P1433) and then check if that playlist has the video ID of the original item.

Yeah, something like this crossed my mind. I think it would be a totally different bot action, run separately, though. But then I'm unsure what the value of these qualifiers is if the item is already tagged as published in that album?

lectrician1 commented 3 years ago

Well, IMO the qualifier should just be added as an indicator, so that editors know which videos are part of which albums. You're right that the item in the qualifier is basically the same item as the release in the "published in" statement above. I can see how this might be unnecessary then. Maybe just do it for tracks that have more than 1 YouTube video ID and "published in" statement?
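That narrower condition could be expressed as a simple filter. The sketch below is only an illustration under a few assumptions: `needs_published_in_qualifier` is a hypothetical name, the `claims` dict is a simplified stand-in for an item's statements (property ID mapped to a list of values), P1651 is assumed to be the YouTube video ID property (it is not named in this thread), and "more than 1" is read as applying to both properties.

```python
from typing import Dict, List


def needs_published_in_qualifier(claims: Dict[str, List[str]]) -> bool:
    """Return True only for tracks where the qualifier would disambiguate.

    claims: simplified mapping of property ID -> statement values.
    Assumes P1651 = YouTube video ID and P1433 = published in.
    """
    video_ids = claims.get("P1651", [])
    releases = claims.get("P1433", [])
    # With a single video or a single release there is nothing to distinguish.
    return len(video_ids) > 1 and len(releases) > 1
```

For example, a track with two video IDs and two releases would pass the filter, while a track with one video ID would be skipped regardless of how many releases it appears in.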

derenrich commented 3 years ago

Yeah I'll consider this