nichobi / sponsorblockcast

A shell script that skips sponsored YouTube content on all local Chromecasts
GNU General Public License v3.0

YouTube API Search query not very reliable #35

Open tayl opened 1 year ago

tayl commented 1 year ago

I've been using sponsorblockcast a bit lately on a Chromecast and noticed that many videos I watch wouldn't skip sponsor segments. Thinking the videos might not have any sponsorblock data available, I'd go to my computer, watch the same video, and see that the sponsorblock segments do exist, and are skipping just fine.

After debugging a bit, I noticed that get_videoID_by_API will often set video_id to null. Since the search uses only title and artist rather than an ID, it isn't a 1:1 lookup, so this is sort of expected. It was still strange, though, because searching for the same title and artist on YouTube itself would nearly always pull back the correct video as the first result.

After playing around a bit, I changed the YouTube API query to use q=($video_artist)+($video_title) instead of q=\"$video_artist\"+intitle:\"$video_title\". It seems that the "intitle" operator is a little too strict in many cases and returns nothing. The general search with artist and title alone seems to leverage YouTube's search algorithm and, in my experience, returns the correct result more often.
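For anyone wanting to try the two query forms side by side, here is a minimal sketch. It is not the script's actual code: the function names, the YT_API_KEY variable, and the exact curl/jq invocation are assumptions for illustration. Only the two q= patterns come from the change described above.

```shell
#!/bin/sh
# Sketch only: build_strict_query, build_loose_query and search_video_id are
# hypothetical names, and YT_API_KEY is an assumed environment variable.

# Original form: quoted artist plus an intitle: match on the quoted title.
build_strict_query() {
  printf '"%s"+intitle:"%s"' "$1" "$2"
}

# Relaxed form: plain artist and title, leaving ranking to YouTube's search.
build_loose_query() {
  printf '(%s)+(%s)' "$1" "$2"
}

# Ask the YouTube Data API v3 search endpoint for the top video match.
# Prints the video ID, or nothing when there is no result (rather than "null").
search_video_id() {
  curl -fsG "https://www.googleapis.com/youtube/v3/search" \
    -d "part=snippet" -d "type=video" -d "maxResults=1" \
    --data-urlencode "q=$1" \
    --data-urlencode "key=$YT_API_KEY" |
    jq -r '.items[0].id.videoId // empty'
}
```

Usage would look something like `search_video_id "$(build_loose_query "$video_artist" "$video_title")"`. The `// empty` in the jq filter means a missing result comes back as an empty string, which is easier to test for than the literal text null.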

If you're having these issues, I'd recommend trying this change to see if it improves your outcome. I'm not sure if it's better in general. It should be possible to unit test this if someone has lots of quota to spare :) Take a random sample of video titles, and hit the API both ways. See which is more accurate.
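The comparison suggested above could be sketched roughly like this. It assumes a hand-made, tab-separated sample file with one "artist, title, expected video ID" triple per line. The names lookup_id, compare_queries and YT_API_KEY are hypothetical and not part of sponsorblockcast.

```shell
#!/bin/sh
# Sketch only: lookup_id, compare_queries and YT_API_KEY are hypothetical names.

# Look up the top search result for a query; prints a video ID or nothing.
lookup_id() {
  curl -fsG "https://www.googleapis.com/youtube/v3/search" \
    -d "part=snippet" -d "type=video" -d "maxResults=1" \
    --data-urlencode "q=$1" --data-urlencode "key=$YT_API_KEY" |
    jq -r '.items[0].id.videoId // empty'
}

# Read tab-separated "artist<TAB>title<TAB>expected_id" lines from file $1
# and count how often each query form returns the expected ID.
compare_queries() {
  total=0
  strict_hits=0
  loose_hits=0
  tab=$(printf '\t')
  while IFS="$tab" read -r artist title expected; do
    # Skip blank or incomplete lines.
    [ -n "$expected" ] || continue
    total=$((total + 1))
    strict=$(lookup_id "\"$artist\"+intitle:\"$title\"")
    loose=$(lookup_id "($artist)+($title)")
    if [ "$strict" = "$expected" ]; then
      strict_hits=$((strict_hits + 1))
    fi
    if [ "$loose" = "$expected" ]; then
      loose_hits=$((loose_hits + 1))
    fi
  done < "$1"
  echo "samples=$total strict=$strict_hits loose=$loose_hits"
}
```

Running `compare_queries samples.tsv` prints one summary line with the hit count for each form. Every sample costs two search requests, which is where the quota goes.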

tayl commented 1 year ago

A week later and I'm indifferent about this change; it helps in some cases and doesn't in others. Unfortunately I think there's no great way to achieve what we need using the YouTube search. It's too dependent on the magic behind the algorithm. The results vary from search to search; it's inconsistent. A query that doesn't return accurate results (or any results) at runtime might work a few minutes later when re-run, or when run manually.
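Since a failed lookup sometimes succeeds when repeated a little later, one possible workaround is to retry a few times with a pause in between. This is only a sketch of that idea and not something the script does; retry_lookup is a hypothetical helper, and it treats both an empty string and the literal text null as "no result".

```shell
#!/bin/sh
# Sketch only: retry_lookup is a hypothetical helper, not part of the script.

# Run a lookup command up to $1 times, sleeping $2 seconds between attempts.
# Prints the first output that is neither empty nor "null" and returns 0;
# returns 1 if every attempt comes back without a usable result.
retry_lookup() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    result=$("$@") || result=""
    if [ -n "$result" ] && [ "$result" != "null" ]; then
      printf '%s\n' "$result"
      return 0
    fi
    # Only wait if another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `video_id=$(retry_lookup 3 60 some_lookup_command "$query")` would try up to three times, a minute apart, where some_lookup_command stands for whatever function performs the search. Each retry is another API request, so it trades quota for a better chance of a hit, and it does nothing for cases where the search consistently returns the wrong video.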

The real solution, obviously, is figuring out why the Chromecast traffic doesn't include a videoId when coming from certain products. It might be worth reporting as a bug to Google, or maybe bringing up on the go-chromecast issue tracker.