Closed SirLich closed 2 years ago
I also think that's a great idea. It would be even better if we could create our own way of storing dislikes (in a database) to give all users of the addon the ability to dislike and see the dislikes of a video. This could be similar to SponsorBlock where users are able to submit video segments to a database.
Already being done.
Collecting dislikes from users of the extension becomes viable as soon as Google approves the application (so that users can log in to the extension).
Do you have any insight on how many videos you've scraped? And how many you plan to scrape by the 13th? How are you even going about scraping it?
I assume there are billions of videos? The YouTube search API is literally broken (i.e. returns skewed results, partial results, and broken pagination - even for very specific searches). I imagine their rate limits are also a big issue for this?
500k fresh videos scraped (i.e. our dislike count is no more than a couple of days old), and 1.5 billion historic results (i.e. some counts are from 1 year ago).
YouTube has over 30 billion videos. But most of these videos never get over 1k views.
So we need at least a couple billion videos scraped, and hopefully with fresh dislike rates.
Huh, you've been scraping YouTube for over a year? But nice, 1.5 billion sounds pretty healthy.
I assume prior to this, you were doing something else with all that data? May I ask what exactly? It would be great to get a way to accurately search YouTube again; it didn't suck many years ago. Sometimes I want to find niche things and I have to get out my own search tool and really fiddle hard with publishedBefore/publishedAfter/order=date. It really isn't an enjoyable experience, so I rarely actually look for things on YouTube.
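For anyone curious what that fiddling looks like, here is a rough sketch of slicing a search into date windows. It assumes the YouTube Data API v3 `search.list` endpoint and its `publishedAfter`, `publishedBefore`, and `order` parameters; the helper names, the query string, the dates, and the `YOUR_API_KEY` placeholder are made up for illustration. It only builds the request URLs and does not send anything, so quota and pagination handling are left out.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumed endpoint: YouTube Data API v3 search.list.
SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def rfc3339(dt):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_windows(start, end, step):
    """Yield consecutive (after, before) pairs covering [start, end)."""
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor, upper
        cursor = upper


def build_search_url(query, after, before, api_key="YOUR_API_KEY"):
    """Build one search URL restricted to a single date window (hypothetical helper)."""
    params = {
        "part": "snippet",
        "type": "video",
        "q": query,
        "order": "date",
        "publishedAfter": rfc3339(after),
        "publishedBefore": rfc3339(before),
        "maxResults": 50,
        "key": api_key,  # placeholder, not a real key
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# Illustrative dates only: three one-day windows.
start = datetime(2021, 11, 1, tzinfo=timezone.utc)
end = datetime(2021, 11, 4, tzinfo=timezone.utc)
urls = [
    build_search_url("niche topic", after, before)
    for after, before in date_windows(start, end, timedelta(days=1))
]
```

Narrow windows keep each query small, which is one way people try to work around partial results, though it multiplies the number of requests.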
> Huh, you've been scraping YouTube for over a year?
Not me, but that DB is publicly available, so we're incorporating it.
> I assume prior to this, you were doing something else with all that data?
I think that project was scraping annotations and subtitles, for a search-by-subtitle project.
Makes sense, thanks for the info.
Given the API will be closed by the middle of December, what do you think about beginning to crawl YouTube to build up a database of historical dislikes?