talmobi / yt-search


Query search against a channel #82

Open souvikinator opened 1 month ago

souvikinator commented 1 month ago

First of all thank you for such an amazing package.

This is kind of a feature request: searching videos / running a query against a specific channel.

Example:

const videos = await yts( { channelId: 'channel_id', query: "dummy query" } );

Expected output: only videos from the provided channel that match the query.

I would love to discuss this further. What are the possible complexities?

Thank you

talmobi commented 1 month ago

@souvikinator it's possible but there are a few difficulties in this case -- here are my thoughts:

1) it would require more than one request -- take for example https://www.youtube.com/@PewDiePie/videos -- the initial page load data we get includes a ytInitialData object, but it only has about 30 videos in it -- and each new request probably returns only about 30 more -- this is clunky if a channel has 100+ videos

2) currently this library does not use a headless browser crawler (like Puppeteer) because that adds a lot of overhead and complexity (so it can't get data that is loaded dynamically, e.g. by scrolling)

3) currently this library is designed to present data as a normal user would see it -- it sees YouTube the same way a normal user browsing YouTube sees it -- this is intentional and very simple -- it's not a very good deep crawler, just as it would be difficult for a normal user to go through hundreds of videos by manually browsing YouTube

That being said -- it is possible -- but preferably in a way that would not require a headless browser like Puppeteer -- using Puppeteer we could in theory just go to "https://www.youtube.com/@PewDiePie/videos" and scroll down with the Puppeteer-driven browser until all the videos are loaded on the page -- this already feels like a clunky approach -- it should however be possible to tap into YouTube's client-side code and use the same API to make multiple requests -- however this would be a slow process as it would need to make many requests to get all videos
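A minimal sketch of the two pieces described above: pulling the `ytInitialData` object out of the initial HTML, and looping over follow-up requests until there is nothing more to load. This is not how the library does it today. The regular expression assumes the page embeds the data as `var ytInitialData = {...};` inside a script tag, which YouTube can change at any time, and `fetchPage` is a hypothetical callback you would have to implement yourself against whatever endpoint the client-side code uses.

```javascript
// Pull the ytInitialData JSON blob out of a raw HTML page.
// Assumption: the page embeds it as `var ytInitialData = {...};</script>`.
// Returns the parsed object, or null if the pattern is not found.
function extractInitialData(html) {
  const match = html.match(/var ytInitialData\s*=\s*(\{.*?\});\s*<\/script>/s);
  return match ? JSON.parse(match[1]) : null;
}

// Collect videos page by page until there is no continuation token
// or the safety limit `maxPages` is reached.
// `fetchPage` is a hypothetical callback: given a token (null for the
// first page) it resolves to { videos: [...], continuation: string | null }.
async function collectAllVideos(fetchPage, maxPages = 50) {
  const all = [];
  let token = null;
  for (let page = 0; page < maxPages; page++) {
    const { videos, continuation } = await fetchPage(token);
    all.push(...videos);
    if (!continuation) break; // no more pages
    token = continuation;
  }
  return all;
}
```

The `maxPages` cap is there because a channel with thousands of videos would otherwise trigger a long chain of sequential requests, which is the slowness mentioned above.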

4) YouTube's search API that is available on normal YouTube actually has many filters -- maybe with those this would be achievable in a single request, and we could then add a method to this library that uses such a pre-filtered request

5) for your use case it looks like it would be best to use YouTube's official Data API -- I forget how much it costs but it isn't free
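For point 5, here is a rough sketch of what a channel-scoped query could look like against the official YouTube Data API v3 `search.list` endpoint, which accepts `channelId` and `q` parameters. The helper name `buildChannelSearchUrl` is made up for illustration, and `apiKey` is assumed to be a key from your own Google Cloud project; quota and pricing are not covered here.

```javascript
// Build a YouTube Data API v3 `search.list` URL restricted to one channel.
// Assumption: `apiKey` is a valid API key from a Google Cloud project.
// `buildChannelSearchUrl` is a hypothetical helper, not part of yt-search.
function buildChannelSearchUrl({ apiKey, channelId, query, maxResults = 25 }) {
  const params = new URLSearchParams({
    part: 'snippet',   // include title, description, thumbnails
    type: 'video',     // skip channels and playlists
    channelId,         // only results from this channel
    q: query,
    maxResults: String(maxResults), // the API allows 0 to 50 per page
    key: apiKey,
  });
  return `https://www.googleapis.com/youtube/v3/search?${params}`;
}

// Usage (needs network access and a real key; Node 18+ has global fetch):
// const url = buildChannelSearchUrl({ apiKey, channelId: 'channel_id', query: 'dummy query' });
// const { items } = await (await fetch(url)).json();
// items.forEach((item) => console.log(item.id.videoId, item.snippet.title));
```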

I'm happy to consider any PRs but it's almost better if you create a custom script to do what you want to do
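As one possible shape for such a custom script: run a normal search and keep only the results whose author matches the channel you care about. This only sees whatever a single search returns, so it is not a full channel search. It assumes each video result carries an `author` object with `name` and `url` fields; `filterByChannel` is a made-up helper name.

```javascript
// Keep only the videos whose author matches the given channel.
// Assumption: each video has an `author` object with `name` and `url`.
// `channel` may be a channel name or the last URL segment (e.g. a handle).
function filterByChannel(videos, channel) {
  const wanted = channel.toLowerCase();
  return videos.filter((video) => {
    const author = video.author || {};
    const name = (author.name || '').toLowerCase();
    const url = (author.url || '').toLowerCase();
    return name === wanted || url.endsWith('/' + wanted);
  });
}

// Usage (needs the yt-search package and network access):
// const yts = require('yt-search');
// const result = await yts('dummy query PewDiePie');
// const videos = filterByChannel(result.videos, 'PewDiePie');
```

Adding the channel name to the query text, as in the usage comment, is just a way to bias the search toward that channel before filtering.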