ggeop / Python-ai-assistant

Python AI assistant 🧠
MIT License
939 stars 247 forks

open_in_youtube is outdated #112

Open KingofGnome opened 3 years ago

KingofGnome commented 3 years ago

Hi, it seems like the 'open_in_youtube' skill is not working (at least for me). So 'jarvis play mozart' will always tell you 'I can't find what do you want in Youtube..'.

I guess YouTube reworked their search page, and the old implementation (from 2 years ago), which uses bs4 to find the top video via 'class': 'yt-uix-tile-link', no longer works.

I'm already working on a fix for that. However, I wanted to create this issue first, just to make sure it's not a bug/problem only on my side.

So it would be nice if someone could confirm that 'jarvis play xyz' fails with 'I can't find what do you want in Youtube..'.

If so, I'll publish my fix and create a pull request.

Also, this is my first time contributing to someone else's project and opening pull requests. So if I miss anything or could improve something, please just let me know.

89Q12 commented 3 years ago

Yeah, it's probably outdated. I wanted to rework the implementation of the YouTube search, but it would be great if you could do it 👍 I wanted to use the youtubei API; references on how to use the internal YouTube API can be found here from the invidious project or here. But you're of course free to do the implementation however you want.

KingofGnome commented 3 years ago

Yeah, I had a look at the YouTube API; we could use that. But it's another API key you need to have.

The "problem" with the current solution (using requests.get) is that the YouTube search results page is built with lots of JavaScript, meaning you won't get the full page without something like selenium. But on the bright side, YouTube initializes the JavaScript part with a custom JSON string containing all the information we need. So without using bs4 you can extract the JSON from the response text with

reg_ex = re.search(r"var ytInitialData = (.*);</script>", page)

load it as JSON

json_dict = json.loads(reg_ex.group(1))

and work your way through the dictionary, like

contentList = json_dict["contents"]["twoColumnSearchResultsRenderer"]["primaryContents"]["sectionListRenderer"]["contents"]

and so on...

It works, but obviously it will break again as soon as YouTube changes a single character in there :/
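For reference, the extraction described above could be sketched roughly as follows. This is only a minimal sketch, not the fix from the pull request: the function names are made up for illustration, and the keys below "sectionListRenderer" ("itemSectionRenderer", "videoRenderer", "videoId") are assumptions about the current page layout that may change at any time. Both helpers return None instead of raising, so the skill can fall back to its "can't find" message when the layout changes.

```python
import json
import re
from typing import Optional

# Matches the JSON object that the search page assigns to ytInitialData.
YT_INITIAL_DATA = re.compile(r"var ytInitialData = (\{.*?\});</script>", re.DOTALL)


def extract_initial_data(page: str) -> Optional[dict]:
    """Return the parsed ytInitialData dict, or None if it cannot be found."""
    match = YT_INITIAL_DATA.search(page)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


def first_video_id(data: Optional[dict]) -> Optional[str]:
    """Walk the search result structure and return the first video id found."""
    try:
        sections = data["contents"]["twoColumnSearchResultsRenderer"][
            "primaryContents"]["sectionListRenderer"]["contents"]
    except (KeyError, TypeError):
        return None
    for section in sections:
        # Assumed layout: each section wraps its results in itemSectionRenderer.
        items = section.get("itemSectionRenderer", {}).get("contents", [])
        for item in items:
            video = item.get("videoRenderer")
            if video and "videoId" in video:
                return video["videoId"]
    return None
```

With the page text from requests.get(...).text, first_video_id(extract_initial_data(page)) gives either an id to append to https://www.youtube.com/watch?v= or None.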

What are your thoughts on dependencies? I'm thinking about using glom for better handling of the JSON 'data structure'.

Also, we could just implement both ways: if you added a YouTube API key in the settings, Jarvis will use the API; otherwise it falls back to the GET request with some custom regex & JSON magic (that might break again in the future).
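The "both ways" idea could look something like the sketch below. Everything here is hypothetical: find_video_url and its parameters are illustration names, not part of Jarvis, and the two search backends are passed in as plain callables so the selection logic stays independent of how each one is implemented.

```python
from typing import Callable, Optional


def find_video_url(
    query: str,
    api_key: Optional[str],
    api_search: Callable[[str, str], Optional[str]],
    scrape_search: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Pick a search backend based on whether an API key is configured."""
    if api_key:
        # A key is present in the settings: use the official API.
        video_id = api_search(query, api_key)
    else:
        # No key: fall back to scraping, which may break when the page changes.
        video_id = scrape_search(query)
    if not video_id:
        return None
    return f"https://www.youtube.com/watch?v={video_id}"
```

An empty string or None for api_key both select the scraping path, so a missing settings entry needs no special handling.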

89Q12 commented 3 years ago

Sounds solid, but as you mentioned, it can break when YouTube changes things. When I think about it, why don't we use ytdl?

KingofGnome commented 3 years ago

As I said, it depends on whether we want to make a YouTube API key mandatory. If so, we could also just go with the original google-api-python-client, no need for any "middleware". After all, for now it's just a simple request against /search.

Also, did anyone consider music.youtube.com? Maybe for another skill like "JARVIS play music mozart" (I don't actually know if there's any difference in the videos/music)? Just stumbled upon https://github.com/sigma67/ytmusicapi

However, lots of possibilities :D I guess for now I'll just upload the simple web-scrape fix, it's done anyway, and we postpone the YouTube API till later.
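A request against /search with google-api-python-client might look roughly like this. It is a sketch under assumptions: the function names are made up, it needs the google-api-python-client package plus a valid API key, and it simply takes the first video hit. The import sits inside the function so the response parsing helper can be used without the package installed.

```python
from typing import Optional


def video_id_from_search_response(response: dict) -> Optional[str]:
    """Return the first videoId in a YouTube Data API v3 search response."""
    for item in response.get("items", []):
        video_id = item.get("id", {}).get("videoId")
        if video_id:
            return video_id
    return None


def search_with_api(query: str, api_key: str) -> Optional[str]:
    """Search YouTube via the Data API v3 and return the top video id."""
    # Imported lazily so the helper above works without the package.
    from googleapiclient.discovery import build

    youtube = build("youtube", "v3", developerKey=api_key)
    response = (
        youtube.search()
        .list(q=query, part="snippet", type="video", maxResults=1)
        .execute()
    )
    return video_id_from_search_response(response)
```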

Akul2010 commented 2 years ago

I really think the simplest way is just using pywhatkit:

import pywhatkit as kit

kit.playonyt("url/keywords")