alexmercerind / youtube-search-python

🔎 Search for YouTube videos, channels & playlists. Get 🎞 video & 📑 playlist info using link. Get search suggestions. WITHOUT YouTube Data API v3.
MIT License

Frequently asked questions for youtube-search-python #133

Open mytja opened 3 years ago

mytja commented 3 years ago

BEFORE OPENING NEW ISSUE

Please make sure you have the latest version of youtube-search-python installed from Git:

pip install git+https://github.com/alexmercerind/youtube-search-python

1. JSON is a proper format

JSON is very widely used. It is one of the simplest and most common ways to store and exchange data.

2. Don't always trust JSON validators

JSON validators aren't always 100% accurate. Things such as \n and non-ASCII characters can trip them up, so they may report errors even when the output is fine.

3. How to traverse JSON

To get a specific key or piece of information from our methods, you have to traverse a dictionary. For example, if you want to get the ID of the first search result, you can use this:

1>> from youtubesearchpython import VideosSearch
2>> videosSearch = VideosSearch('NoCopyrightSounds', limit = 1)
3>> result = videosSearch.result()
4>> result1 = result["result"]
5>> resultdict = result1[0]
6>> id = resultdict["id"]

(Ignore the numbers and the >> at the beginning of each line. They are kept here only because we wrote this tutorial in the Python console.)

Let me explain what this does. The first 3 lines just retrieve the result from YouTube. Our result currently looks like this:

{
    "result": [
        {
            "type": "video",
            "id": "K4DyBUG242c",
            "title": "Cartoon - On & On (feat. Daniel Levi) [NCS Release]",
            "publishedTime": "5 years ago",
            "duration": "3:28",
            "viewCount": {
                "text": "389,673,774 views",
                "short": "389M views"
            },
            "thumbnails": [
                {
                    "url": "https://i.ytimg.com/vi/K4DyBUG242c/hqdefault.jpg?sqp=-oaymwEjCOADEI4CSFryq4qpAxUIARUAAAAAGAElAADIQj0AgKJDeAE=&rs=AOn4CLBkTusCwcZQlmVAaRQ5rH-mvBuA1g",
                    "width": 480,
                    "height": 270
                }
            ],
            "richThumbnail": {
                "url": "https://i.ytimg.com/an_webp/K4DyBUG242c/mqdefault_6s.webp?du=3000&sqp=COCn64IG&rs=AOn4CLBeYxeJ_5lME4jXbFQlv7kIN37kmw",
                "width": 320,
                "height": 180
            },
            "descriptionSnippet": [
                {
                    "text": "NCS: Music Without Limitations NCS Spotify: http://spoti.fi/NCS Free Download / Stream: http://ncs.io/onandon \u25bd Connect with\u00a0..."
                }
            ],
            "channel": {
                "name": "NoCopyrightSounds",
                "id": "UC_aEa8K-EOJ3D6gOs7HcyNg",
                "thumbnails": [
                    {
                        "url": "https://yt3.ggpht.com/a-/AOh14GhS0G5FwV8rMhVCUWSDp36vWEvnNs5Vl97Zww=s68-c-k-c0x00ffffff-no-rj-mo",
                        "width": 68,
                        "height": 68
                    }
                ],
                "link": "https://www.youtube.com/channel/UC_aEa8K-EOJ3D6gOs7HcyNg"
            },
            "accessibility": {
                "title": "Cartoon - On & On (feat. Daniel Levi) [NCS Release] by NoCopyrightSounds 5 years ago 3 minutes, 28 seconds 389,673,774 views",
                "duration": "3 minutes, 28 seconds"
            },
            "link": "https://www.youtube.com/watch?v=K4DyBUG242c",
            "shelfTitle": null
        }
    ]
}

With the 4th line, we move deeper into this structure, specifically into the "result" key, so what we get is:

[
    {
        "type": "video",
        "id": "K4DyBUG242c",
        "title": "Cartoon - On & On (feat. Daniel Levi) [NCS Release]",
        "publishedTime": "5 years ago",
        "duration": "3:28",
        "viewCount": {
            "text": "389,673,774 views",
            "short": "389M views"
        },
        "thumbnails": [
            {
                "url": "https://i.ytimg.com/vi/K4DyBUG242c/hqdefault.jpg?sqp=-oaymwEjCOADEI4CSFryq4qpAxUIARUAAAAAGAElAADIQj0AgKJDeAE=&rs=AOn4CLBkTusCwcZQlmVAaRQ5rH-mvBuA1g",
                "width": 480,
                "height": 270
            }
        ],
        "richThumbnail": {
            "url": "https://i.ytimg.com/an_webp/K4DyBUG242c/mqdefault_6s.webp?du=3000&sqp=COCn64IG&rs=AOn4CLBeYxeJ_5lME4jXbFQlv7kIN37kmw",
            "width": 320,
            "height": 180
        },
        "descriptionSnippet": [
            {
                "text": "NCS: Music Without Limitations NCS Spotify: http://spoti.fi/NCS Free Download / Stream: http://ncs.io/onandon \u25bd Connect with\u00a0..."
            }
        ],
        "channel": {
            "name": "NoCopyrightSounds",
            "id": "UC_aEa8K-EOJ3D6gOs7HcyNg",
            "thumbnails": [
                {
                    "url": "https://yt3.ggpht.com/a-/AOh14GhS0G5FwV8rMhVCUWSDp36vWEvnNs5Vl97Zww=s68-c-k-c0x00ffffff-no-rj-mo",
                    "width": 68,
                    "height": 68
                }
            ],
            "link": "https://www.youtube.com/channel/UC_aEa8K-EOJ3D6gOs7HcyNg"
        },
        "accessibility": {
            "title": "Cartoon - On & On (feat. Daniel Levi) [NCS Release] by NoCopyrightSounds 5 years ago 3 minutes, 28 seconds 389,673,774 views",
            "duration": "3 minutes, 28 seconds"
        },
        "link": "https://www.youtube.com/watch?v=K4DyBUG242c",
        "shelfTitle": null
    }
]

See, the "result" key is gone; we moved inside it. What we get is a Python list, a structure that holds other values in a specific order. In Python, list indexes start at 0, so 0 means the first item in the list, 1 means the second item, 2 means the third item, and so on. The fifth line does just that: it retrieves the first item from this list. Now we get this:

{
    "type": "video",
    "id": "K4DyBUG242c",
    "title": "Cartoon - On & On (feat. Daniel Levi) [NCS Release]",
    "publishedTime": "5 years ago",
    "duration": "3:28",
    "viewCount": {
        "text": "389,673,774 views",
        "short": "389M views"
    },
    "thumbnails": [
        {
            "url": "https://i.ytimg.com/vi/K4DyBUG242c/hqdefault.jpg?sqp=-oaymwEjCOADEI4CSFryq4qpAxUIARUAAAAAGAElAADIQj0AgKJDeAE=&rs=AOn4CLBkTusCwcZQlmVAaRQ5rH-mvBuA1g",
            "width": 480,
            "height": 270
        }
    ],
    "richThumbnail": {
        "url": "https://i.ytimg.com/an_webp/K4DyBUG242c/mqdefault_6s.webp?du=3000&sqp=COCn64IG&rs=AOn4CLBeYxeJ_5lME4jXbFQlv7kIN37kmw",
        "width": 320,
        "height": 180
    },
    "descriptionSnippet": [
        {
            "text": "NCS: Music Without Limitations NCS Spotify: http://spoti.fi/NCS Free Download / Stream: http://ncs.io/onandon \u25bd Connect with\u00a0..."
        }
    ],
    "channel": {
        "name": "NoCopyrightSounds",
        "id": "UC_aEa8K-EOJ3D6gOs7HcyNg",
        "thumbnails": [
            {
                "url": "https://yt3.ggpht.com/a-/AOh14GhS0G5FwV8rMhVCUWSDp36vWEvnNs5Vl97Zww=s68-c-k-c0x00ffffff-no-rj-mo",
                "width": 68,
                "height": 68
            }
        ],
        "link": "https://www.youtube.com/channel/UC_aEa8K-EOJ3D6gOs7HcyNg"
    },
    "accessibility": {
        "title": "Cartoon - On & On (feat. Daniel Levi) [NCS Release] by NoCopyrightSounds 5 years ago 3 minutes, 28 seconds 389,673,774 views",
        "duration": "3 minutes, 28 seconds"
    },
    "link": "https://www.youtube.com/watch?v=K4DyBUG242c",
    "shelfTitle": null
}

Now we can retrieve the ID using the sixth line. If you print it out, you get this:

K4DyBUG242c

The fourth, fifth and sixth lines can be merged into one line, but were kept separate because that is simpler to explain. Merged, they look like this:

id = result["result"][0]["id"]
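The same traversal can be tried without any network access. This is a minimal sketch that uses a hand-written dictionary trimmed from the sample output above; `get_first_id` is a hypothetical helper (not part of youtube-search-python) that returns None instead of raising an error when the result list is empty.

```python
# A trimmed copy of the sample result shown above (most keys omitted).
result = {
    "result": [
        {
            "type": "video",
            "id": "K4DyBUG242c",
            "title": "Cartoon - On & On (feat. Daniel Levi) [NCS Release]",
            "channel": {"name": "NoCopyrightSounds"},
        }
    ]
}


def get_first_id(search_result):
    """Hypothetical helper: ID of the first item, or None if there is none."""
    items = search_result.get("result", [])
    if not items:
        return None
    return items[0].get("id")


# Step by step, as in lines 4 to 6 of the tutorial.
result1 = result["result"]      # the list stored under the "result" key
resultdict = result1[0]         # the first item of that list (a dictionary)
video_id = resultdict["id"]     # the value stored under the "id" key

# The merged form gives the same value.
assert video_id == result["result"][0]["id"]

print(video_id)
# Nested dictionaries are traversed the same way, one key at a time.
print(result["result"][0]["channel"]["name"])
print(get_first_id({"result": []}))
```

Indexing an empty list with [0] raises IndexError, which is why the helper checks the list first.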

4. OK, now I understand how dictionaries are cool, but I really want CSV instead of dictionary. Is there any way?

I recommend that you Google it or, if you are more privacy oriented, DuckDuckGo it.
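As a starting point, here is one possible sketch using Python's built-in csv module. It is not a feature of this library: `results_to_csv` and the chosen columns are assumptions made for illustration, and the second row is placeholder data. Only flat, top-level keys are written; nested values such as "viewCount" are skipped.

```python
import csv
import io

# Sample items shaped like the entries of result["result"].
# The second one is made-up placeholder data.
results = [
    {
        "id": "K4DyBUG242c",
        "title": "Cartoon - On & On (feat. Daniel Levi) [NCS Release]",
        "duration": "3:28",
        "viewCount": {"text": "389,673,774 views", "short": "389M views"},
        "link": "https://www.youtube.com/watch?v=K4DyBUG242c",
    },
    {
        "id": "xxxxxxxxxxx",
        "title": "Placeholder title",
        "duration": "0:00",
        "link": "https://www.youtube.com/watch?v=xxxxxxxxxxx",
    },
]


def results_to_csv(items, fields=("id", "title", "duration", "link")):
    """Hypothetical helper: write the chosen top-level keys as CSV text."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not listed in fields.
    writer = csv.DictWriter(buffer, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()


csv_text = results_to_csv(results)
print(csv_text)
```

To write a file instead, the same writer can be given a file opened with open("results.csv", "w", newline="", encoding="utf-8").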

5. OK, I understand this. But what is the difference between JSON and a dictionary?

Ah, good question. JSON is a text format: it holds the same kind of data as a dictionary, but stored as a string rather than as a Python object. This means you can't traverse JSON directly, but with Python's json module you can convert JSON to a dictionary and the other way round. A dictionary is basically traversable (and modifiable) JSON. If you are building a web server in Python, keep in mind that a dictionary can't be sent over the network as-is: it has to be dumped to a string first and then returned (some frameworks do this step for you).
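A short sketch of that conversion with the standard json module, using a small dictionary trimmed from the sample output above:

```python
import json

# A dictionary: a Python object that can be traversed and modified.
data = {
    "id": "K4DyBUG242c",
    "viewCount": {"short": "389M views"},
    "shelfTitle": None,
}

# Dictionary -> JSON string ("dumping"). Python's None becomes JSON's null.
text = json.dumps(data)
print(type(text).__name__)
print(text)

# JSON string -> dictionary ("loading"). Now it can be traversed again.
restored = json.loads(text)
print(restored["viewCount"]["short"])
```

Trying text["id"] on the string fails with a TypeError, which is the practical meaning of "you can't traverse JSON".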

6. Great, thanks for explaining. I want to access your scripts through my JavaScript/HTML based website. Is there any chance to do it?

Yes, there is. You can't integrate Python code directly into JavaScript code, but you can create a so-called Python backend using FastAPI, or even a full-blown frontend and backend in Flask or Django.
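To show the idea of a backend without installing anything, here is a rough sketch that uses Python's built-in http.server instead of FastAPI, Flask or Django. Everything in it is an assumption made for illustration: `fake_search` is a stand-in that returns fixed data (a real backend would call the library there), and the /search path and q parameter are arbitrary choices.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def fake_search(query):
    """Hypothetical stand-in for a real search call; returns fixed data."""
    return {"query": query, "result": [{"id": "K4DyBUG242c"}]}


def build_response(path):
    """Return (status, body_bytes) for a request path such as '/search?q=ncs'."""
    url = urlparse(path)
    if url.path != "/search":
        return 404, json.dumps({"error": "not found"}).encode("utf-8")
    query = parse_qs(url.query).get("q", [""])[0]
    # The dictionary is dumped to a JSON string before it is sent.
    return 200, json.dumps(fake_search(query)).encode("utf-8")


class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = build_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Create (but do not start) a local server for the handler above."""
    return HTTPServer(("127.0.0.1", port), SearchHandler)
```

Calling make_server().serve_forever() starts the server; JavaScript on the website can then request /search?q=... and parse the JSON it gets back. With a framework such as FastAPI or Flask, the routing and JSON encoding are handled by the framework.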

7. I always get this error: SyntaxError: 'await' outside function. How can I fix it?

You are importing the asynchronous package. This error appears when code uses await outside an async function. If you are not familiar with asynchronous Python and asyncio, make sure to import the synchronous package. Import this:

from youtubesearchpython import *

Instead of

from youtubesearchpython.__future__ import *
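If you do want the asynchronous version, every await has to sit inside an async function that is started with asyncio.run. The sketch below shows only that pattern; `fake_search` is a hypothetical coroutine standing in for a real asynchronous search call, so no network access is involved.

```python
import asyncio


async def fake_search(query):
    """Hypothetical stand-in for an asynchronous search call."""
    await asyncio.sleep(0)  # pretend to wait for the network
    return {"result": [{"id": "K4DyBUG242c", "query": query}]}


# Writing `result = await fake_search("NoCopyrightSounds")` here, at the top
# level of a script, raises: SyntaxError: 'await' outside function


async def main():
    # Inside an async function, await is allowed.
    result = await fake_search("NoCopyrightSounds")
    return result["result"][0]["id"]


# asyncio.run starts the event loop, runs main() to completion, and returns its value.
video_id = asyncio.run(main())
print(video_id)
```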

8. Why are you using PyTube instead of youtube-dl?

  1. youtube-dl is somewhat bloated. It supports many sites, of which we need only one.
  2. PyTube is much faster in many of our tests.

We are migrating to yt-dlp, as it's much faster and, more importantly, more stable.

9. You have an API key in your code. That is not safe.

This library does not use the YouTube Data API v3 and therefore does not use API keys. The so-called searchKey is a random key present in all of YouTube's requests from frontend to backend (website to server). This key is public and the same for everyone, and you can see it for yourself if you open the Network tab in your browser.

This is it. If you want even more questions answered, leave a comment here, and I will add more questions as time goes on.

10. Does this library have any limit?

Have a look at this comment

younger027 commented 2 years ago

Is there a limit on calls to the search API? Waiting for your reply.

mytja commented 2 years ago

Is there a limit on calls to the search API? Waiting for your reply.

This library itself does not impose any limit on API calls. Note that YouTube might detect your IP as a bot and will probably ban it, but the ban only lasts approximately 6 hours (at least in most of my cases); afterwards it is lifted.

younger027 commented 2 years ago

OK, I see. Thanks for your reply.

yarko11 commented 2 years ago

Does this library parse all the YT videos with a particular hashtag? I mean, can I get data for all the videos from the "https://www.youtube.com/hashtag/" page (~1800 items in my particular case)? Currently, using "Hashtag" from the library, I can get data for only around 473-478 videos (I use it in sync mode with while hashtag.next() to get all the available videos). Thank you!

mytja commented 2 years ago

Hello, sorry it took me a while to respond. YouTube's UI says there are ~1800 items, but you can never retrieve more than about 500 (at least in my cases). We retrieve as many videos as the UI itself can load, because the YouTube backend doesn't allow more. That's why we cannot fix it. Thank you so much

ghost commented 2 years ago

The retrieved video information is missing a "how many likes/dislikes" property.