jdepoix / youtube-transcript-api

This is a Python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions!
MIT License
2.54k stars, 278 forks

Youtube Video Upload Date Included #297

Closed SDA-Service closed 21 hours ago

SDA-Service commented 6 days ago

Is your feature request related to a problem? Please describe.
When several videos discuss the same topic or application, it would be useful to know each video's upload date, so that an AI application fetching the transcripts can tell which video carries the most current and accurate information.

Describe the solution you'd like
Include the upload date before the start of the transcript text, or expose it as a separate field.

Describe alternatives you've considered
The current options are to look up the date manually or to have the AI use its web search abilities. However, even models with web search, such as OpenAI's GPT-4o, never relay the correct upload date. Even when told to retry over and over, the model just switches to other random dates and never gets the accurate one.

Additional context
This could live under a new function, say "fetch_transcript_and_details". It may also be possible to work with the existing YouTube Data API, but I'm not sure.

Thank you!

SDA-Service commented 6 days ago

This is my current implementation of this feature using the YouTube Data API v3. If you still want to implement the idea in your own API and bypass the YouTube Data API, that would be great!

```python
import requests
import logging

def fetch_video_details(video_id):
    # Replace 'YOUR_API_KEY_HERE' with your actual API key in your local environment
    url = f"https://www.googleapis.com/youtube/v3/videos?part=snippet&id={video_id}&key=YOUR_API_KEY_HERE"
    response = requests.get(url)
    if response.status_code == 200:
        video_info = response.json()
        if 'items' in video_info and len(video_info['items']) > 0:
            upload_date = video_info['items'][0]['snippet']['publishedAt']
            return upload_date
    logging.error(f"Failed to fetch video details for {video_id}")
    return None
```

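
For anyone who wants the combined call proposed above without changes to the library, here is a rough sketch of what a "fetch_transcript_and_details" helper could look like in user code. It is not part of youtube-transcript-api. The names `fetch_transcript_and_details` and `parse_published_at` are hypothetical, and so is the design of passing the two fetchers in as arguments. It assumes the transcript fetcher returns a list of dicts with a `text` key, and that `publishedAt` is an ISO 8601 string such as `2024-05-01T12:34:56Z`.

```python
from datetime import datetime
from typing import Callable, Dict, List, Optional


def parse_published_at(published_at: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2024-05-01T12:34:56Z'."""
    # Older Python versions of fromisoformat() reject a trailing 'Z',
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(published_at.replace("Z", "+00:00"))


def fetch_transcript_and_details(
    video_id: str,
    fetch_transcript: Callable[[str], List[Dict]],
    fetch_upload_date: Callable[[str], Optional[str]],
) -> Dict:
    """Hypothetical helper: combine transcript text with the upload date.

    Both fetchers are supplied by the caller, so this function makes
    no network requests of its own.
    """
    entries = fetch_transcript(video_id)
    published_at = fetch_upload_date(video_id)

    # Keep only the calendar date; None if the lookup failed.
    upload_date = None
    if published_at:
        upload_date = parse_published_at(published_at).date().isoformat()

    # Assumes each transcript entry is a dict with a 'text' key.
    text = " ".join(entry["text"] for entry in entries)
    return {"video_id": video_id, "upload_date": upload_date, "transcript": text}
```

In practice, `fetch_upload_date` could be the `fetch_video_details` function above, and `fetch_transcript` a small wrapper around whichever transcript call your installed version of the library provides. Keeping the date in its own field, rather than prepending it to the text, leaves the transcript itself unchanged.
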
jdepoix commented 21 hours ago

Hi @SDA-Service, thank you for your suggestion!

Please have a look at this comment, which explains why I would prefer not to add such features to this module.

Thank you for sharing your solution using the YT Data API though!