Kakulukian / youtube-transcript

Fetch transcript from a youtube video

Add proxy support #25

Closed by ericmmartin 4 months ago

ericmmartin commented 6 months ago

The main goal of this PR is to add proxy support to this library. I verified that it works with HTTP and HTTPS proxies, and with proxy authentication, using proxies from webshare.io.

Other changes:

jonahsol commented 6 months ago

I think it's great that this library is dependency-free.

Could the proxy behaviour instead be supported by allowing dependency injection of the fetch functions? Here's the idea:

export interface TranscriptConfig {
  lang?: string;
  // Optional override for fetching the video page
  videoFetch?: (params: { lang?: string; url: string }) => Promise<Response>;
  // Optional override for fetching the transcript
  transcriptFetch?: (params: { lang?: string; url: string }) => Promise<Response>;
}

  

public static async fetchTranscript(
  videoId: string,
  config?: TranscriptConfig
): Promise<TranscriptResponse[]> {
  // Use the dependency-injected fetch functions if provided, or fall back to the default
  const videoFetch = config?.videoFetch || defaultFetch;
  const transcriptFetch = config?.transcriptFetch || defaultFetch;

  // ...rest of the function
}



// This is the fetch currently used
async function defaultFetch({ lang, url }: { lang?: string, url: string }): Promise<Response> {
  return fetch(url, {
    headers: {
      ...(lang && { 'Accept-Language': lang }),
      'User-Agent': USER_AGENT,
    },
  });
}



// Export this so that callers can use it in their custom fetch function if desired
export const USER_AGENT =
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36,gzip(gfe)';
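
To make the injection idea concrete, here is a minimal, network-free sketch of how a caller might swap in its own fetch. Everything in it is hypothetical and for illustration only: `FetchParams`, `FetchFn`, `loadVideoPage`, and `proxyFetch` are not part of the library, the response type is reduced to a `text()` method, and the watch URL is an assumption about what the library requests. A real `proxyFetch` would delegate to a proxy-aware HTTP client instead of returning canned text.

```typescript
// Hypothetical types: a trimmed-down version of the proposed config.
type FetchParams = { lang?: string; url: string };
type FetchFn = (params: FetchParams) => Promise<{ text(): Promise<string> }>;

interface TranscriptConfig {
  lang?: string;
  videoFetch?: FetchFn;
}

// Stand-in for the library default; the real default would call fetch().
const defaultFetch: FetchFn = async ({ url }) => ({
  text: async () => `default:${url}`,
});

// Stand-in for the part of fetchTranscript that loads the video page.
async function loadVideoPage(
  videoId: string,
  config?: TranscriptConfig
): Promise<string> {
  // Prefer the injected fetch; fall back to the default otherwise.
  const videoFetch = config?.videoFetch ?? defaultFetch;
  const response = await videoFetch({
    lang: config?.lang,
    url: `https://www.youtube.com/watch?v=${videoId}`, // assumed URL shape
  });
  return response.text();
}

// Caller-supplied fetch: the place where a proxy-aware client would plug in.
const seen: FetchParams[] = [];
const proxyFetch: FetchFn = async (params) => {
  seen.push(params); // record the request instead of touching the network
  return { text: async () => `proxied:${params.url}` };
};
```

Calling `loadVideoPage('abc123', { lang: 'en', videoFetch: proxyFetch })` routes the request through `proxyFetch`, while omitting `videoFetch` keeps the default path. The library itself stays dependency-free because any proxy client lives in the caller's code.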

ericmmartin commented 4 months ago

I'm closing this in favor of something similar to what @jonahsol suggested, based on the following article: https://www.zenrows.com/blog/node-fetch-proxy