philbot9 / youtube-info

Fetch meta information about YouTube videos
ISC License
43 stars 20 forks source link

accept-language #7

Closed aulianov closed 5 years ago

aulianov commented 5 years ago

YouTube comments can be in different languages at the same time. Language is determined by accept-language in request header

philbot9 commented 5 years ago

Thanks for the Pull Request. I'm not sure I understand what the purpose of changing the language and user agent is. Could you explain?

aulianov commented 5 years ago

Ok. For example see https://www.youtube.com/watch?v=Rb0UmrCXxVA

With acceptLanguage='en-US,en;q=0.5' : videoInfo.description = "♫ BUY “The Best of Mozart” (MP3 album) on the Official Halidon Music Store: ..."

With acceptLanguage='fr-CA,fr;q=1' : videoInfo.description = "♫ ACHETEZ “Le Meilleur de Mozart” (MP3 album) sur notre magasin de musique: ..."

With acceptLanguage='ru-RU,ru;q=1' : videoInfo.description = "♫ купить альбом Halidon Store: ..."

Description is displayed in various languages. Language is determined by header accept-language. If I want to get a description in French, I should be able to change the variable acceptLanguage

aulianov commented 5 years ago

image

image

philbot9 commented 5 years ago

Ah I see. Yes, that does make sense for the description.

I have a few change requests before I can merge this.

  1. We have to make sure that all the info can still be extracted when the language is not English. For example, for the commentCount we're looking for the word "Comments" followed by a number (here). If the page is in a different language that will not work. If I recall correctly some of the CSS classes on the website change with the language so it may break some of the CSS selectors used for other fields.

  2. The function signature should be: fetchVideoInfo(videoId[, opts][, callback])

In other words, all of the following should work.

fetchVideoInfo('abc123', (err, videoInfo) => {...})
fetchVideoInfo('abc123', { language: 'fr-CA' }, (err, videoInfo) => {...})

fetchVideoInfo('abc123').then(...).catch(...)
fetchVideoInfo('abc123', { language: 'fr-CA' }).then(...).catch(...)
  1. Regarding the user agent, I'm hesitant to make this user-configurable. The reason is that YouTube will serve different content based on the user agent. For example, on modern browsers it will serve a JavaScript-only Polymer app, which does not currently work with this module. Do you have a specific use case for changing the user agent?

Are you able to make those changes and write the appropriate unit tests?

Thanks

aulianov commented 5 years ago

If you want add [opts] between videoId and callback , then i'm thinking the old version will become inoperable. I add UserAgent for the future experiments and do not define it now. Its value is taken by default. I hope that whoever wants to set the value of the user agent will know what he is doing, this is an optional attribute. My patch does not break the previous versions, it was checked. Sorry, my English very bad.

philbot9 commented 5 years ago

I would like to change the signature in your PR because it doesn't adhere to the node convention where the callback is always the last parameter of a function.

fetchVideoInfo (videoId[, callback][, acceptLanguage][, userAgent])

Adding opts as an optional parameter does not break compatibility if we still support the following.

fetchVideoInfo('abc123', (err, videoInfo) => {...})
fetchVideoInfo('abc123')

That can be done by type-checking the second parameter. If it's a function or undefined, then there are no opts. If it's an object, then there are opts. I like the opts object, because it allows us to easily add more options later and because it allows us to name parameters, rather than relying on their implicit order.

Can you add a unit test that fetches a YouTube video in a different language and demonstrates that all fields are still being extracted correctly?

Regarding the User-Agent, I think we should remove it for now, since it is not needed and may cause issues.

aulianov commented 5 years ago

I agree, [opts] object is the best practice. Unfortunately I can not help with the unit test. I agree, the user-agent attribute does not affect the quality of the application and can be removed.

philbot9 commented 5 years ago

Okay, sounds good.

Would you like to implement the opts parameter?

aulianov commented 5 years ago

see commit https://github.com/philbot9/youtube-info/pull/7/commits/b2a7d5ad0005e503c261f6df5a0efe47cba21b9a with opts parameter

Its correctly work as: fetchVideoInfo('{videoId}', function (err, videoInfo) { ... fetchVideoInfo('{videoId}', {language: 'fr-CA'}, function (err, videoInfo) { ... fetchVideoInfo('{videoId}', {language: 'ru'}, function (err, videoInfo) { ... fetchVideoInfo('{videoId}', {language: 'en'}, function (err, videoInfo) { ...

philbot9 commented 5 years ago

Thank you

aulianov commented 5 years ago

The next problem image

The title in the meta tag does not match the title on the page, if the language is not English I want to make another patch to fix this.