talmobi / yt-search


Single video metadata gives some incorrect values #83

Open kevin1113-github opened 2 months ago

kevin1113-github commented 2 months ago

Description:

Hi, first of all, thank you for creating and maintaining this valuable library. It’s been a great help in many of my projects.

I encountered an issue while using the yt-search library (v2.11.1) to fetch metadata for a single video using its videoId. Some of the values returned seem to be incorrect or missing.

Below is the TypeScript code snippet and the corresponding output:

TypeScript code snippet:

import yts from "yt-search";
const id = '-ObdvMkCKws';
const result: yts.VideoMetadataResult = await yts({ videoId: id });
console.log(result);

Logs:

{
  title: 'Top 20 Most Popular Songs by NCS | Best of NCS | Most Viewed Songs',
  description: '',
  url: 'https://youtube.com/watch?v=-ObdvMkCKws',
  videoId: '-ObdvMkCKws',
  seconds: 0,
  timestamp: '',
  duration: { toString: [Function: toString], seconds: 0, timestamp: '' },
  views: NaN,
  genre: '',
  uploadDate: '2019-7-15',
  ago: '5 years ago',
  image: 'https://i.ytimg.com/vi/-ObdvMkCKws/hqdefault.jpg',
  thumbnail: 'https://i.ytimg.com/vi/-ObdvMkCKws/hqdefault.jpg',
  author: { name: 'Music Store', url: 'https://youtube.com/@MusicStoreLabel' }
}

Problem:

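The following fields appear to have incorrect or missing data: description (empty string), seconds (0), timestamp (empty string), duration (0 seconds, empty timestamp), views (NaN), and genre (empty string).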

I have verified that this issue occurs consistently across multiple video IDs, not just the one listed above. Is this a known issue, or is there a recommended way to handle or work around this problem? Any guidance or advice would be greatly appreciated.
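Until the cause is known, one defensive option is to check a result for the placeholder values seen in the log above before using it. The following is a minimal sketch and not part of yt-search: `MetadataLike` and `findSuspectFields` are hypothetical names, and the interface lists only the fields the check reads. An empty `description` can be legitimate for some videos, so that flag is best treated as a hint.

```typescript
// Hypothetical shape: only the fields this check reads from the logged result.
interface MetadataLike {
  description: string;
  seconds: number;
  timestamp: string;
  views: number;
}

// Returns the names of fields holding the "empty" values seen in the broken log.
function findSuspectFields(meta: MetadataLike): string[] {
  const suspect: string[] = [];
  if (meta.description === "") suspect.push("description");
  if (!Number.isFinite(meta.seconds) || meta.seconds <= 0) suspect.push("seconds");
  if (meta.timestamp === "") suspect.push("timestamp");
  if (!Number.isFinite(meta.views)) suspect.push("views");
  return suspect;
}

// Values copied from the broken log above.
const brokenSample: MetadataLike = { description: "", seconds: 0, timestamp: "", views: NaN };
console.log(findSuspectFields(brokenSample));
```

A caller could retry, fall back to another data source, or hide the duration in the UI whenever the returned list is non-empty.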

Thank you for your time and support.

kevin1113-github commented 2 months ago

I have some additional information regarding this issue.

When I run the code in my local environment, everything works as expected. However, when I execute the same code on an AWS EC2 instance, the problem occurs.

Below is the log output from my local environment, which shows the correct data:

Logs:

{
  title: 'Top 20 Most Popular Songs by NCS | Best of NCS | Most Viewed Songs',
  description: 'Top NCS full ☞ https://youtu.be/HPhHr6h4Qjc\n' +
    '\n' +
    '\n' +
    '☞ Follow Magic Sound :\n' +
    '  Facebook → https://goo.gl/fWA96C\n' +
    '  Subscribe → https://goo.gl/IAgUeH\n' +
    '  \n' +
    '☞ Follow NCS\n' +
    '  Spotify → http://bit.ly/SpotifyNCS\n' +
    '  Twitter → http://twitter.com/NCSounds\n' +
    '  Google+ → http://google.com/+nocopyrightsounds\n' +
    '  Facebook → http://facebook.com/NoCopyrightSounds\n' +
    '  Soundcloud → http://soundcloud.com/nocopyrightsounds',
  url: 'https://youtube.com/watch?v=-ObdvMkCKws',
  videoId: '-ObdvMkCKws',
  seconds: 4367,
  timestamp: '1:12:47',
  duration: {
    toString: [Function: toString],
    seconds: 4367,
    timestamp: '1:12:47'
  },
  views: 34554101,
  genre: 'music',
  uploadDate: '2019-7-15',
  ago: '5 years ago',
  image: 'https://i.ytimg.com/vi/-ObdvMkCKws/hqdefault.jpg',
  thumbnail: 'https://i.ytimg.com/vi/-ObdvMkCKws/hqdefault.jpg',
  author: { name: 'Music Store', url: 'https://youtube.com/@MusicStoreLabel' }
}

Given this difference in behavior between environments, I am wondering if this issue could be related to IP blocking by YouTube on AWS instances.

For further context, I recently encountered a similar issue in the same project when using the @distube/ytdl-core@4.14.4 library. In that case, the problem was resolved by using cookies due to YouTube's IP blocking. Could this be a similar issue, and if so, is there a known solution or workaround?

I appreciate your attention to this matter.

talmobi commented 2 months ago

Hello and thank you,

Indeed you are correct. It seems like YouTube is in the process of updating/deprecating some stuff, particularly to do with ytInitialPlayerResponse data. This may be why it mostly works locally and not on VPS services like AWS or GitHub Actions. It is also reserving some info for logged-in users only, which the code/bots that use this library mostly don’t support.

I found the exact same issues you mentioned and have been debugging a little bit for a few days now by SSH-ing into the running GitHub Actions instance.

I’m not sure how to best solve these yet, as it looks like most of the information is no longer provided in the initial HTTP request.

In the worst case we may have to deprecate that info altogether, make additional requests, or even resort to using a headless browser. Is the missing information vital to your application/code?

kevin1113-github commented 2 months ago

Thank you for your quick and helpful response!

Unfortunately, I'm currently developing an application similar to a music player using packages like ytdl. To achieve a higher quality application, metadata like seconds, which is no longer provided, is essential.

I've seen that the @distube/ytdl-core@4.14.4 package has resolved similar issues by extracting cookies from a logged-in user's browser session on YouTube and applying those cookies to the same IP. Although I'm not an expert in the HTTP/HTTPS protocol, I hope a similar solution using cookies can be implemented to resolve this issue.

I sincerely hope this problem can be resolved soon. Thank you for your hard work and dedication.

Additionally, if there is anything else I could try, please let me know.

talmobi commented 2 months ago

@kevin1113-github hello -- I'm curious to see if using a headless browser would be a feasible solution -- can I ask how often do you fetch metadata? 1/minute? 10/minute? 100/minute?

thank you~

talmobi commented 2 months ago

As a potential solution, I'm thinking of simply making another HTTP request using a regular YouTube search with the given videoId as the query. The top result should be the video with that videoId, and it should contain most of the missing data except genre.
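The idea can be sketched as a small merge step, assuming the search results have already been fetched. This is an illustration only and not the library's implementation: `VideoFields` and `mergeFromSearch` are hypothetical names, and the sketch matches on `videoId` instead of trusting the top result blindly.

```typescript
// Hypothetical minimal shape; the real yt-search result types carry more fields.
interface VideoFields {
  videoId: string;
  seconds: number;
  timestamp: string;
  views: number;
  description: string;
}

// Fills empty fields of `meta` from the search result with the same videoId.
// Returns `meta` unchanged when no result matches.
function mergeFromSearch(meta: VideoFields, results: VideoFields[]): VideoFields {
  const match = results.find((v) => v.videoId === meta.videoId);
  if (!match) return meta;
  return {
    ...meta,
    // Keep values that already look valid; otherwise take them from the match.
    seconds: meta.seconds > 0 ? meta.seconds : match.seconds,
    timestamp: meta.timestamp !== "" ? meta.timestamp : match.timestamp,
    views: Number.isFinite(meta.views) ? meta.views : match.views,
    description: meta.description !== "" ? meta.description : match.description,
  };
}
```

Keeping the merge separate from the network call makes it easy to test with fixed data. Genre is left out because, as noted above, search results do not include it.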

kevin1113-github commented 2 months ago

Hi @notunderctrl, first off, I apologize for the delayed response. To answer your question, it’s hard to specify the exact frequency of metadata fetches since it depends on when users select a particular song. However, for now, even a limit of 1/minute would be acceptable as long as it works reliably.

Also, the following method seems to work correctly and retrieves the metadata without issues:

import yts from "yt-search";

// Run a regular search and return at most the first 10 video results.
export async function searchMusic(query: string): Promise<yts.VideoSearchResult[]> {
  const result: yts.SearchResult = await yts(query);
  return result.videos.slice(0, 10);
}

Logs from return:

[
  {
    type: 'video',
    videoId: '-ObdvMkCKws',
    url: 'https://youtube.com/watch?v=-ObdvMkCKws',
    title: 'Top 20 Most Popular Songs by NCS | Best of NCS | Most Viewed Songs',
    description: 'Top NCS full ☞ https://youtu.be/HPhHr6h4Qjc ☞ Follow Magic Sound : Facebook → https://goo.gl/fWA96C Subscribe ...',
    image: 'https://i.ytimg.com/vi/-ObdvMkCKws/hq720.jpg',
    thumbnail: 'https://i.ytimg.com/vi/-ObdvMkCKws/hq720.jpg',
    seconds: 4367,
    timestamp: '1:12:47',
    duration: {
      toString: [Function: toString],
      seconds: 4367,
      timestamp: '1:12:47'
    },
    ago: '5 years ago',
    views: 34562493,
    author: {
      name: 'Music Store',
      url: 'https://youtube.com/@MusicStoreLabel'
    }
  },
  {
    type: 'video',
    videoId: '6czTSASgzzY',
    url: 'https://youtube.com/watch?v=6czTSASgzzY',
    title: 'Best of NCS ~ Top 20 Most Popular Songs by NCS ~ NoCopyrightSounds [ 400 VIEWS SPECIAL ] NoCopyright',
    description: 'Enjoy the Best of NCS (NoCopyrightSounds) with my special compilation featuring the top 20 most popular songs by NCS!',
    image: 'https://i.ytimg.com/vi/6czTSASgzzY/hq720.jpg',
    thumbnail: 'https://i.ytimg.com/vi/6czTSASgzzY/hq720.jpg',
    seconds: 4345,
    timestamp: '1:12:25',
    duration: {
      toString: [Function: toString],
      seconds: 4345,
      timestamp: '1:12:25'
    },
    ago: '5 months ago',
    views: 466731,
    author: { name: 'Focus Wave', url: 'https://youtube.com/@FocusWave-nb7ek' }
  },
  {
    type: 'video',
    videoId: 'JNl1_hRwpXE',
    url: 'https://youtube.com/watch?v=JNl1_hRwpXE',
    title: '30 Million Subscriber MIX | NCS - Copyright Free Music',
    description: "WE DID IT!! ❤️ Thank you SO much for all the support you guys have shown us over the past TEN YEARS! This wouldn't have ...",
    image: 'https://i.ytimg.com/vi/JNl1_hRwpXE/hqdefault.jpg',
    thumbnail: 'https://i.ytimg.com/vi/JNl1_hRwpXE/hqdefault.jpg',
    seconds: 5726,
    timestamp: '1:35:26',
    duration: {
      toString: [Function: toString],
      seconds: 5726,
      timestamp: '1:35:26'
    },
    ago: '3 years ago',
    views: 15998334,
    author: {
      name: 'NoCopyrightSounds',
      url: 'https://youtube.com/@NoCopyrightSounds'
    }
  },
  {
    type: 'video',
    videoId: 'HcuZ46iJEok',
    url: 'https://youtube.com/watch?v=HcuZ46iJEok',
    title: 'Gaming Music 2024 ♫ Inspire Music Mix for TryHard, NCS, Electronic, House ♫ Best Of EDM 2024',
    description: 'Welcome to Freeme NCS ☆ Gaming Music 2024 ♫ Inspire Music Mix for TryHard, NCS, Electronic, House ♫ Best Of EDM 2024 ...',
    image: 'https://i.ytimg.com/vi/HcuZ46iJEok/hq720.jpg',
    thumbnail: 'https://i.ytimg.com/vi/HcuZ46iJEok/hq720.jpg',
    seconds: 11309,
    timestamp: '3:08:29',
    duration: {
      toString: [Function: toString],
      seconds: 11309,
      timestamp: '3:08:29'
    },
    ago: '1 month ago',
    views: 76773,
    author: {
      name: 'Freeme NCS Music',
      url: 'https://youtube.com/@FreemeNCSMusic'
    }
  },
  {
    type: 'video',
    videoId: 'OegyYwm6rqE',
    url: 'https://youtube.com/watch?v=OegyYwm6rqE',
    title: '【BGM】Top 50 Most Popular Songs by NCS【EDM】',
    description: 'It is a medley of the top 50 views of NCS! All of them are cool songs, and they have an hour, so please use them for work ...',
    image: 'https://i.ytimg.com/vi/OegyYwm6rqE/hq720.jpg',
    thumbnail: 'https://i.ytimg.com/vi/OegyYwm6rqE/hq720.jpg',
    seconds: 11150,
    timestamp: '3:05:50',
    duration: {
      toString: [Function: toString],
      seconds: 11150,
      timestamp: '3:05:50'
    },
    ago: '2 years ago',
    views: 3907779,
    author: { name: 'るかのサブ部屋', url: 'https://youtube.com/@Ruka8' }
  },
  {
    type: 'video',
    videoId: '4U7mAl10rcM',
    url: 'https://youtube.com/watch?v=4U7mAl10rcM',
    title: 'Awesome Mix For TryHard Gaming 2024 ♫ Top 30 NCS, Gaming Music, Remixes, House ♫ Best Of EDM 2024',
    description: 'Welcome to Freeme NCS ☆ Awesome Mix For TryHard Gaming 2024 ♫ Top 30 NCS, Gaming Music, Remixes, House ♫ Best Of ...',
    image: 'https://i.ytimg.com/vi/4U7mAl10rcM/hq720.jpg',
    thumbnail: 'https://i.ytimg.com/vi/4U7mAl10rcM/hq720.jpg',
    seconds: 11480,
    timestamp: '3:11:20',
    duration: {
      toString: [Function: toString],
      seconds: 11480,
      timestamp: '3:11:20'
    },
    ago: '2 months ago',
    views: 169326,
    author: {
      name: 'Freeme NCS Music',
      url: 'https://youtube.com/@FreemeNCSMusic'
    }
  },
  {
    type: 'video',
    videoId: 'yJg-Y5byMMw',
    url: 'https://youtube.com/watch?v=yJg-Y5byMMw',
    title: 'Warriyo - Mortals (feat. Laura Brehm) | Future Trap | NCS - Copyright Free Music',
    description: '- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - When you are using this track, please add this in your description: Track: Warriyo ...',
    image: 'https://i.ytimg.com/vi/yJg-Y5byMMw/hqdefault.jpg',
    thumbnail: 'https://i.ytimg.com/vi/yJg-Y5byMMw/hqdefault.jpg',
    seconds: 230,
    timestamp: '3:50',
    duration: { toString: [Function: toString], seconds: 230, timestamp: '3:50' },
    ago: '7 years ago',
    views: 271401462,
    author: {
      name: 'NoCopyrightSounds',
      url: 'https://youtube.com/@NoCopyrightSounds'
    }
  },
  {
    type: 'video',
    videoId: '-AFGPGCaMqw',
    url: 'https://youtube.com/watch?v=-AFGPGCaMqw',
    title: 'Wonderful Mix 2024 ♫ Top 30 Songs: Chill Music, Gaming Playlist, NCS ♫ Best Of EDM 2024',
    description: 'Welcome to Freeme NCS ☆ Wonderful Mix 2024 ♫ Top 30 Songs: Chill Music, Gaming Playlist, NCS ♫ Best Of EDM 2024 ...',
    image: 'https://i.ytimg.com/vi/-AFGPGCaMqw/hq720.jpg',
    thumbnail: 'https://i.ytimg.com/vi/-AFGPGCaMqw/hq720.jpg',
    seconds: 11429,
    timestamp: '3:10:29',
    duration: {
      toString: [Function: toString],
      seconds: 11429,
      timestamp: '3:10:29'
    },
    ago: '2 months ago',
    views: 53569,
    author: {
      name: 'Freeme NCS Music',
      url: 'https://youtube.com/@FreemeNCSMusic'
    }
  },
  {
    type: 'video',
    videoId: 'B4EkwueekvI',
    url: 'https://youtube.com/watch?v=B4EkwueekvI',
    title: 'Top 50 Most Popular Songs by NCS | No Copyright Sounds',
    description: '0:00 - 1. Alan Walker - Fade 4:20 - 2. Cartoon - On & On (feat. Daniel Levi) 7:48 - 3. Alan Walker - Spectre 11:33 - 4. DEAF KEV ...',
    image: 'https://i.ytimg.com/vi/B4EkwueekvI/hq720.jpg',
    thumbnail: 'https://i.ytimg.com/vi/B4EkwueekvI/hq720.jpg',
    seconds: 11098,
    timestamp: '3:04:58',
    duration: {
      toString: [Function: toString],
      seconds: 11098,
      timestamp: '3:04:58'
    },
    ago: '2 years ago',
    views: 2538967,
    author: { name: 'Gaming Music', url: 'https://youtube.com/@GamingMusic99' }
  },
  {
    type: 'video',
    videoId: '3nQNiWdeH2Q',
    url: 'https://youtube.com/watch?v=3nQNiWdeH2Q',
    title: 'Janji - Heroes Tonight (feat. Johnning) | Progressive House | NCS - Copyright Free Music',
    description: 'NCS: Music Without Limitations NCS Spotify: http://spoti.fi/NCS Free Download / Stream: http://ncs.io/ht ▽ Connect with NCS ...',
    image: 'https://i.ytimg.com/vi/3nQNiWdeH2Q/hqdefault.jpg',
    thumbnail: 'https://i.ytimg.com/vi/3nQNiWdeH2Q/hqdefault.jpg',
    seconds: 209,
    timestamp: '3:29',
    duration: { toString: [Function: toString], seconds: 209, timestamp: '3:29' },
    ago: '9 years ago',
    views: 338281172,
    author: {
      name: 'NoCopyrightSounds',
      url: 'https://youtube.com/@NoCopyrightSounds'
    }
  }
]

Thank you!

talmobi commented 2 months ago

@kevin1113-github hello, please try the new version: npm install yt-search@2.12.0. Note that genre/category data is unavailable and video description data is limited.

kevin1113-github commented 2 months ago

Hi @notunderctrl, thank you for the update.

I tested version 2.12.0, but it still seems that the data is missing:

{
  title: 'Top 20 Most Popular Songs by NCS | Best of NCS | Most Viewed Songs',
  description: '',
  url: 'https://youtube.com/watch?v=-ObdvMkCKws',
  videoId: '-ObdvMkCKws',
  seconds: 0,
  timestamp: '',
  duration: { toString: [Function: toString], seconds: 0, timestamp: '' },
  views: NaN,
  genre: '',
  uploadDate: '2019-7-15',
  ago: '5 years ago',
  image: 'https://i.ytimg.com/vi/-ObdvMkCKws/hqdefault.jpg',
  thumbnail: 'https://i.ytimg.com/vi/-ObdvMkCKws/hqdefault.jpg',
  author: { name: 'Music Store', url: 'https://youtube.com/@MusicStoreLabel' }
}

-- Additional Information --

After further investigation, I noticed that some videos do seem to return metadata correctly, excluding the genre:

{
  title: '(Top 1 Viral) OPM Acoustic Love Songs 2024 Playlist 💗 Best Of Wish 107.5 Song Playlist 2024 #v9',
  description: '(Top 1 Viral) OPM Acoustic Love Songs 2024 Playlist Best Of Wish 107.5 Song Playlist 2024 #v9 Welcome to my channel!',
  url: 'https://youtube.com/watch?v=LTvfd_TCL_U',
  videoId: 'LTvfd_TCL_U',
  seconds: 8240,
  timestamp: '2:17:20',
  duration: {
    toString: [Function: toString],
    seconds: 8240,
    timestamp: '2:17:20'
  },
  views: 82215,
  genre: '',
  uploadDate: '2024-8-14',
  ago: 'Streamed 4 days ago',
  image: 'https://i.ytimg.com/vi/LTvfd_TCL_U/hq720.jpg',
  thumbnail: 'https://i.ytimg.com/vi/LTvfd_TCL_U/hq720.jpg',
  author: { name: 'OPM Melody ', url: 'https://youtube.com/@opmmelody74' }
}

However, it’s inconsistent. What could be causing this discrepancy between videos that successfully fetch metadata and those that don’t? Is there a pattern or specific criteria?

Thank you for your help.

talmobi commented 2 months ago

@kevin1113-github hello and thank you for testing ❤️

It may be because the backup function tries to fill in the missing data with a regular YouTube search for the videoId. If the search query starts with a "-", as your example videoId "-ObdvMkCKws" does, the regular YouTube search errors out.
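Purely as an illustration of guarding against this case (the thread does not say how the fix handles it), a hypothetical helper could avoid sending a query that begins with "-". Here `buildIdQuery` is an invented name, and wrapping the ID in double quotes is an assumption: whether YouTube's search then behaves as intended would need to be checked.

```typescript
// Hypothetical helper: builds a search query for a video ID so that the
// query string never starts with "-".
function buildIdQuery(videoId: string): string {
  // Assumption: quoting the ID is one way to avoid the leading "-".
  // This is not confirmed to be what yt-search does.
  return videoId.startsWith("-") ? `"${videoId}"` : videoId;
}

console.log(buildIdQuery("-ObdvMkCKws"));
console.log(buildIdQuery("LTvfd_TCL_U"));
```

IDs without a leading "-" pass through unchanged, so the guard only affects the failing case described above.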

I'm testing a fix right now~

talmobi commented 2 months ago

@kevin1113-github hello, please try the patched version: npm install yt-search@2.12.1

kevin1113-github commented 2 months ago

@talmobi Thank you for your attention and the update.

I’ve tested it across several cases, and everything seems to be working correctly! However, I did notice that the fetch speed seems to be a bit slower than before.

Thank you for your hard work!