ThioJoe / YT-Spammer-Purge

Allows you to easily scan for and delete scam comments using several methods.
GNU General Public License v3.0
4.58k stars, 389 forks

[Bug]: Channel URL gets not decoded to the right channel #455

Closed CamFlyerCH closed 1 year ago

CamFlyerCH commented 2 years ago

Duplicate Issues

What happened?

When I try to scan recent videos from some friends' channels, the link gets decoded to the wrong channel.

Example: https://www.youtube.com/c/NimaGmbHWellnesspoolsOstermundigen

By clicking the channel in my list of subscribed channels, I found a link containing the ID for the same channel: https://www.youtube.com/channel/UCV3I8eXxn3GW4Vwi465Nzeg

This one did not work either, though it has an umlaut in it: https://www.youtube.com/c/RitaInderm%C3%BChle

Is there a way to improve the detection of the right channel, or is there an easy way to find the channel ID (link)?
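
For reference, the `%C3%BC` in the last URL is the percent-encoded UTF-8 form of "ü". A minimal sketch of pulling the custom name out of a /c/ URL and decoding it, using only Python's standard library (the helper name `custom_name_from_url` is made up for illustration and is not part of YT-Spammer-Purge):

```python
from urllib.parse import urlparse, unquote

def custom_name_from_url(url):
    """Return the decoded custom name from a /c/ URL, or None for other URL forms."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "c":
        # unquote() decodes percent-escapes as UTF-8 by default.
        return unquote(parts[1])
    return None

print(custom_name_from_url("https://www.youtube.com/c/RitaInderm%C3%BChle"))
print(custom_name_from_url("https://www.youtube.com/channel/UCV3I8eXxn3GW4Vwi465Nzeg"))
```

The first call prints the name with the umlaut restored; the second prints `None` because the URL already carries a channel ID rather than a custom name.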

Release version

2.14.0

Steps to reproduce

  1. 2 - Scan recent Videos
  2. Enter URL https://www.youtube.com/c/NimaGmbHWellnesspoolsOstermundigen
  3. Detects the channel "Gunnar Schuster" instead of "NIMA Wellnesspools"

What platform are you seeing this problem on?

Windows (.exe file)

Relevant log output

No response

Screenshots

image

ThioJoe commented 2 years ago

Ah yeah, I'll have to see if I can improve this. It happens on occasion because getting channel info from a URL technically requires a 'search' query, and sometimes the API responds with a weird result for some channels.

dav1312 commented 2 years ago

Another example of going to the wrong channel. Pasting this URL: https://www.youtube.com/c/Berdboi ends up going to this channel: https://www.youtube.com/channel/UCB8AUVOI8XRhlbDOOX3qK5A

ThioJoe commented 2 years ago

Copying this comment I made in #624 :

A couple of things to add that I've found. For some reason, the YouTube API offers absolutely zero way to do a direct query to retrieve the channel ID if the /c/ channel URL is given. It does, however, offer a way to directly get the channel ID if it's a legacy username, so for "youtube.com/whatever", you can query for just "whatever". (Except apparently not always? See example below)

The problem with that is, sometimes the /c/ name is different from the legacy name. So if I were to add a thing to query the username "mkbhd" for example, it actually returns this channel id: UCmf_VrB73I-eJ3fq0adaOkg, which is not MKBHD's main channel: https://www.youtube.com/channel/UCmf_VrB73I-eJ3fq0adaOkg

In fact, it's not even the same channel that comes up if you type in youtube.com/mkbhd. (I think at some point, YouTube started using the /c/whatever URL if someone typed in just youtube.com/whatever. So unless they're the same, it might not give the same result) So there's not really any perfect solution.
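
To make the distinction above concrete, here is a rough sketch of which YouTube Data API v3 request each URL form would map to. The `channels` endpoint with `id` or `forUsername` and the `search` endpoint with `q` are real API parameters; the helper name `plan_lookup` and the URL handling are assumptions for illustration, not the tool's actual code, and the API key is left out.

```python
from urllib.parse import urlparse, unquote

API_BASE = "https://www.googleapis.com/youtube/v3"

def plan_lookup(url):
    """Return (endpoint, params) for resolving a channel URL (hypothetical helper)."""
    parts = [unquote(p) for p in urlparse(url).path.split("/") if p]
    if len(parts) == 2 and parts[0] == "channel":
        # The URL already contains the channel ID: direct lookup.
        return f"{API_BASE}/channels", {"part": "id", "id": parts[1]}
    if len(parts) == 2 and parts[0] == "c":
        # No direct lookup exists for /c/ names, so fall back to a search query.
        return f"{API_BASE}/search", {"part": "snippet", "q": parts[1]}
    if len(parts) == 1:
        # Bare youtube.com/whatever: treat "whatever" as a legacy username.
        return f"{API_BASE}/channels", {"part": "id", "forUsername": parts[0]}
    raise ValueError(f"Unrecognized channel URL: {url}")

print(plan_lookup("https://www.youtube.com/c/Berdboi"))
print(plan_lookup("https://www.youtube.com/mkbhd"))
```

As the mkbhd example shows, the `forUsername` branch can still return a different channel than the one the browser lands on, so this only illustrates the routing and is not a fix.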

CamFlyerCH commented 2 years ago

The best method I found to get the ID is as follows:

  1. Open the channel page
  2. Start one video
  3. Click on the channel icon below the video
  4. Copy the now open URL

That is quite fast. I am happy with that workaround.

ZekePolarisBSH commented 2 years ago

I have the same issue.

Firecul commented 2 years ago

I have the same issue.

Then the best solution is still to use the ID instead of the old custom URL

ethnh commented 2 years ago

Python 3.10.2 (main, Jan 15 2022, 19:56:27) [GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> from bs4 import BeautifulSoup
>>> resp = requests.get('https://www.youtube.com/c/NimaGmbHWellnesspoolsOstermundigen')
>>> soup = BeautifulSoup(resp.text, 'html.parser')
>>> channel_id = soup.select_one('meta[property="og:url"]')['content'].strip('/').split('/')[-1]
>>> channel_id
'UCV3I8eXxn3GW4Vwi465Nzeg'
>>> #real channel URL : https://www.youtube.com/channel/UCV3I8eXxn3GW4Vwi465Nzeg
>>> resp = requests.get('https://www.youtube.com/c/RitaInderm%C3%BChle')
>>> soup = BeautifulSoup(resp.text, 'html.parser')
>>> channel_id = soup.select_one('meta[property="og:url"]')['content'].strip('/').split('/')[-1]
>>> channel_id
'UCGktqHoMUkhnTxaK-9KVJmQ'
>>> # real channel url: https://www.youtube.com/channel/UCGktqHoMUkhnTxaK-9KVJmQ
>>> resp = requests.get('https://www.youtube.com/c/Berdboi')
>>> soup = BeautifulSoup(resp.text, 'html.parser')
>>> channel_id = soup.select_one('meta[property="og:url"]')['content'].strip('/').split('/')[-1]
>>> channel_id
'UCRei8TBpt4r0WPZ7MkiKmVg'
>>> # real channel url: https://www.youtube.com/channel/UCRei8TBpt4r0WPZ7MkiKmVg
ethnh commented 2 years ago

In the HTML of the custom-named URLs, you can find: image

All you have to do is pull it out👍
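
The transcript above does this with requests and BeautifulSoup. A dependency-free variant of the parsing step, using only Python's standard library, might look like the sketch below. The class and function names are made up for illustration, the sample HTML is a hand-written stand-in for a real page, and YouTube's page markup is not a documented interface, so this could stop working if the markup changes.

```python
from html.parser import HTMLParser

class OgUrlParser(HTMLParser):
    """Collects the content of the first <meta property="og:url"> tag."""

    def __init__(self):
        super().__init__()
        self.og_url = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:url" and self.og_url is None:
            self.og_url = attrs.get("content")

def channel_id_from_html(html):
    """Return the channel ID found in og:url, or None if it is missing."""
    parser = OgUrlParser()
    parser.feed(html)
    if not parser.og_url or "/channel/" not in parser.og_url:
        return None
    return parser.og_url.rstrip("/").split("/")[-1]

# Hand-written sample standing in for the HTML of a /c/ page.
sample = (
    '<html><head><meta property="og:url" '
    'content="https://www.youtube.com/channel/UCV3I8eXxn3GW4Vwi465Nzeg">'
    "</head><body></body></html>"
)
print(channel_id_from_html(sample))
```

Returning `None` instead of raising keeps a missing or unexpected tag from crashing the caller, which can then ask the user for the channel ID directly.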

DinhHuy2010 commented 2 years ago

I already know that.

See https://github.com/ThioJoe/YT-Spammer-Purge/discussions/655

DinhHuy2010 commented 2 years ago

Uh, close the issue, since it is inactive?

ethnh commented 2 years ago

Uh, close the issue, since it is inactive?

Issues are not closed until resolved. Is this issue resolved? I cannot test the latest version right now 👍

DinhHuy2010 commented 2 years ago

?????????????????

ThioJoe commented 1 year ago

This should be improved in the latest beta. Turns out there is a way to filter the API search results to only be channels, whereas before it would search videos and channels, which would often return bad results for small channels.

Now it should be a lot less likely to return false channel results. And if the API doesn't return it at all for some reason, which it still might do for small channels, the result should be empty and at least tell the user that.
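
A rough sketch of what that filtering could look like against the YouTube Data API v3 `search` endpoint. The `type=channel` parameter and the `id.kind` / `id.channelId` response fields are part of the API; the function name and the sample responses are made up for illustration, and this is not the tool's actual code.

```python
# Query parameters for search.list, restricted to channel results only.
search_params = {
    "part": "snippet",
    "q": "Berdboi",
    "type": "channel",  # without this, videos and playlists are searched too
    "maxResults": 1,
}

def first_channel_id(search_response):
    """Return the first channel ID in a search.list response dict, or None."""
    for item in search_response.get("items", []):
        id_info = item.get("id", {})
        if id_info.get("kind") == "youtube#channel":
            return id_info.get("channelId")
    return None

# Hand-written sample responses: one hit, and one empty result.
found = {"items": [{"id": {"kind": "youtube#channel", "channelId": "UCRei8TBpt4r0WPZ7MkiKmVg"}}]}
empty = {"items": []}

print(first_channel_id(found))
print(first_channel_id(empty))
```

An empty result yields `None`, so the caller can tell the user that no channel was found rather than silently picking an unrelated one.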

Lightning11wins commented 1 year ago

Oh, that makes sense! I'm so glad this is fixed because it was pretty annoying in the past.

ZekePolarisBSH commented 1 year ago

Oh, that makes sense! I'm so glad this is fixed because it was pretty annoying in the past.

It is still doing this for me, and the linked help didn't help me...