iv-org / invidious

Invidious is an alternative front-end to YouTube
https://invidious.io
GNU Affero General Public License v3.0

Videos tab of channel pages are empty on every instance (except invidio.us) #1333

Closed TheFrenchGhosty closed 4 years ago

TheFrenchGhosty commented 4 years ago

Example:

https://invidious.snopyta.org/channel/UCXuqSBlHAE6Xw-yeJA0Tunw

https://invidious.ggc-project.de/channel/UCXuqSBlHAE6Xw-yeJA0Tunw

Invidio.us is the only instance not affected by this issue for whatever reason.

https://invidio.us/channel/UCXuqSBlHAE6Xw-yeJA0Tunw

Edit: according to the comments, it's a new rate limit by YouTube. invidio.us isn't affected by this because it uses caching.

Edit2: This issue is only present on some channels, and in the meantime RSS still works, so I think this is a frontend issue.

GitWaifu commented 4 years ago

Linus Tech Tips is displayed on the main instance because it caches popular channels, but when I select other channels on the trending page that only became trendy today, the main instance shows them empty as well. Forked instances of Invidious don't cache channel pages and don't have access to the main instance's cache either. Google employees changed the way they block Invidious requests so that instances no longer display the old error message that used to say "Google blocked us - try other instance". Google is playing a passive-aggressive emo game, hoping that if Invidious doesn't tell users that Google is at fault, they will go back to the regular YouTube site and stop looking for new Invidious instances.

zethexx commented 4 years ago

I just rebooted my router to change my IP address, but I still got the same issue. Invidious was still running while the router was rebooted, so it might be because of that (not sure why they would block so early still). Or Google has a way to detect instances (it's a self-hosted single-user instance with dozens of subscriptions). I can still watch videos and get subscriptions. A few days ago this was only happening maybe 1/4 times I opened a channel (and another 1/4 where it would show an incomplete list and/or only uploads from >3 days ago). Now I've reloaded the channel 20 times, and only twice did I get any results (which were the incomplete / uploads from >3 days ago). All other times were blank. Coming back after minutes/hours doesn't work. Seems like they're gradually making the issue worse.

Honestly, f*** google. This is such an asshole move and I knew they were going to do something like this once invidio.us was announced to be shut down and development possibly halted. All the clearnet instances listed on the page have the same issue too.

GitWaifu commented 4 years ago

Google never hid its allegiance to American bible belt capitalism, so you should expect more asshole moves in the future, especially given the absence of any political or military opposition in and outside of America. In 2021 they will encrypt all videos with Widevine Adobe DRM copyright protection and force you to enter your social security number to watch custom tailored pre-roll boner pill ads triggered by your medical insurance data.

omar-elnaggar commented 4 years ago

I am also now seeing this on my instance.

neurodiverseEsoteric commented 4 years ago

invidio.us instance has this issue as well

zethexx commented 4 years ago

Stopping the Invidious service, changing the IP address, and then starting the service again didn't work either. So they most likely have a way to detect Invidious instances, or they blocked the method that Invidious uses. From my evidence it doesn't seem like a rate limit.

I don't know the specifics of what URL Invidious is requesting from YouTube and what Google is returning to Invidious (there is no error that I can see in the logs, and curl "https://www.youtube.com/browse_ajax?continuation" works, so maybe they are deliberately giving empty results). But if Google is detecting Invidious instances (assuming it would also affect newly installed instances on a never-before-used IP), maybe we could change something like the user agent (like searx, which uses a random user agent). Just throwing some ideas out.

omar-elnaggar commented 4 years ago

@zethexx I also no longer believe that it is a rate limit. It had been in the past, but this seems like a new issue.

Sending a GET request to this address gives you all the videos as a response in XML format. https://www.youtube.com/feeds/videos.xml?channel_id=UCXuqSBlHAE6Xw-yeJA0Tunw

Could it be possible that YouTube changed the way it responds or that the server is parsing it incorrectly? Would this be in any logs anywhere?

zethexx commented 4 years ago

Now that I think of it, it might be because Google changed some stuff and Invidious isn't parsing it correctly. But I don't think it explains why it's been on and off for the past few days until recently, it's not like that 'change' would affect the normal youtube clients (which use official APIs) and newpipe so they wouldn't need partial or A/B testing for it.

It would be handy if someone had a copy of https://www.youtube.com/feeds/videos.xml?channel_id=$ID from over a week/month ago to determine if there was a format change. Seems like it affects all invidious instances world-wide.

wadbr commented 4 years ago

So weirdly enough, the rss feeds of the invidious instances themselves seem to contain all new videos.

RSS feed: https://invidiou.site/feed/channel/UCvtRTOMP2TqYqu51xNrqAzg

Channel feed (with no videos): https://invidiou.site/channel/UCvtRTOMP2TqYqu51xNrqAzg

Maybe a temporary solution would be to draw the videos as title only from the rss feed?
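A minimal sketch of that idea in Python, assuming the feed is standard Atom with YouTube's `yt:` extension namespace (both namespace URIs below are assumptions to verify against a live response). The function name `titles_from_feed` and the sample feed are made up for illustration; nothing here is Invidious code.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed for YouTube's channel feed; check a real response.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "yt": "http://www.youtube.com/xml/schemas/2015",
}


def titles_from_feed(xml_text):
    """Return (video_id, title) pairs in the order the feed lists them."""
    root = ET.fromstring(xml_text)
    videos = []
    for entry in root.findall("atom:entry", NS):
        video_id = entry.findtext("yt:videoId", default="", namespaces=NS)
        title = entry.findtext("atom:title", default="", namespaces=NS)
        videos.append((video_id, title))
    return videos


# Hand-written stand-in for a feed response (not real data).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:yt="http://www.youtube.com/xml/schemas/2015">
  <title>Example channel</title>
  <entry>
    <yt:videoId>abc123</yt:videoId>
    <title>First video</title>
  </entry>
  <entry>
    <yt:videoId>def456</yt:videoId>
    <title>Second video</title>
  </entry>
</feed>"""

if __name__ == "__main__":
    for video_id, title in titles_from_feed(SAMPLE_FEED):
        print(video_id, title)
```

A title-only listing like this would lose thumbnails, view counts, and pagination, which is why it could only be a stopgap.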

GitWaifu commented 4 years ago

In a related bug, someone found the specific way Google is injecting empty characters alongside video URLs, which breaks all Invidious sites. RSS is immune to that, unless Google kills it like they killed Google Reader. The problem with RSS is that it only shows 10 recent URLs, so you need to use some RSS app that saves older URLs. The @FreeTube owner is already trying to move his app away from Invidious and use RSS instead.

girst commented 4 years ago

I have (mostly) solved that problem for subscriptions.gir.st, so let me summarize my findings.

Description of the problem

Invidious uses an internal API to fetch channel videos. Parameters are given as protobuf messages; let's call the format used by Invidious "v1". Recently, YouTube has switched the format for some channels ("v2"), for some users (users are, in our case, instance servers).

If youtube wants your server/IP to use v2 format (for some specific channel), but you use v1 (as invidious does), youtube will return an "Unknown error." If your instance is supposed to use v1 format (for some specific channel), but you're requesting v2, you'll always get the first page of video results with no indication the offset was ignored.

Note that if you have a "v1" IP, some channels will still require v2, just not all of them. I don't know of the other way around (i.e. whether there are v2 IPs requiring v1 for some channels).
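Because the second failure mode is silent, a client has to notice it by itself. One possible heuristic, sketched in Python with a hypothetical helper name (`offset_was_ignored`) and assuming the client already has the video IDs of page 1: if a later page comes back identical to page 1, the offset was most likely ignored.

```python
def offset_was_ignored(first_page_ids, requested_page_ids, requested_page):
    """Heuristic: a page > 1 that repeats page 1 means the offset was dropped.

    The ID lists are whatever the caller extracted from the two responses;
    how they are extracted is outside the scope of this sketch.
    """
    if requested_page <= 1:
        # Page 1 is expected to equal page 1, so nothing can be concluded.
        return False
    first = list(first_page_ids)
    return bool(first) and list(requested_page_ids) == first
```

This cannot tell a one-page channel from an ignored offset if the server repeats the page in both cases, so it is only a hint, not proof.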

You can check whether your instance (most likely) wants v2 format with the shell snippet below (requires curl(1) and jq(1)):

    curl -v 'https://www.youtube.com/browse_ajax?continuation=4qmFsgJoEhhVQ1h1cVNCbEhBRTZYdy15ZUpBMFR1bncaTEVnWjJhV1JsYjNNWUF5QUFNQUU0QWVvREprTm9iMGxuVEdwMk0wMTVlakZqVWxoRlp6UkxRVUpKUzBOT1lWZHFkR2xFT1hCMmRGZDM%253D&hl=en&gl=US' \
      -H 'x-youtube-client-name: 1' \
      -H 'x-youtube-client-version: 2.20200813.04.02' \
      --compressed |
      jq '.[1].response.continuationContents.gridContinuation.items[0].gridVideoRenderer|[.title, .publishedTimeText]'

It requests the first item on page 2 of LinusTechTips. If the date is from just a few hours to a few days ago, your instance cannot use v2. If the date says "X weeks ago" (or even months), your instance is supposed to use v2, and accessing v1 will result in an error. I have tested this on four IPv4 addresses (two residential, two datacentre) and got two "v1" and two "v2" results.
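The same reading of the probe result can be automated. The Python sketch below assumes the `publishedTimeText` value is an English relative date such as "3 hours ago" or "2 weeks ago" (the request above sets `hl=en`); the function name `guess_format` and the exact unit cut-off are illustrative choices, not something YouTube documents.

```python
import re

# Units treated as "recent", i.e. the server ignored the page-2 offset.
RECENT_UNITS = ("second", "minute", "hour", "day")


def guess_format(published_time_text):
    """Map the probe's publishedTimeText to 'v1', 'v2', or None if unclear."""
    match = re.search(
        r"(\d+)\s+(second|minute|hour|day|week|month|year)s?\s+ago",
        published_time_text,
    )
    if match is None:
        return None
    # A recent upload means page 1 was returned: this IP cannot use v2.
    return "v1" if match.group(2) in RECENT_UNITS else "v2"
```

The result is only as good as the probe channel: it relies on that channel uploading often enough that page 2 is clearly older than page 1.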

Solving the problem

What I'm doing right now is first requesting v1 and, if that fails, requesting v2. The order matters because of the aforementioned failure mode of v2, which is to not raise an error but never return anything except the first page.

If both fail (which Should Never Happen™), I fall back to displaying the RSS feed (note that it only returns the 15 newest videos; no pagination, no sorting, no searching).

My implementation is in python, so I hope most of you can follow how it works. If not, ask!
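For readers who only want the control flow, here is a rough Python sketch of the v1, then v2, then RSS order described above. It is not the linked implementation: the three fetcher callables are hypothetical and passed in by the caller, and treating any exception or empty result as "failed" is an assumption made to keep the example short.

```python
def fetch_channel_videos(channel_id, page, fetch_v1, fetch_v2, fetch_rss):
    """Try v1, then v2, then RSS; return (source_name, videos).

    fetch_v1 and fetch_v2 take (channel_id, page); fetch_rss takes only
    channel_id because the RSS feed has no pagination.
    """
    # v1 goes first: a wrong v1 request fails loudly, while a wrong v2
    # request silently returns page 1 again.
    for name, fetch in (("v1", fetch_v1), ("v2", fetch_v2)):
        try:
            videos = fetch(channel_id, page)
        except Exception:
            # Assumed failure signal; a real client would be more specific.
            continue
        if videos:
            return name, videos
    return "rss", fetch_rss(channel_id)
```

Returning the source name alongside the videos lets the caller hide pagination and sorting controls when the result came from RSS.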

zethexx commented 4 years ago

So for us who are using the invidious web interface we'll have to wait for someone with the skill and will to implement v2 into invidious, I'm guessing.

dbr commented 4 years ago

I don't have enough spare time currently to look into fixing this, but having never used the Crystal language, I was curious to have a quick poke around to see what would be involved in fixing this:

  1. I mangled the docker/Dockerfile to copy the spec and config folders over (just after copying src in the build stage)

    COPY ./spec/ ./spec/
    COPY ./config/ ./config/
    RUN crystal spec
  2. Added a very basic test case in spec/helpers_spec.cr:

    it "should find videos" do
      get_latest_videos(ucid: "UCNyGbxoEo6CQvaRVEvItxkA").should eq([123])
    end
  3. Tried building with docker build -f docker/Dockerfile . (this runs up to the crystal spec test step and falls over with an error):

       Expected: [123]
            got: []

    It's a fairly quick cycle to change the code and run the test - takes less than 20 seconds (probably quite a bit less if you install Crystal locally)

  4. The problem is in src/invidious/channels.cr. Specifically, I think the produce_channel_videos_url method is what would need to be updated as per @girst's previous comment (the request object would need to be tweaked to look consistent with the "v2" format as described in those commits)

From what I can tell, the Crystal language seems fairly intuitive and forgiving - the complex part is just the YouTube API side of things (the obfuscated JSON key names etc). So if anyone was tempted to look into this but was put off by the unfamiliar language, I'd suggest giving it a go - the language is very "guessable"!

user234683 commented 4 years ago

@girst The v2 format can always be used as long as you send a cookie value with VISITOR_INFO1_LIVE. I just use a long-expired value, VISITOR_INFO1_LIVE=ST1Ti53r4fU, that I found on the internet. Additionally, the v2 format will not allow you to specify the offset when sorting by oldest, so you can only get the first page. For sorting by oldest, I fall back to the v1 format for pages > 1, which at the moment only works for some channels. See my comment here where I reverse engineered this for more info. I also linked the Python implementation I'm currently using.
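As an illustration of the cookie trick, this Python sketch builds (but does not send) such a request with the standard library. The endpoint and the two `x-youtube-client-*` headers are copied from girst's curl command earlier in the thread; the function name `build_browse_request` is hypothetical, and the continuation token is assumed to be URL-encoded by the caller already.

```python
import urllib.request

BROWSE_URL = "https://www.youtube.com/browse_ajax?continuation="


def build_browse_request(continuation, visitor_cookie):
    """Build a browse_ajax request carrying a VISITOR_INFO1_LIVE cookie."""
    headers = {
        "Cookie": "VISITOR_INFO1_LIVE=" + visitor_cookie,
        # Header values taken from the curl example in this thread.
        "x-youtube-client-name": "1",
        "x-youtube-client-version": "2.20200813.04.02",
    }
    return urllib.request.Request(BROWSE_URL + continuation, headers=headers)


if __name__ == "__main__":
    request = build_browse_request("TOKEN", "ST1Ti53r4fU")
    print(request.full_url)
    print(request.get_header("Cookie"))
```

Passing the result to `urllib.request.urlopen` would perform the actual fetch; whether the server keeps honouring an expired cookie value is outside the client's control.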

There might be hidden parameters in the v2 format when sorting by oldest. It seems that each sorting type uses a different protobuf schema specified by that long number. So when I get time, I'm going to try iterating through all possible field numbers up to some limit in various formats (integer, base64 encoded, etc.) to see if I can find a hidden parameter to specify an offset.

girst commented 4 years ago

On Thu, Aug 20, 2020 at 10:44:21AM -0700, James Taylor wrote:

@girst The v2 format can always be used as long as you send a cookie value with VISITOR_INFO1_LIVE. I just use a long-expired value VISITOR_INFO1_LIVE=ST1Ti53r4fU I found from the internet.

thanks, I'll try that!

Additionally, the v2 format will not allow you to specify the offset when sorting by oldest, so you can only get the first page. For sorting by oldest, I fallback to the v1 format for pages > 1, which at the moment only works for some channels. See my comment https://github.com/iv-org/invidious/issues/1319#issuecomment-671732646 where I reverse engineered this for more info. I also linked the Python implementation I'm currently using.

cool, completely missed that! will have a look later!

There might be hidden parameters in the v2 format when sorting by oldest. It seems that each sorting type uses a different protobuf schema specified by that long number. So when I get time, I'm going to try iterating through all possible field numbers up to some limit in various formats (integer, base64 encoded, etc.) to see if I can find a hidden parameter to specify an offset.

not to discourage you, but that seems like a waste of time to me, considering these magic values are large 64 bit values.

user234683 commented 4 years ago

not to discourage you, but that seems like a waste of time to me, considering these magic values are large 64 bit values.

I meant trying field numbers within the protobuf structure for the 17254859483345278706 case, which is the protobuf format for sorting by oldest, (since the protobuf format seems to be different depending on that magic value), not trying to vary the schema number. I might decide to stop at field_number=128 or something like that, which might give me a couple hundred combinations to try if I try different formats, such as (0, field_number, offset), (2, field_number, embedded[(0, field_number_2, offset)]), etc.
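To make the size of that search concrete, here is a Python sketch that enumerates the two shapes mentioned above using the standard protobuf wire encoding (varint keys, wire type 0 for integers, wire type 2 for length-delimited data). The function names and the choice of trying only inner field 1 are assumptions for illustration; the sketch only produces candidate byte strings and does not show how they would be spliced into a real continuation token.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("varint sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_field(wire_type, field_number, payload):
    """Encode one field: wire type 0 (varint) or 2 (length-delimited)."""
    key = encode_varint((field_number << 3) | wire_type)
    if wire_type == 0:
        return key + encode_varint(payload)
    if wire_type == 2:
        return key + encode_varint(len(payload)) + payload
    raise ValueError("unsupported wire type in this sketch")


def candidate_messages(offset, max_field=128, inner_fields=(1,)):
    """Yield (label, bytes) for each candidate placement of the offset."""
    for field_number in range(1, max_field + 1):
        # Shape (0, field_number, offset): a plain integer field.
        yield ("varint", field_number, None), encode_field(0, field_number, offset)
        for inner in inner_fields:
            # Shape (2, field_number, embedded[(0, inner, offset)]).
            embedded = encode_field(0, inner, offset)
            yield ("embedded", field_number, inner), encode_field(2, field_number, embedded)
```

With the defaults this yields 256 candidates per offset, in line with the "couple hundred combinations" estimate; widening `inner_fields` multiplies the count accordingly.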

girst commented 4 years ago

ok, that makes a bit more sense. i still don't think you'll get far with brute-force; better to throw some requests to various channels to youtube.com and inspect network traffic.

12people commented 4 years ago

Here's a channel that I ran into not having any videos: https://invidio.us/channel/UCldfgbzNILYZA4dmDt4Cd6A . The main Invidious instance is affected too.

ousia commented 4 years ago

With the following channel,

  1. https://invidious.site/channel/UC2eYFnH61tmytImy1mTYvhA
  2. https://invidio.us/channel/UC2eYFnH61tmytImy1mTYvhA

I get an error with the first url and all videos with the second url.

Just in case it helps.

shanemd commented 4 years ago

Thank you all for working on this issue! I just discovered Invidious today, and thought for a moment that I just couldn't figure out the interface because of the empty Videos tab. Looking forward to a fix. 👍 👊 🙏

zethexx commented 4 years ago

Does anyone know why Snopyta's instance works? I've tried visiting quite a few obscure channels and they worked, and on my self-hosted instance they don't. Both instances are built from the same commit (13f58d6).

rezad1393 commented 4 years ago

Does anyone know why Snopyta's instance works? I've tried visiting quite a few obscure channels and they worked, and on my self-hosted instance they don't. Both instances are built from the same commit (13f58d6).

It didn't work until today. I use Snopyta's instance, and it didn't work until I noticed it working today.

neurodiverseEsoteric commented 4 years ago

Snopyta no longer works...

git-bruh commented 4 years ago

@esotericDisciple Snopyta still seems to be working, I tried visiting a few random channels and it works fine. Other instances are still broken though

git-bruh commented 4 years ago

Also, the timestamps for videos always show up in search, show up randomly on the popular feed page, and never show up on the channel page. Example: Popular feed (Timestamps show up for random videos) Search (Timestamps always show up) Channel page (No timestamps)

Not sure if a separate issue should be opened for this?

unixfox commented 4 years ago

@git-bruh

Snopyta seems to be working, I tried visiting a few random channels and it works fine. Other instances are still broken though

That's probably due to the fact that snopyta is using this PR which is not merged yet: https://github.com/iv-org/invidious/pull/1355

Also, the timestamps for videos always show up in search, show up randomly on the popular feed page, and never show up on the channel page. Example: Popular feed (Timestamps show up for random videos) Search (Timestamps always show up) Channel page (No timestamps)

Not sure if a separate issue should be opened for this?

Maybe you could report that in https://github.com/iv-org/invidious/pull/1355 or create a new issue because it's unrelated to this current Github issue.

afuous commented 4 years ago

Also, the timestamps for videos always show up in search, show up randomly on the popular feed page, and never show up on the channel page.

This problem was dealt with for channel pages in this commit: https://github.com/iv-org/invidious/pull/1355/commits/97a73e36688c0a6884084f5845761260e3ffc450. I assume the snopyta instance is running one of the previous commits from the pull request.

For the popular feed page, the yewtube instance seems to have the same issue, but isn't using this pull request (you can tell since channel pages don't load). So the problem of missing timestamps on the popular feed page is a separate issue.

TheFrenchGhosty commented 4 years ago

Should be fixed in https://github.com/iv-org/invidious/pull/1355