TeamNewPipe / NewPipe

A libre lightweight streaming front-end for Android.
https://newpipe.net
GNU General Public License v3.0

Feed aborts loading videos/subscribers for channels with pronouns #11353

Open lmsalustri opened 2 months ago

lmsalustri commented 2 months ago

Affected version

0.27.2

Steps to reproduce the bug

  1. Pull up to load feed
  2. Affected channel doesn't load, error occurs

Expected behavior

The channel videos should load properly.

Actual behavior

"Not loaded: 1" shows up at the top, and the channel videos don't load because it can't fetch the subscriber count.

Screenshots/Screen recordings

Screenshot_20240727_145035_NewPipe.jpg

Logs

Exception

org.schabi.newpipe.local.feed.service.FeedLoadService$RequestException: 0:https://www.youtube.com/channel/UCOuw-jFRsAgZtyDXz8e4v3A
    at org.schabi.newpipe.local.feed.service.FeedLoadManager$DatabaseConsumer.accept$lambda$1(FeedLoadManager.kt:285)
    at org.schabi.newpipe.local.feed.service.FeedLoadManager$DatabaseConsumer.$r8$lambda$nP61ZaEPy7GNixUaWP6wyXDQaWo(FeedLoadManager.kt:0)
    at org.schabi.newpipe.local.feed.service.FeedLoadManager$DatabaseConsumer$$ExternalSyntheticLambda0.run(R8$$SyntheticClass:0)
    at androidx.room.RoomDatabase.runInTransaction(RoomDatabase.kt:585)
    at org.schabi.newpipe.local.feed.service.FeedLoadManager$DatabaseConsumer.accept(FeedLoadManager.kt:271)
    at org.schabi.newpipe.local.feed.service.FeedLoadManager$DatabaseConsumer.accept(FeedLoadManager.kt:268)
    at io.reactivex.rxjava3.internal.operators.flowable.FlowableDoOnEach$DoOnEachSubscriber.onNext(FlowableDoOnEach.java:86)
    at io.reactivex.rxjava3.internal.operators.flowable.FlowableBuffer$PublisherBufferExactSubscriber.onNext(FlowableBuffer.java:124)
    at io.reactivex.rxjava3.internal.operators.flowable.FlowableObserveOn$ObserveOnSubscriber.runAsync(FlowableObserveOn.java:403)
    at io.reactivex.rxjava3.internal.operators.flowable.FlowableObserveOn$BaseObserveOnSubscriber.run(FlowableObserveOn.java:178)
    at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.run(ScheduledRunnable.java:65)
    at io.reactivex.rxjava3.internal.schedulers.ScheduledRunnable.call(ScheduledRunnable.java:56)
    at java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:307)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:644)
    at java.lang.Thread.run(Thread.java:1012)
Caused by: org.schabi.newpipe.extractor.utils.Parser$RegexException: Failed to find pattern "([\d]+([\.,][\d]+)?)" inside of "@BitByter"
    at org.schabi.newpipe.extractor.utils.Parser.matchGroup(Parser.java:75)
    at org.schabi.newpipe.extractor.utils.Parser.matchGroup(Parser.java:60)
    at org.schabi.newpipe.extractor.utils.Parser.matchGroup1(Parser.java:49)
    at org.schabi.newpipe.extractor.utils.Utils.mixedNumberWordToLong(Utils.java:93)
    at org.schabi.newpipe.extractor.services.youtube.extractors.YoutubeChannelExtractor.getSubscriberCountFromPageChannelHeader(YoutubeChannelExtractor.java:303)
    at org.schabi.newpipe.extractor.services.youtube.extractors.YoutubeChannelExtractor.getSubscriberCount(YoutubeChannelExtractor.java:248)
    at org.schabi.newpipe.extractor.channel.ChannelInfo.getInfo(ChannelInfo.java:86)
    at org.schabi.newpipe.extractor.channel.ChannelInfo.getInfo(ChannelInfo.java:53)
    at org.schabi.newpipe.util.ExtractorHelper.lambda$getChannelInfo$4(ExtractorHelper.java:126)
    at org.schabi.newpipe.util.ExtractorHelper.$r8$lambda$BOLWstv98dC8pFAG_uir5gPXYwY(ExtractorHelper.java:0)
    at org.schabi.newpipe.util.ExtractorHelper$$ExternalSyntheticLambda13.call(R8$$SyntheticClass:0)
    at io.reactivex.rxjava3.internal.operators.single.SingleFromCallable.subscribeActual(SingleFromCallable.java:43)
    at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
    at io.reactivex.rxjava3.internal.operators.single.SingleDoOnSuccess.subscribeActual(SingleDoOnSuccess.java:35)
    at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
    at io.reactivex.rxjava3.internal.operators.single.SingleOnErrorReturn.subscribeActual(SingleOnErrorReturn.java:38)
    at io.reactivex.rxjava3.core.Single.subscribe(Single.java:4855)
    at io.reactivex.rxjava3.core.Single.blockingGet(Single.java:3644)
    at org.schabi.newpipe.local.feed.service.FeedLoadManager.loadStreams(FeedLoadManager.kt:179)
    at org.schabi.newpipe.local.feed.service.FeedLoadManager.access$loadStreams(FeedLoadManager.kt:33)
    at org.schabi.newpipe.local.feed.service.FeedLoadManager$startLoading$7.apply(FeedLoadManager.kt:113)
    at org.schabi.newpipe.local.feed.service.FeedLoadManager$startLoading$7.apply(FeedLoadManager.kt:112)
    at io.reactivex.rxjava3.internal.operators.parallel.ParallelMap$ParallelMapSubscriber.onNext(ParallelMap.java:116)
    at io.reactivex.rxjava3.internal.operators.parallel.ParallelFilter$ParallelFilterSubscriber.tryOnNext(ParallelFilter.java:132)
    at io.reactivex.rxjava3.internal.operators.parallel.ParallelRunOn$RunOnConditionalSubscriber.run(ParallelRunOn.java:399)
    ... 7 more


Affected Android/Custom ROM version

Android 14 / OneUI 6.1

Affected device model

Samsung Galaxy S22 Ultra

Additional information

The affected channel has 229 subscribers at the time of writing. The subscriber count is correct when searching the channel in NewPipe. Clicking on the channel shows "Subscriber count unavailable" as in #10826, but avatar and banner load fine.

The "caused by" portion of the log is the same as #9442, but this error occurs in feed loading, not search. What I find odd is that my log doesn't show "Could not get subscriber count" like that one does.

Unsubscribing and resubscribing does not fix this. Haven't seen this error in any prior version.

afontenot commented 2 days ago

The issue is with any channel that has a pronouns field. Maybe the title could be updated to reflect this?

I've done a little digging and I think the following two underlying issues are the cause:

In YoutubeChannelExtractor.java, the metadata in the header is narrowed with the following filter:

.filter(metadataParts -> metadataParts.size() == 2)
.findFirst()

The comment explains that this is to "Find metadata parts which have two elements: channel handle and subscriber count." This approach appears to be outdated now, and only works by accident with most channels. Look at the JSON for a channel with no pronouns:

[
  {
    "metadataParts": [
      {
        "text": {
          "content": "@microzoe"
        },
        "enableTruncation": true
      }
    ]
  },
  {
    "metadataParts": [
      {
        "text": {
          "content": "2.95K subscribers"
        }
      },
      {
        "text": {
          "content": "3 videos",
          "styleRuns": [
            {
              "startIndex": 0,
              "length": 8
            }
          ]
        }
      }
    ]
  }
]

That's not what the comment describes: the subscriber count is no longer in the same metadataParts as the channel handle. It just happens to still be in the first metadataParts that has length 2. But this is not true if the channel header contains pronouns. In this case the first metadataParts looks like this:

      {
        "text": {
          "content": "@anarchozoe"
        },
        "enableTruncation": true
      },
      {
        "text": {
          "content": "she/her"
        }
      }

So here this metadataParts is selected instead, and the regex fails because it finds no number in it (which is why the channel handle is printed in the crash log).
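The selection failure can be reproduced outside the extractor. The following is a minimal sketch, not the extractor's code: it models each metadataParts row as a list of strings, assumes the first element of the selected row is what gets passed to the regex (consistent with "@BitByter" appearing in the exception), and reuses the pattern from the stack trace. The class and method names are made up, and the second row of the pronouns example is invented for illustration, since only its first row is shown above.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

final class MetadataRowSelectionSketch {
    // Pattern copied from the RegexException message in the log.
    static final Pattern NUMBER = Pattern.compile("([\\d]+([\\.,][\\d]+)?)");

    // Rows for a channel without pronouns, as in the first JSON sample.
    static final List<List<String>> NO_PRONOUNS = List.of(
            List.of("@microzoe"),
            List.of("2.95K subscribers", "3 videos"));

    // Rows for a channel with pronouns; the second row is hypothetical.
    static final List<List<String>> WITH_PRONOUNS = List.of(
            List.of("@anarchozoe", "she/her"),
            List.of("2.95K subscribers", "3 videos"));

    // Mirrors the quoted filter: take the first row with exactly two parts,
    // then (assumption) use its first part as the subscriber text.
    static Optional<String> selectSubscriberText(final List<List<String>> rows) {
        return rows.stream()
                .filter(parts -> parts.size() == 2)
                .map(parts -> parts.get(0))
                .findFirst();
    }

    // True if the text contains something the number pattern can match.
    static boolean containsNumber(final String text) {
        return NUMBER.matcher(text).find();
    }

    public static void main(final String[] args) {
        for (final List<List<String>> rows : List.of(NO_PRONOUNS, WITH_PRONOUNS)) {
            final String selected = selectSubscriberText(rows).orElse("<none>");
            System.out.println(selected + " -> number found: " + containsNumber(selected));
        }
    }
}
```

In this model the channel without pronouns yields the subscriber text, while the channel with pronouns yields the handle, which the pattern cannot match.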

In addition, given that extracting subscriber count with a Regex is flaky in the first place (as the comments note), the code should be changed to catch the RegexException and return UNKNOWN_SUBSCRIBER_COUNT. There's no good reason for updating subscriptions to fail just because NewPipe was unable to extract the subscriber count.
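A rough sketch of that fallback, under stated assumptions: parseCount is a simplified stand-in for the extractor's number parsing (it is not Utils.mixedNumberWordToLong), it throws IllegalArgumentException rather than RegexException so the example stays self-contained, and the value -1 for UNKNOWN_SUBSCRIBER_COUNT is assumed here rather than taken from the issue.

```java
import java.math.BigDecimal;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SubscriberCountFallbackSketch {
    // Assumed sentinel value; the issue only names the constant.
    static final long UNKNOWN_SUBSCRIBER_COUNT = -1;

    // A number with an optional K/M/B suffix directly after it.
    private static final Pattern COUNT =
            Pattern.compile("(\\d+(?:[.,]\\d+)?)([KMB])?");

    // Simplified stand-in parser: throws when the text contains no number.
    static long parseCount(final String text) {
        final Matcher matcher = COUNT.matcher(text);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No number found in \"" + text + "\"");
        }
        final BigDecimal number = new BigDecimal(matcher.group(1).replace(',', '.'));
        final String suffix = matcher.group(2);
        final long multiplier;
        if (suffix == null) {
            multiplier = 1L;
        } else if (suffix.equals("K")) {
            multiplier = 1_000L;
        } else if (suffix.equals("M")) {
            multiplier = 1_000_000L;
        } else {
            multiplier = 1_000_000_000L;
        }
        return number.multiply(BigDecimal.valueOf(multiplier)).longValue();
    }

    // Catches the parse failure so a missing count does not abort the caller.
    static long subscriberCountOrUnknown(final String text) {
        try {
            return parseCount(text);
        } catch (final IllegalArgumentException e) {
            return UNKNOWN_SUBSCRIBER_COUNT;
        }
    }

    public static void main(final String[] args) {
        System.out.println(subscriberCountOrUnknown("229 subscribers"));
        System.out.println(subscriberCountOrUnknown("@BitByter"));
    }
}
```

With a wrapper like this, a handle such as "@BitByter" produces the unknown sentinel instead of an exception, so the feed update for that channel could continue.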

Ideally an expert in this code would come by and fix these issues. I'm not certain what problems a naive approach like grabbing the second metadataParts might have; in all the channels I've checked, the subscriber count is currently in that row, and the first row contains the channel handle.
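The naive approach mentioned here, reading the subscriber text from the second row by position, might look like the sketch below. It is only an illustration: rows are modelled as lists of strings, the class and method names are hypothetical, and the second row of the pronouns channel is invented. Returning an empty Optional when the row is missing lets the caller fall back to an unknown count instead of throwing.

```java
import java.util.List;
import java.util.Optional;

final class SecondRowSketch {
    // Positional lookup: first part of the second row, if both exist.
    static Optional<String> subscriberTextFromSecondRow(final List<List<String>> rows) {
        if (rows.size() < 2 || rows.get(1).isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(rows.get(1).get(0));
    }

    public static void main(final String[] args) {
        // First row as in the pronouns example; second row is hypothetical.
        final List<List<String>> withPronouns = List.of(
                List.of("@anarchozoe", "she/her"),
                List.of("2.95K subscribers", "3 videos"));
        // A header that only carries the handle row.
        final List<List<String>> handleOnly = List.of(List.of("@microzoe"));

        System.out.println(subscriberTextFromSecondRow(withPronouns).orElse("<none>"));
        System.out.println(subscriberTextFromSecondRow(handleOnly).orElse("<none>"));
    }
}
```

This depends entirely on the row order staying as observed, so it would likely still need the exception fallback as a safety net.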

If no one fixes it in a month or so I can do a PR to get the ball rolling.

lmsalustri commented 23 hours ago

Updated the title. Thanks for the info!