TeamNewPipe / NewPipeExtractor

NewPipe's core library for extracting data from streaming sites
GNU General Public License v3.0

Soundcloud comments extraction fails for some videos (“Page doesn't contain an URL”) #1243

Open Profpatsch opened 1 day ago

Profpatsch commented 1 day ago

Affected version

9fb03f6c87ec0a919ffb56922c5bc6c6bebc6c46

Steps to reproduce the bug

Try to load the comments for a SoundCloud track, e.g. https://soundcloud.com/user-722618400/a-real-playa

Expected behavior

Comments should be returned

Actual behavior

Exception

java.lang.IllegalArgumentException: Page doesn't contain an URL
    at org.schabi.newpipe.extractor.services.soundcloud.extractors.SoundcloudCommentsExtractor.getPage(SoundcloudCommentsExtractor.java:44)
    at org.schabi.newpipe.extractor.comments.CommentsInfo.getMoreItems(CommentsInfo.java:79)
    at org.schabi.newpipe.paging.CommentsSource$load$info$1.invokeSuspend(CommentsSource.kt:23)
    at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
    at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:101)
    at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:113)
    at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:89)
    at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:589)
    at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:823)
    at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:720)
    at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:707)
    Suppressed: kotlinx.coroutines.internal.DiagnosticCoroutineContextException: [LazyStandaloneCoroutine{Cancelling}@16d25a5, Dispatchers.Main.immediate]


Originally posted by @Stypox in https://github.com/TeamNewPipe/NewPipe/issues/11060#issuecomment-2490595741

Screenshots/Screen recordings

No response

Logs

No response

Additional information

See https://github.com/TeamNewPipe/NewPipe/issues/11728

Profpatsch commented 1 day ago

From a very short debugging session, it looks like several pages of comments load fine, and then at some point the Page object contains a null URL (probably when there are no more comments to load).
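If that guess is right, one possible direction is to treat a missing or empty URL as "no more pages" instead of requesting the page and hitting the exception. Below is a minimal, self-contained sketch of that idea. Everything in it is hypothetical: `Page`, `PageResult`, `hasUsableUrl` and `loadAll` are stand-ins written for illustration, not the extractor's real classes, and I haven't checked where the actual fix would belong (extractor vs. caller).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class CommentPagingSketch {

    /** Hypothetical stand-in for the extractor's Page type: it only carries a URL. */
    static final class Page {
        private final String url;

        Page(String url) {
            this.url = url;
        }

        String getUrl() {
            return url;
        }
    }

    /** Hypothetical result of loading one page: its comments plus a pointer to the next page. */
    static final class PageResult {
        final List<String> comments;
        final Page nextPage; // may be null, or may carry a null/empty URL on the last page

        PageResult(List<String> comments, Page nextPage) {
            this.comments = comments;
            this.nextPage = nextPage;
        }
    }

    /** True only if there is a page with a non-empty URL that could be requested. */
    static boolean hasUsableUrl(Page page) {
        return page != null && page.getUrl() != null && !page.getUrl().isEmpty();
    }

    /** Follows next-page pointers and stops quietly once no usable URL is left. */
    static List<String> loadAll(Page first, Function<Page, PageResult> fetch) {
        List<String> all = new ArrayList<>();
        Page current = first;
        while (hasUsableUrl(current)) {
            PageResult result = fetch.apply(current);
            all.addAll(result.comments);
            current = result.nextPage;
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake backend: the last page points at a Page whose URL is null,
        // which is the situation suspected in this issue.
        Map<String, PageResult> fake = new HashMap<>();
        fake.put("p1", new PageResult(Arrays.asList("first", "second"), new Page("p2")));
        fake.put("p2", new PageResult(Arrays.asList("third"), new Page(null)));

        List<String> comments = loadAll(new Page("p1"), p -> fake.get(p.getUrl()));
        System.out.println(comments);
    }
}
```

The point of the sketch is only the loop condition: the caller checks the next page before asking for it, so a null URL ends pagination rather than raising `IllegalArgumentException`.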