r-multiverse / help

Discussions, issues, and feedback for R-multiverse
https://r-multiverse.org
MIT License

community.r-multiverse.org failing when added to CI #68

Closed shikokuchuo closed 1 week ago

shikokuchuo commented 1 week ago

The problem manifests itself for example here: https://github.com/shikokuchuo/mirai/actions/runs/9641056206/job/26586246583#step:6:143

pak::pak() claims:

source packages are missing from community.r-multiverse.org

This doesn't reproduce if I run pak::pak() locally.

Fetching the binaries for Windows and macOS seems to work, so those workflows succeed, e.g. at: https://github.com/shikokuchuo/mirai/actions/runs/9641056206/job/26586247306#step:6:1188

I add the repo by amending the GitHub workflow YAML as follows:

      - name: Add r-universe to repos
        run: |
          cat("\noptions(repos=c(RMV='https://community.r-multiverse.org',CRAN ='https://cloud.r-project.org'))\n", file = "~/.Rprofile", append = TRUE)
        shell: Rscript {0}

This is what I use all the time, and works fine if I swap in my personal R-universe instead.

@jeroen it would be great if you have an idea what this is: whether it has something to do with the redirect, or is a bug that can be fixed in pak itself. Thanks.

cc. @wlandau

jeroen commented 1 week ago

Hmm weird, could you try the dev version of pak? https://github.com/r-lib/pak/issues/266#issuecomment-2156115228

jeroen commented 1 week ago

@gaborcsardi what does the error "source packages are missing from community.r-multiverse.org" mean?

gaborcsardi commented 1 week ago

It means pak fails to get the metadata from that site: either an HTTP error, or the file does not parse as a PACKAGES file.
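The second case can be checked by hand by testing whether a downloaded payload parses as a PACKAGES file. The following is only a minimal sketch using base R's `read.dcf()`; it is not what pak does internally, and `looks_like_packages()` is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper (not part of pak): returns TRUE if `lines` parse as
# a DCF-style PACKAGES index with Package and Version fields,
# and FALSE otherwise (e.g. for an HTML page).
looks_like_packages <- function(lines) {
  con <- textConnection(lines)
  on.exit(close(con))
  # read.dcf() raises an error on lines that are not "Field: value" pairs
  parsed <- tryCatch(read.dcf(con), error = function(e) NULL)
  !is.null(parsed) &&
    nrow(parsed) > 0 &&
    all(c("Package", "Version") %in% colnames(parsed))
}
```

It could be applied to the character vector returned by `readLines()` on the repository's `src/contrib/PACKAGES` URL, fetched from the same environment in which the failure occurs.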

jeroen commented 1 week ago

This is very strange. If I change the domain to https://r-multiverse.r-universe.dev then the problem disappears, even though both point to the same backend. Perhaps something in your r-multiverse.org Cloudflare settings triggers a bug?

I'm trying to find a way to figure out what pak is actually downloading and parsing.

shikokuchuo commented 1 week ago

Strange indeed. I don't have anything additional set up for Cloudflare. I've invited you to the Cloudflare account. I'm happy for you to transfer the settings to the account you use for R-universe, and then I can disable this one.

jeroen commented 1 week ago

Thanks. I can see your Cloudflare settings now, but somehow I can't edit them yet (I think it takes a while). Could you try the following for me: under Speed > Optimization, disable HTTP/3 and enable HTTP/2 to origin.

shikokuchuo commented 1 week ago

I've changed those settings. Doesn't seem to make a difference...

shikokuchuo commented 1 week ago

Very strange: I'm seeing identical payloads for https://community.r-multiverse.org/src/contrib/PACKAGES and https://r-multiverse.r-universe.dev/src/contrib/PACKAGES. I tested using nanonext::ncurl() locally, and using base url() and readLines() on the demo webR platform.
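A comparison of this kind can be sketched in base R roughly as follows. `same_index()` and its `fetch` argument are hypothetical names introduced for illustration, and the default fetcher assumes `url()` plus `readLines()` is an acceptable stand-in for the client under test. A match only describes what this particular client and network receive, so it does not rule out a different response being served to another client such as pak on a CI runner.

```r
# Hypothetical helper: fetch the same index file from two repositories
# and report whether the bodies are identical.
same_index <- function(repo_a, repo_b, path = "src/contrib/PACKAGES",
                       fetch = function(u) {
                         con <- url(u)
                         on.exit(close(con))
                         readLines(con, warn = FALSE)
                       }) {
  body_a <- fetch(paste(repo_a, path, sep = "/"))
  body_b <- fetch(paste(repo_b, path, sep = "/"))
  identical(body_a, body_b)
}

# Example call (requires network access):
# same_index("https://community.r-multiverse.org",
#            "https://r-multiverse.r-universe.dev")
```

Passing the fetcher as an argument keeps the comparison logic separate from the HTTP client, so the same check can be repeated with a different client if one is suspected of being treated differently.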

jeroen commented 1 week ago

OK I was able to reproduce the problem.

Somehow Cloudflare identifies pak as a potential DDoS bot or spammer, and serves an interactive captcha challenge instead of the content. I have no idea why this doesn't happen for other domains or clients.

@shikokuchuo in your Cloudflare settings, under "Security" > "Settings", can you try setting Security Level to the lowest possible value?

shikokuchuo commented 1 week ago

Fantastic! I can confirm that this is the issue, and it now works, e.g. https://github.com/shikokuchuo/mirai/actions/runs/9646460548/job/26606426809

If you look under Security > Events, you can see "bot fight mode" triggering. I think it's because it detects the request coming from GHA (a cloud platform) and treats it as a likely bot attempt. I've disabled bot fight mode without changing the security level. I turned on the AI scrape shield at the same time!

Thanks for investigating this.