conda / infrastructure

A repo to report issues and have discussions about the conda infrastructure
BSD 3-Clause "New" or "Revised" License

Seeing "unexpected end of file" issues for non-CDN channel #845

Closed · wolfv closed this 11 months ago

wolfv commented 11 months ago

What happened?

For a while now, a pixi user has been seeing "unexpected end of file" errors when downloading packages from robostack-staging, which is a "non-CDN" channel.

2023-11-05T20:13:05.6283585Z   × error sending request for url (https://conda.anaconda.org/robostack-
2023-11-05T20:13:05.6285841Z   │ staging/linux-64/ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2):
2023-11-05T20:13:05.6286861Z   │ connection error: unexpected end of file
2023-11-05T20:13:05.6287487Z   ├─▶ connection error: unexpected end of file
2023-11-05T20:13:05.6288023Z   ╰─▶ unexpected end of file
2023-11-05T20:13:05.5862693Z DEBUG validating{path=/home/runner/.cache/rattler/cache/pkgs/ros-humble-sros2-0.10.4-py310h7c61026_3}: rattler::package_cache: downloading https://conda.anaconda.org/robostack-staging/linux-64/ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2 to /home/runner/.cache/rattler/cache/pkgs/ros-humble-sros2-0.10.4-py310h7c61026_3
2023-11-05T20:13:05.5872394Z DEBUG validating{path=/home/runner/.cache/rattler/cache/pkgs/ros-humble-sros2-0.10.4-py310h7c61026_3}: rattler_networking::authentication_storage::storage: Unable to retrieve credentials for conda.anaconda.org: Platform secure storage failure: no secret service provider or dbus session found, using fallback credential storage at /home/runner/.rattler/rattler_auth_store.json
2023-11-05T20:13:05.5880002Z  WARN validating{path=/home/runner/.cache/rattler/cache/pkgs/ros-humble-sros2-0.10.4-py310h7c61026_3}: rattler_networking::authentication_storage::fallback_storage: Can't find path for fallback storage on /home/runner/.rattler/rattler_auth_store.json
2023-11-05T20:13:05.5888109Z DEBUG validating{path=/home/runner/.cache/rattler/cache/pkgs/ros-humble-sros2-0.10.4-py310h7c61026_3}: rattler_networking::authentication_storage::storage: Unable to retrieve credentials for *.conda.anaconda.org: Platform secure storage failure: no secret service provider or dbus session found, using fallback credential storage at /home/runner/.rattler/rattler_auth_store.json
2023-11-05T20:13:05.5896238Z  WARN validating{path=/home/runner/.cache/rattler/cache/pkgs/ros-humble-sros2-0.10.4-py310h7c61026_3}: rattler_networking::authentication_storage::fallback_storage: Can't find path for fallback storage on /home/runner/.rattler/rattler_auth_store.json
2023-11-05T20:13:05.5901082Z DEBUG validating{path=/home/runner/.cache/rattler/cache/pkgs/ros-humble-sros2-0.10.4-py310h7c61026_3}: hyper::client::pool: reuse idle connection for ("https", conda.anaconda.org)

I wonder if this could be related to the anaconda.org infrastructure changes? Or rate limiting (this is a custom GitHub runner, so it has a different IP than GitHub's)?

Any hints appreciated!
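One detail in the log above stands out: the last DEBUG line (`hyper::client::pool: reuse idle connection`) shows the request going over a reused keep-alive connection. A pooled connection that the server or CDN has already closed can surface as exactly this kind of "unexpected end of file" on the next request, which would make the error transient and worth retrying. A minimal std-only Rust sketch of such a classifier (hypothetical helper, not rattler's actual logic):

```rust
/// Heuristic: decide whether a download error message looks like a
/// transient connection problem that is worth retrying.
/// (Hypothetical helper for illustration, not rattler's real code.)
fn is_retryable(message: &str) -> bool {
    const TRANSIENT_MARKERS: [&str; 3] = [
        "unexpected end of file",
        "connection error",
        "connection reset",
    ];
    TRANSIENT_MARKERS.iter().any(|m| message.contains(m))
}

fn main() {
    // The error from the report above would be classified as retryable,
    // while a plain 404 would not be.
    assert!(is_retryable("connection error: unexpected end of file"));
    assert!(!is_retryable("404 Not Found"));
    println!("classification ok");
}
```

A real client would inspect the typed error (e.g. whether it is an I/O error on an established connection) rather than matching strings, but the idea is the same: only connection-level failures should trigger a retry, not definitive HTTP errors.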

jezdez commented 11 months ago

Just noting that I've seen this and pinged Anaconda infra folks about it.

wolfv commented 11 months ago

Thanks @jezdez – I think on our end we might try to up the retries. Another thing to note is that the user mentioned this started to happen only in the last ~3 weeks or so, that's why I thought that it might be related to the infra changes on Anaconda's side.

jezdez commented 11 months ago

> Thanks @jezdez – I think on our end we might try to up the retries. Another thing to note is that the user mentioned this started to happen only in the last ~3 weeks or so, that's why I thought that it might be related to the infra changes on Anaconda's side.

Happy to, Wolf! I don't have knowledge about any larger changes, but anecdotally we've seen the conda tests fail with retry issues as well recently, and I wonder if that is related.

rasquith commented 11 months ago

Hi Wolf, thank you very much for reporting this. We're adding a ticket to track this issue. We have not yet rolled out the upgrade -- it's been going through some final testing. Once we implement our upcoming infrastructure upgrade, and if this issue is still present, we'll look into prioritizing it.

barabo commented 11 months ago

The package downloads correctly for me, so maybe this was an intermittent issue. We were having some upstream issues with Cloudflare around the time of the problems, so we can't rule that out.

(base) canderson@carls-mbp-2 missing_sha256 % wget https://conda.anaconda.org/robostack-staging/linux-64/ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2
--2023-11-07 11:23:12--  https://conda.anaconda.org/robostack-staging/linux-64/ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2
Resolving conda.anaconda.org (conda.anaconda.org)... 104.17.15.67, 104.17.16.67
Connecting to conda.anaconda.org (conda.anaconda.org)|104.17.15.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-tar]
Saving to: ‘ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2’

ros-humble-sros2-0.10.4-py310h7c61026_3.tar.     [ <=>                                                                                           ]  58.69K  --.-KB/s    in 0.06s

2023-11-07 11:23:13 (991 KB/s) - ‘ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2’ saved [60096]

(base) canderson@carls-mbp-2 missing_sha256 % file ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2
ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2: bzip2 compressed data, block size = 900k
(base) canderson@carls-mbp-2 missing_sha256 % tar -xjf ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2
(base) canderson@carls-mbp-2 missing_sha256 % ll
total 29720
drwxr-xr-x  9 canderson  staff       288 Nov  7 11:24 info
drwxr-xr-x  3 canderson  staff        96 Nov  7 11:24 lib
-rw-r--r--  1 canderson  staff     60096 Feb  6  2023 ros-humble-sros2-0.10.4-py310h7c61026_3.tar.bz2
drwxr-xr-x  4 canderson  staff       128 Nov  7 11:24 share

jezdez commented 11 months ago

@barabo I'm interpreting this as a "worksforme" and will close it. @wolfv please let us know if this is still happening.

wolfv commented 11 months ago

Yeah, I think it was always mostly "working", and it's a random package that fails intermittently. But if you didn't make the infrastructure changes yet, I don't know why the behavior changed. Again, we should probably increase our retry limits and backoff time.
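The "increase retries and backoff" idea could be sketched roughly like this (a minimal std-only Rust sketch with a hypothetical `retry_with_backoff` helper; rattler's actual retry machinery is more involved):

```rust
use std::thread::sleep;
use std::time::Duration;

/// Retry `op` up to `max_attempts` times, doubling the delay between
/// attempts (exponential backoff). Returns the first `Ok`, or the last
/// error once attempts are exhausted.
/// (Hypothetical helper for illustration only.)
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay *= 2; // exponential backoff between attempts
                attempt += 1;
            }
        }
    }
}

fn main() {
    // Simulate a flaky download that fails twice with an EOF-like error
    // before succeeding on the third attempt.
    let mut calls = 0;
    let result = retry_with_backoff(5, Duration::from_millis(10), || {
        calls += 1;
        if calls < 3 {
            Err("unexpected end of file")
        } else {
            Ok("package bytes")
        }
    });
    assert_eq!(result, Ok("package bytes"));
    assert_eq!(calls, 3);
    println!("succeeded after {calls} attempts");
}
```

In practice one would also cap the total delay and add jitter so that many runners hitting the same channel don't retry in lockstep.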