Smurk opened this issue 4 years ago
We detected this behaviour 2 days ago, when our CI builds started failing (somewhere between 9am and 2pm CET).
Has ansible-galaxy collection install been failing consistently since then?
Yes, we are unable to run our CI builds or install collections locally since then.
I confirm that I've been hitting "timed out" errors more often in the past few days (ISP: poda.cz). 3-5 retries usually get the job done, but I only have 1-2 collections to download in my tests. I must say that there's no proper error-processing logic in the ansible-galaxy
CLI (ansible/ansible repo), and no internal retries either, but that's a separate issue that may need to be filed against the core repo.
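Since the CLI has no built-in retries, a minimal wrapper can paper over transient timeouts in CI for now. This is only a sketch; the `retry` helper, attempt count, and back-off delay are my own, not part of ansible-galaxy:

```shell
# Hypothetical retry helper (POSIX sh), since ansible-galaxy performs no
# internal retries on read timeouts. Names and defaults are illustrative.
retry() {
  max="$1"; shift            # first argument: number of attempts
  i=1
  while [ "$i" -le "$max" ]; do
    if "$@"; then
      return 0               # the wrapped command succeeded
    fi
    echo "attempt $i/$max failed" >&2
    if [ "$i" -lt "$max" ]; then
      sleep 5                # back off before the next attempt
    fi
    i=$((i + 1))
  done
  return 1                   # every attempt failed
}

# Usage, e.g. in the failing CI step:
# retry 5 ansible-galaxy collection install community.general
```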
Just to make it clear: for us, even multiple retries do not work. The URL contained in the error message changes each run (see below). When I try to access the URL from the error message, sometimes it loads quickly, sometimes it takes 15 seconds, sometimes 30 seconds. We see the same behaviour when running curl -L.
➜ ansible git:(master$) ansible-galaxy collection install community.general
Process install dependency map
ERROR! Unknown error when attempting to call Galaxy at 'https://galaxy.ansible.com/api/v2/collections/community/general/versions/': The read operation timed out
➜ ansible git:(master$) ansible-galaxy collection install community.general
Process install dependency map
ERROR! Unknown error when attempting to call Galaxy at 'https://galaxy.ansible.com/api/v2/collections/ansible/netcommon/versions/?page=7': The read operation timed out
➜ ansible git:(master$) ansible-galaxy collection install community.general
Process install dependency map
ERROR! Unknown error when attempting to call Galaxy at 'https://galaxy.ansible.com/api/v2/collections/community/general/versions/1.2.0/': The read operation timed out
➜ ansible git:(master$) ansible-galaxy collection install community.general
Process install dependency map
ERROR! Unknown error when attempting to call Galaxy at 'https://galaxy.ansible.com/api/v2/collections/ansible/posix/versions/': The read operation timed out
➜ ansible git:(master$) ansible-galaxy collection install community.general
Process install dependency map
ERROR! Unknown error when attempting to call Galaxy at 'https://galaxy.ansible.com/api/v2/collections/ansible/netcommon/versions/?page=2': The read operation timed out
I've opened https://galaxy.ansible.com/api/v2/collections/community/general/versions/ in my browser and it took a few seconds to get a response. It seems like Galaxy is just slow when processing API calls.
Here's the timing for such a request as shown in DevTools:
The concerning part is that the Time To First Byte is 1.27s. This basically means there's either (1) some latency in the response delivery or (2) some slow DB lookup on the back-end.
I've made a few refreshes, and these are TTFB values I've got so far: 866.22ms, 1.09s, 1.02s, 1.07s, 1.19s, 1.12s, 1.09s.
This feels quite slow for just one query. With many requests produced in the process of dependency resolution and installation, I can imagine that some of them would be slow and would cause this many timeouts.
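For what it's worth, the DevTools TTFB numbers can be reproduced from the command line with curl's --write-out timing variables (time_starttransfer is effectively TTFB). This is just a measurement sketch against one of the URLs from this thread:

```shell
# Measure TTFB and total time for a Galaxy API call without DevTools.
# %{time_starttransfer} is the time until the first response byte (TTFB);
# -m 30 caps the request so a hung connection doesn't block forever.
curl -sS -L -m 30 -o /dev/null \
  -w 'ttfb=%{time_starttransfer}s total=%{time_total}s\n' \
  'https://galaxy.ansible.com/api/v2/collections/community/general/versions/'
```

Running it in a loop should show the same spread as repeated DevTools refreshes.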
For comparison, here is the output of my curl commands run with time:
- time curl -q -L https://galaxy.ansible.com/api/v2/collections/google/cloud/versions/\?page\=2 >/dev/null 2>/dev/null
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 0% cpu 18.080 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 0% cpu 32.997 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 0% cpu 3.384 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 2% cpu 0.839 total
curl -q -L > /dev/null 2> /dev/null 0.01s user 0.01s system 1% cpu 1.191 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 1% cpu 1.214 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 1% cpu 1.335 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 1% cpu 1.706 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 0% cpu 2.364 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 1% cpu 1.942 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 2% cpu 1.081 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 0% cpu 31.734 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 1% cpu 2.037 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 1% cpu 1.345 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 0% cpu 18.154 total
curl -q -L > /dev/null 2> /dev/null 0.02s user 0.01s system 0% cpu 2.772 total
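The totals above are strikingly bimodal, which a quick sort/awk summary makes visible. The sixteen values are copied from the run above; nothing here is ansible-specific:

```shell
# Summarize the 'total' seconds reported by the curl runs above.
# Most requests finish in ~1-3s, but a few take 18-33s.
printf '%s\n' 18.080 32.997 3.384 0.839 1.191 1.214 1.335 1.706 \
              2.364 1.942 1.081 31.734 2.037 1.345 18.154 2.772 |
  sort -n |
  awk '{ v[NR] = $1; sum += $1 }
       END {
         printf "n=%d min=%.3f median=%.3f mean=%.3f max=%.3f\n",
                NR, v[1], v[int((NR + 1) / 2)], sum / NR, v[NR]
       }'
# → n=16 min=0.839 median=1.942 mean=7.636 max=32.997
```

A mean nearly four times the median is exactly the long-tail pattern that would make some dependency-resolution requests hit the client timeout while most succeed.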
Dunno what happened, but this issue seems to have disappeared.
Nope, still there :(
ansible-galaxy collection download community.network -p /tmp/shishi
Process install dependency map
ERROR! Unknown error when attempting to call Galaxy at 'https://galaxy.ansible.com/api/v2/collections/fortinet/fortios/versions/': The read operation timed out
@cutwater suggested the other day that it may be a problem with Cloudflare... We'll have to wait for somebody with the respective access to check that, I guess.
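On the Cloudflare theory: even without back-end access, the response headers show whether Cloudflare sits in the path. This is a sketch; cf-ray and cf-cache-status are standard Cloudflare response headers, but whether they appear for this endpoint is exactly what's being checked:

```shell
# Dump response headers (-D -) and filter for Cloudflare markers.
# If lines like "server: cloudflare" or "cf-ray: ..." appear,
# Cloudflare is proxying the Galaxy API for this client.
curl -sS -m 30 -o /dev/null -D - 'https://galaxy.ansible.com/api/' |
  grep -i -E '^(server|cf-ray|cf-cache-status):'
```

Comparing the cf-ray colo suffix from an affected network (e.g. a Czech vantage point) against one from the US VPN that works could narrow this down to a single Cloudflare PoP.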
Bug Report
SUMMARY
When trying to install a collection via the ansible-galaxy command (it does not matter whether the collection name is in a requirements file or passed as an argument), the command times out. I've tried it from 3 different locations (my 4G mobile internet, my optic-fibre internet at home, and the 1 Gbit/s internet at my work). I've also asked 5 more people from the Czech Republic (all within ~30 km of the city of Brno) and everyone had the same issue. When one of them tried running the command from a US VPN, it worked as expected. We detected this behaviour 2 days ago, when our CI builds started failing (somewhere between 9am and 2pm CET).
STEPS TO REPRODUCE
Run the ansible-galaxy collection install community.general command in the Czech Republic.
EXPECTED RESULTS
The collection is installed successfully.
ACTUAL RESULTS
The command ends with an error:
ERROR! Unknown error when attempting to call Galaxy at 'https://galaxy.ansible.com/api/': The read operation timed out
or similar (the URL changes from run to run).