dgzlopes / cloud-detect

Module that determines a host's cloud provider.
https://pypi.org/project/cloud-detect/
MIT License

reporting unknown cloud takes very long #25

Open derekjc opened 1 year ago

derekjc commented 1 year ago

Running cloud-detect on my laptop returns unknown only after a very long time. The same happens in my OpenStack environment. The metadata URL used by the Alibaba provider appears to be a publicly routable address, so the request takes a while before erroring out. Adding a timeout to ClientSession helps fix this.

time python3 -c 'from cloud_detect import provider; provider()'
python3 -c 'from cloud_detect import provider; provider()'  0.37s user 0.06s system 0% cpu 2:11.18 total
time curl -sq http://100.100.100.200/latest/meta-data/latest/meta-data/instance/virtualization-solution
curl -sq   0.01s user 0.02s system 0% cpu 2:10.17 total
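
To illustrate the effect of a timeout on a hanging metadata request, here is a minimal sketch using only the standard library. It does not reproduce cloud-detect's code: `slow_metadata_probe` and `detect` are hypothetical names, and `asyncio.wait_for` stands in for whatever timeout the HTTP client would apply.

```python
import asyncio


async def slow_metadata_probe(delay: float) -> str:
    """Hypothetical stand-in for a metadata request that hangs for `delay` seconds."""
    await asyncio.sleep(delay)
    return "alibaba"


async def detect(timeout: float, delay: float) -> str:
    """Return the probe result, or "unknown" if the probe exceeds `timeout`."""
    try:
        return await asyncio.wait_for(slow_metadata_probe(delay), timeout=timeout)
    except asyncio.TimeoutError:
        # The probe was cancelled; report unknown instead of waiting it out.
        return "unknown"


# A probe that would hang for 5 s is cut off after 0.1 s.
result_slow = asyncio.run(detect(timeout=0.1, delay=5.0))
# A probe that answers quickly is unaffected by the timeout.
result_fast = asyncio.run(detect(timeout=1.0, delay=0.01))
print(result_slow, result_fast)
```

With a bound like this, the worst case is the timeout value rather than the roughly two minutes the operating system takes to give up on the connection.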

kshivakumar commented 1 year ago

@derekjc You can pass timeout parameter. https://github.com/dgzlopes/cloud-detect/blob/23fa390d74d7ee435801105f29f625a2ac4907bc/cloud_detect/__init__.py#L71

derekjc commented 1 year ago

@kshivakumar I could pass that, but IMHO a default timeout would be nice, especially considering that Alibaba uses a publicly routable IP.

kshivakumar commented 1 year ago

@derekjc It's difficult to choose an optimal default value for the timeout. Choose a low value and you risk not completing a request (it's rare, but possible); choose a high value and, in practical terms, it may be no different from having no default at all. That's why the decision is left to the client code. One thing that can be improved is to mention the timeout in the README so that users are aware of the option.

derekjc commented 1 year ago

@kshivakumar From my understanding, metadata is routed via the hypervisor host and is not a remote call, so it shouldn't need much time. Without a default timeout, cloud_detect takes more than 2 minutes to return unknown on unsupported clouds. A better value could be similar to what the battle-tested cloud-init uses (I think it is 50 seconds). Additionally, even if it times out too early, won't the vendor file return the right response?

If I've not convinced you, please close this issue.

kshivakumar commented 1 year ago

@derekjc For all the cloud providers that support the "vendor file check", the file is checked first - https://github.com/dgzlopes/cloud-detect/blob/23fa390d74d7ee435801105f29f625a2ac4907bc/cloud_detect/providers/alibaba_provider.py#L25 The metadata_url is called only when the file check fails. So, metadata_url is the last test to confirm the vendor.
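
The ordering described above (vendor file first, metadata call only as a fallback) can be sketched roughly as follows. This is an illustration under assumptions, not the provider's actual code: the function names, the temporary file, and the "Alibaba Cloud" marker string are all hypothetical, and the network request is replaced by a fake that only records whether it ran.

```python
import asyncio
import tempfile
from pathlib import Path
from typing import Awaitable, Callable


def vendor_file_matches(path: Path, needle: str) -> bool:
    """Cheap local check: does the vendor file mention the provider?"""
    try:
        return needle in path.read_text()
    except OSError:
        return False  # file missing or unreadable, e.g. on a laptop


async def identify(path: Path, needle: str,
                   metadata_check: Callable[[], Awaitable[bool]]) -> bool:
    """Check the vendor file first; call the metadata check only if that fails."""
    if vendor_file_matches(path, needle):
        return True
    return await metadata_check()


calls = []


async def fake_metadata_check() -> bool:
    """Hypothetical stand-in for the network request; records that it ran."""
    calls.append("metadata")
    return False


with tempfile.TemporaryDirectory() as tmp:
    vendor_file = Path(tmp) / "product_name"
    vendor_file.write_text("Alibaba Cloud ECS\n")
    # File check succeeds, so the metadata check is never awaited.
    on_alibaba = asyncio.run(identify(vendor_file, "Alibaba Cloud", fake_metadata_check))
    calls_after_hit = list(calls)
    # File is missing, so detection falls through to the metadata check.
    elsewhere = asyncio.run(identify(Path(tmp) / "missing", "Alibaba Cloud", fake_metadata_check))
```

On a host where the file check fails, as on a laptop or an unsupported cloud, every run reaches the fallback, which is where the long wait in this issue comes from.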

Even though Alibaba's URL seems public, it's no different from the others in practice. Except for GCP's, all the other URLs take around 2 minutes before curl shows "Connection timed out".

When I worked on the asyncio changes I put a timeout of 5s (😄) in my first commit. After a lot of contemplation I decided to remove it for two reasons (at that time):

I was not aware of "cloud-init", thanks for sharing. I skimmed through their code and found that there are different timeouts for different vendors; I saw 30s for one. I couldn't find the maximum. Also, while checking the commit history, I found that they have updated some of the timeouts a couple of times. I am sure the current timeouts are going to be updated again in the future.

Let's come back to this discussion when we have more users or if more people ask for this feature.

derekjc commented 1 year ago

@kshivakumar sounds good to me!