dgzlopes / cloud-detect

Module that determines a host's cloud provider.
https://pypi.org/project/cloud-detect/
MIT License

Detect specific supported cloud providers #22

Open dwc0011 opened 1 year ago

dwc0011 commented 1 year ago

Cloud detect can detect multiple cloud providers. However, some applications support only a few of them, and trying to detect the others wastes time and resources. I propose accepting a list of the cloud providers that a user wants to detect.

I have attached sample code below, but I would be more than willing to create a PR to implement this if you are open to this feature request.

For example, an application may support running on AWS, Azure, or GCP but nothing else. The application only cares whether it is running on one of those, so attempting to detect the others makes no sense for that application.

A possible solution that keeps it dynamic would be to modify the cloud_detect/__init__.py module so that usage looks like the sample script, with __init__.py looking like the code below the sample script.

# sample.py
from cloud_detect import provider

def do_aws_work(p_id):
    print(p_id == "aws")

def do_azure_work(p_id):
    print(p_id == "azure")

def do_gcp_work(p_id):
    print(p_id == "gcp")

def error(p_id):
    print("Error unknown id = " + p_id)

# Map each provider the app supports to its handler.
MY_APP_PROVIDERS = {
    'aws': do_aws_work,
    'azure': do_azure_work,
    'gcp': do_gcp_work,
}

def detect_env():
    only_these = list(MY_APP_PROVIDERS)
    # Pass by keyword: the proposed signature is provider(timeout=None, providers=None).
    provider_id = provider(providers=only_these)

    MY_APP_PROVIDERS.get(provider_id, error)(provider_id)

# cloud_detect/__init__.py
__PROVIDER_CLASSES = {
    AlibabaProvider.identifier: AlibabaProvider,
    AWSProvider.identifier: AWSProvider,
    AzureProvider.identifier: AzureProvider,
    DOProvider.identifier: DOProvider,
    GCPProvider.identifier: GCPProvider,
    OCIProvider.identifier: OCIProvider,
}

async def _identify(timeout, providers=None):
    # Default to every known provider when no subset is requested.
    if not providers:
        providers = list(__PROVIDER_CLASSES)
    ......
    ......
    # Schedule only the requested providers, skipping unknown identifiers.
    tasks = {
        __PROVIDER_CLASSES[p_id].identifier: asyncio.ensure_future(wrapper(__PROVIDER_CLASSES[p_id]))
        for p_id in providers
        if p_id in __PROVIDER_CLASSES
    }
    ......

def provider(timeout=None, providers=None):
    .....
    .....
    if py_version.minor >= 7:
        result = asyncio.run(_identify(timeout, providers))
    else:
        loop = asyncio.new_event_loop()
        result = loop.run_until_complete(_identify(timeout, providers))
        loop.close()
    return result
........

kshivakumar commented 1 year ago

@dwc0011

> ...trying to detect the others is a waste of time and resources

There's no waste of time, since all the providers are checked concurrently. When the correct provider is identified, the other checks are cancelled immediately. I don't think any application that uses this library will check the cloud provider every other minute or hour; making a few additional HTTP requests once in a while is not a waste of resources. Moreover, by not having to pass my_providers to the provider function, we keep our API simple.

The only case where the library takes considerable time to respond is when the environment is not one of the supported cloud providers. See this comment for additional info - https://github.com/dgzlopes/cloud-detect/pull/12#discussion_r787572375. When the environment is supported, the response is almost instantaneous.
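
The behaviour described above (concurrent checks, cancellation on the first match, and a slow answer only when nothing matches) can be illustrated with a small asyncio sketch. This is not the library's code: `fake_probe`, `identify`, `timed_detect`, the provider names, and the delays are all hypothetical stand-ins, with `asyncio.sleep` simulating the metadata requests.

```python
import asyncio
import time


async def fake_probe(name, delay, is_match):
    """Hypothetical stand-in for one provider's metadata check."""
    await asyncio.sleep(delay)
    return name if is_match else None


async def identify(probes, timeout):
    """Return the first matching provider name, or "unknown"."""
    pending = {asyncio.ensure_future(p) for p in probes}
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    try:
        while pending:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            done, pending = await asyncio.wait(
                pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.result() is not None:
                    return task.result()
        return "unknown"
    finally:
        # Cancel whatever is still running once a match is found or time is up.
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)


def timed_detect(match_name=None):
    """Run six fake probes; a matching probe answers fast, the rest slowly."""
    names = ["alibaba", "aws", "azure", "do", "gcp", "oci"]  # assumed identifiers
    probes = [
        fake_probe(n, 0.01 if n == match_name else 0.2, n == match_name)
        for n in names
    ]
    start = time.perf_counter()
    result = asyncio.run(identify(probes, timeout=0.5))
    return result, time.perf_counter() - start
```

Calling `timed_detect("aws")` returns as soon as the fast probe answers, while `timed_detect(None)` has to wait for every probe to finish or for the timeout, which is the unsupported-environment case discussed in this thread.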

dwc0011 commented 1 year ago

@kshivakumar

> Moreover, by not having to pass my_providers to the provider function we keep our api simple.

I understand that you wish to keep the API simple, but I also see the need to avoid checking for providers that will never be present. An optional param still seems pretty simple to me. Heck, this feature was even part of the API up until April, just as the opposite: an "excluded" param.

> The only case where the library takes considerable time to respond is when the environment is not one of the supported cloud providers.

This is the exact case I am trying to prevent. We wanted to use it in our application, which quite often runs in environments that are not supported. The more cloud providers it has to check, the more resources and time are wasted, even if the checks are concurrent.
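
To make the resource argument concrete, here is a minimal sketch that counts how many probes are launched with and without a provider filter when nothing matches. It only approximates the proposal: `STUB_IDS`, `stub_probe`, `identify`, and `count_probes` are hypothetical names, the identifier strings are assumed, and the stub replaces the library's real provider classes. With concurrent checks the elapsed time stays close to that of the slowest probe, but the number of requests still grows with the number of providers checked.

```python
import asyncio

# Assumed identifier strings; the real values live on the provider classes.
STUB_IDS = ("alibaba", "aws", "azure", "do", "gcp", "oci")


async def stub_probe(identifier, launched):
    """Record that a probe ran, then report 'not this provider'."""
    launched.append(identifier)
    await asyncio.sleep(0.01)  # simulated metadata request
    return False


async def identify(providers=None):
    """Probe only the requested providers; unknown identifiers are skipped."""
    selected = [p for p in (providers or STUB_IDS) if p in STUB_IDS]
    launched = []
    results = await asyncio.gather(*(stub_probe(p, launched) for p in selected))
    match = next((p for p, ok in zip(selected, results) if ok), "unknown")
    return match, launched


def count_probes(providers=None):
    """Return how many probes were launched for an unsupported environment."""
    _, launched = asyncio.run(identify(providers))
    return len(launched)
```

Here `count_probes()` launches one probe per known provider, while `count_probes(["aws", "azure", "gcp"])` launches only the three that the application cares about.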

I appreciate you responding and hope that you reconsider, but I understand either way. Feel free to close if your decision is final. Thanks.