coiled / feedback

A place to provide Coiled feedback

No region found during coiled setup aws #248

Closed aterrel closed 11 months ago

aterrel commented 11 months ago

(this is matt from andy's computer)

$ coiled setup aws --profile storyfit-dev          

This uses your AWS credentials to setup Coiled.             

╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Current local AWS credentials:  user/andy                                                                                                        │
│ Proposed region for Coiled:     us-east-1       (use coiled setup aws --region to change)                                                        │
│ Proposed account for Coiled:    202150433175                                                                                                     │
│                                                                                                                                                  │
│ If this is not correct then please stop and select a different profile from your AWS credentials file using the coiled setup aws --profile       │
│ argument.                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Attempting to create/update the following resources in your AWS account:                                                                            
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Create IAM User:        coiled                                                                                                                   │
│   and create Access Key for this new IAM User                                                                                                    │
│                                                                                                                                                  │
│ Create IAM Policy:      coiled-setup                                                                                                             │
│   and attach to IAM User coiled                                                                                                                  │
│                                                                                                                                                  │
│ Create IAM Policy:      coiled-ongoing                                                                                                           │
│   and attach to IAM User coiled                                                                                                                  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Documentation for IAM Policies at https://docs.coiled.io/user_guide/aws_configure.html#create-iam-policies

Proceed with IAM setup for Coiled? [y/n] (y): 

The following resources were created in your AWS account:
  arn:aws:iam::202150433175:user/coiled
  arn:aws:iam::202150433175:policy/coiled-setup
  arn:aws:iam::202150433175:policy/coiled-ongoing
IAM User coiled is now setup with IAM Policies attached.
Traceback (most recent call last):
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/bin/coiled", line 8, in <module>
    sys.exit(cli())
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/coiled/cli/setup/aws.py", line 112, in aws_setup
    if not do_setup(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/coiled/cli/setup/aws.py", line 958, in do_setup
    return do_full_setup(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/coiled/cli/setup/aws.py", line 757, in do_full_setup
    check_quotas(session, region=region)
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/coiled/cli/setup/aws.py", line 1100, in check_quotas
    quota_client = session.client("service-quotas")
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/boto3/session.py", line 299, in client
    return self._session.create_client(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/session.py", line 976, in create_client
    client = client_creator.create_client(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/client.py", line 155, in create_client
    client_args = self._get_client_args(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/client.py", line 485, in _get_client_args
    return args_creator.get_client_args(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/args.py", line 92, in get_client_args
    final_args = self.compute_client_args(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/args.py", line 205, in compute_client_args
    endpoint_config = self._compute_endpoint_config(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/args.py", line 313, in _compute_endpoint_config
    return self._resolve_endpoint(**resolve_endpoint_kwargs)
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/args.py", line 418, in _resolve_endpoint
    return endpoint_bridge.resolve(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/client.py", line 590, in resolve
    resolved = self.endpoint_resolver.construct_endpoint(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/regions.py", line 229, in construct_endpoint
    result = self._endpoint_for_partition(
  File "/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/botocore/regions.py", line 277, in _endpoint_for_partition
    raise NoRegionError()
botocore.exceptions.NoRegionError: You must specify a region.

Config for profile:

$ cat ~/.aws/config                                    
[default]
region = us-east-1

[storyfit-dev]
region = us-east-1
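
One thing that may be worth checking, though nothing in this thread confirms it as the cause: the AWS CLI documentation names non-default profiles in ~/.aws/config as [profile name], while the bare [name] form belongs to ~/.aws/credentials. The sketch below only applies that documented naming rule to a config file's text; region_for_profile is a hypothetical helper, not part of coiled, boto3, or the AWS CLI.

```python
import configparser


def region_for_profile(config_text, profile):
    """Look up a profile's region using the documented ~/.aws/config naming.

    Hypothetical helper for illustration only. It assumes the convention
    that every profile except "default" lives in a "[profile NAME]" section.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = "default" if profile == "default" else f"profile {profile}"
    if parser.has_section(section):
        return parser.get(section, "region", fallback=None)
    # A bare "[NAME]" section is not matched under this convention.
    return None


# The config exactly as posted above.
posted = "[default]\nregion = us-east-1\n\n[storyfit-dev]\nregion = us-east-1\n"
# The same config with the documented "profile " prefix.
prefixed = "[default]\nregion = us-east-1\n\n[profile storyfit-dev]\nregion = us-east-1\n"

print(region_for_profile(posted, "storyfit-dev"))
print(region_for_profile(prefixed, "storyfit-dev"))
```

If the first lookup comes back empty and the second does not, renaming the section to [profile storyfit-dev] may be enough for the region to be picked up from the profile.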

aterrel commented 11 months ago

$ aws --version
aws-cli/2.11.11 Python/3.11.3 Darwin/22.5.0 source/arm64 prompt/off

ntabris commented 11 months ago

Hm, haven't seen this before but AWS libraries often give us new fun edge cases.

If you feel like hacking things, I suspect that changing line 1100 of

/Users/aterrel/miniconda3/envs/storyfit_models_pipeline/lib/python3.10/site-packages/coiled/cli/setup/aws.py

to

    quota_client = session.client("service-quotas", region_name=region)

would fix this (and I have a PR to make this change).
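
For anyone hitting the same traceback in their own boto3 code, the one-line change above can be sketched as a small wrapper. This is a minimal illustration and not the actual coiled source: make_quota_client and demo are hypothetical names, and the profile and region are the ones from this issue.

```python
def make_quota_client(session, region):
    """Return a Service Quotas client pinned to an explicit region.

    Hypothetical helper; `session` is expected to behave like a
    boto3.session.Session.
    """
    if not region:
        raise ValueError("a region is required, e.g. 'us-east-1'")
    # Passing region_name explicitly means botocore does not have to
    # discover a region from the profile, the environment, or ~/.aws/config.
    return session.client("service-quotas", region_name=region)


def demo():
    # Not called automatically: needs boto3 installed and the profile
    # from this issue present in the local AWS credentials.
    import boto3

    session = boto3.session.Session(profile_name="storyfit-dev")
    return make_quota_client(session, "us-east-1")
```

Creating the client this way does not depend on botocore finding a region on its own, which is the step that raised NoRegionError in the traceback.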

aterrel commented 11 months ago

Thanks for the quick response. Sam got the install to work with the command-line flag. I'm happy to test your PR, but I'm not sure of the steps to do so.

ntabris commented 11 months ago

Glad you got something working!

If you don't mind testing the change I made, you could pip install coiled==0.8.9.dev6 and run coiled setup aws --quotas and see if you get an error.

aterrel commented 11 months ago

Seems to work for me.

$ coiled setup aws --quotas
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Coiled runs in your AWS account, so your cluster sizes will be constrained by your AWS quotas:                                                   │
│                                                                                                                                                  │
│ Standard On-Demand                    4512 vCPU                                                                                                  │
│ Standard Spot                         1152 vCPU                                                                                                  │
│ G4dn (NVIDIA T4 GPU) On-Demand         920 vCPU                                                                                                  │
│ G4dn (NVIDIA T4 GPU) Spot               64 vCPU                                                                                                  │
│ P (NVIDIA V100/A100 GPU) On-Demand     692 vCPU                                                                                                  │
│ P (NVIDIA V100/A100 GPU) Spot           64 vCPU                                                                                                  │
│                                                                                                                                                  │
│ Standard includes general purpose M and T families (e.g., M6i, T3), compute optimized C families (e.g., C6i), and memory optimized R families    │
│ (e.g., R6i).                                                                                                                                     │
│                                                                                                                                                  │
│ GPU instances require a separate quota. G4dn is our default GPU instance type on AWS, these have an NVIDIA T4 GPU.                               │
│                                                                                                                                                  │
│ The default Coiled instance type (t3.xlarge) has 4 vCPUs, so a 10 worker cluster (plus scheduler) would have 44 vCPUs.                           │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Your current quota for Standard On-Demand (us-east-1) is 4512. If you'd like to request an increase from AWS, enter the new number (or hit return to
skip): (4512): 
Your current quota for Standard Spot (us-east-1) is 1152. If you'd like to request an increase from AWS, enter the new number (or hit return to 
skip): (1152): 
Your current quota for G4dn (NVIDIA T4 GPU) On-Demand (us-east-1) is 920. If you'd like to request an increase from AWS, enter the new number (or 
hit return to skip): (920): 
Your current quota for G4dn (NVIDIA T4 GPU) Spot (us-east-1) is 64. If you'd like to request an increase from AWS, enter the new number (or hit 
return to skip): (64): 
Your current quota for P (NVIDIA V100/A100 GPU) On-Demand (us-east-1) is 692. If you'd like to request an increase from AWS, enter the new number 
(or hit return to skip): (692): 
Your current quota for P (NVIDIA V100/A100 GPU) Spot (us-east-1) is 64. If you'd like to request an increase from AWS, enter the new number (or hit 

ntabris commented 11 months ago

@aterrel thanks for confirming that the fix worked for you!