brimdata / zed

A novel data lake based on super-structured data
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

Improve guidance to user when there's missing AWS config #904

Open philrz opened 4 years ago

philrz commented 4 years ago

As a user trying out our AWS features, I stumbled into at least one behavior that appeared at first like it could be a bug on our part. Closer inspection of the docs for the AWS SDK for Go indicates it's actually a quirk of that SDK and hence not technically "our problem". However, as we rely on the SDK, we're indirectly responsible for its imperfect behaviors, so we might stand to make improvements here. It probably makes sense to prioritize the effort relative to the user uptake on these AWS features.

Let's say I start out on a Linux VM that has no existing AWS tools installed. Unsurprisingly, my first attempt to use an AWS feature in zq (commit 8820a88 at the moment) comes back with an error message.

$ echo '{"foo": "bar"}' | zq -o s3://zq-771/foo.zng -
MissingRegion: could not find region configuration

For starters, it might have helped here if this error (and any other errors we may bubble up from the AWS SDK) were somehow labeled as being specific to AWS. That said, any user trying to use an AWS feature is certain to be familiar with the importance of the Region setting and hence will know what this error is about, so this is not a crisis.

Now let's say the error has made me hip to the absence of base AWS config on my VM. I install and configure the familiar AWS CLI tools.

$ sudo apt install awscli
...
The following NEW packages will be installed:
  awscli docutils-common python3-botocore python3-colorama python3-docutils
  python3-jmespath python3-pyasn1 python3-pygments python3-roman python3-rsa
  python3-s3transfer sgml-base xml-core
...
Setting up awscli (1.14.44-1ubuntu1) ...

$ aws configure
AWS Access Key ID [None]: XXXXXXXXXXXXX
AWS Secret Access Key [None]: XXXXXXXXXXXXX
Default region name [None]: us-west-1
Default output format [None]: 

$ aws s3 ls s3://zq-771
2020-06-12 13:06:01         14 hello.txt

My base AWS CLI config was successful, as evidenced by the fact that the ls on my bucket succeeded. zq help doesn't mention anything about AWS config, so as a user I'm led to assume that, like many other AWS tools, zq will derive its config from the contents of my $HOME/.aws directory. Alas, my zq command line still does not work.

$ echo '{"foo": "bar"}' | zq -o s3://zq-771/foo.zng -
MissingRegion: could not find region configuration

Now, as a contributor to the Brim projects (i.e. not just a user), I happen to know that zq relies on the AWS SDK for Go. In the SDK docs' section on "Specifying the AWS Region", .aws/config is not listed among the places the SDK looks automatically. The docs do mention, however, that the SDK will start looking there if you set AWS_SDK_LOAD_CONFIG=true. Sure enough:

$ export AWS_SDK_LOAD_CONFIG=true
$ echo '{"foo": "bar"}' | zq -o s3://zq-771/foo.zng -
$ zq -t s3://zq-771/foo.zng
#0:record[foo:string]
0:[bar;]

This seemed strange to me, since the section on "Specifying Credentials" indicates that the SDK will look in $HOME/.aws/credentials even without any special environment variables having been set. For instance, if I unset AWS_SDK_LOAD_CONFIG and specify the Region using a different environment variable, this also works, indicating it was still finding & using the credentials without a problem.

$ unset AWS_SDK_LOAD_CONFIG
$ echo '{"foo": "bar"}' | zq -o s3://zq-771/foo.zng -
MissingRegion: could not find region configuration
$ export AWS_REGION=us-west-1
$ zq -t s3://zq-771/foo.zng
#0:record[foo:string]
0:[bar;]
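
The lookups observed above can be summarized in a small diagnostic. This is only a sketch of the behavior seen in these transcripts, not the SDK's actual resolution logic: regionSource is a hypothetical helper, the use of strconv.ParseBool to interpret AWS_SDK_LOAD_CONFIG is an assumption, and it ignores other inputs (for example, a config file that exists but has no region in it).

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// regionSource is a hypothetical helper that models where the Region would
// come from, based only on the behavior observed in this issue.
func regionSource(getenv func(string) string, configExists bool) string {
	if getenv("AWS_REGION") != "" {
		return "AWS_REGION environment variable"
	}
	// Assumption: AWS_SDK_LOAD_CONFIG is interpreted as a boolean-like value.
	loadConfig, _ := strconv.ParseBool(getenv("AWS_SDK_LOAD_CONFIG"))
	if loadConfig && configExists {
		return "shared config file"
	}
	return "none (expect MissingRegion)"
}

func main() {
	configExists := false
	if home, err := os.UserHomeDir(); err == nil {
		// Only checks that the file is present, not that it sets a region.
		_, statErr := os.Stat(filepath.Join(home, ".aws", "config"))
		configExists = statErr == nil
	}
	fmt.Println("region source:", regionSource(os.Getenv, configExists))
}
```

Run on the VM before and after each export above, something like this would report which of the three cases applies.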

Therefore, I'm of the opinion that this is a hazard that other users may run into.

It seems like we could do one of a few things:

  1. We could implement some new flag like -aws-region and point the user at it in the error message

  2. Instead of implementing a flag, we could point them at the environment variable options. It might be easiest to just mention AWS_REGION, since if we mention AWS_SDK_LOAD_CONFIG it leaves open the possibility that they might set the variable but still be missing a $HOME/.aws/config file, whereas just putting a bad Region in AWS_REGION gets a more helpful error:

$ export AWS_REGION="foo-region"
$ echo '{"foo": "bar"}' | zq -o s3://zq-771/bar.zng -
RequestError: send request failed
caused by: Put "https://zq-771.s3.foo-region.amazonaws.com/bar.zng": dial tcp: lookup zq-771.s3.foo-region.amazonaws.com: no such host

  3. We could punt on significant code changes and instead return a URL to some reference alongside the error messages. One approach might be to link to the AWS SDK for Go docs (IMO this is ok, but not great, since the SDK itself still seems quirky as described above) or write an article on our Wiki that links to the AWS SDK for Go docs, explains how we rely on it, and walks through examples of how to make sure their config is set correctly (better, IMO, since that way we can show a "happy path" that avoids the quirks).

alfred-landrum commented 4 years ago

I wanted to mention another option: the session.NewSessionWithOptions function exists in the AWS SDK for Go, and supports a parameter SharedConfigState: https://docs.aws.amazon.com/sdk-for-go/api/aws/session/#hdr-Creating_Sessions

If passed SharedConfigState: session.SharedConfigEnable, it would always act as if AWS_SDK_LOAD_CONFIG=true were set in the user's environment. I'd advocate for this, as it at least attempts to load all of the config, including the region setting.

philrz commented 3 years ago

I can verify that as of GA zq tagged v0.22.0, having completed the base AWS CLI config is now enough to make this work.

$ echo '{"foo": "bar"}' | zq -o s3://brim-scratch/foo.zng -
$ zq -t s3://brim-scratch/foo.zng
#0:record[foo:string]
0:[bar;]

As noted by @alfred-landrum in #1109 when this improvement was added:

This won't close [this issue], as it doesn't improve the error if a user still doesn't have their config in place, but should help avoid users running into it the first time they try to use S3 objects.

To put it another way, the error MissingRegion: could not find region configuration when AWS config is completely missing is still not as descriptive as it could be.

croteb commented 2 years ago

If you have a complex AWS config with, for example, MFA, you will get a panic: panic: AssumeRoleTokenProviderNotSetError: assume role with MFA enabled, but AssumeRoleTokenProvider session option not set.