aws-samples / connected-drink-dispenser-workshop

Code and walk-through to assemble, program and build a multi-user Amazon FreeRTOS and AWS IoT enabled drink dispenser.
MIT No Attribution

Some bugs when running cdk synth #11

Closed gytelek closed 4 years ago

gytelek commented 4 years ago

I found the following bugs when running cdk synth:

gadams999 commented 4 years ago

Hi, the hard-coded us-east-1 region is used by CloudFront (see the AWS Region that You Request a Certificate In section of the documents). Your AWS CLI profile's region is the one that should be used. I have tested (ran the workshops) in us-west-2, so other regions should be fine as long as the ACM certificate resides in us-east-1.
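
As a rough illustration of the region constraint described above, the sketch below checks whether an ACM certificate ARN lives in us-east-1 before it is handed to CloudFront. The function names and the sample ARNs are hypothetical and are not part of the workshop code.

```python
# Region in which CloudFront expects its ACM certificate to reside.
CLOUDFRONT_CERT_REGION = "us-east-1"


def acm_certificate_region(cert_arn: str) -> str:
    """Return the region field of an ACM ARN.

    ARNs have the shape arn:partition:service:region:account:resource.
    """
    parts = cert_arn.split(":")
    if len(parts) < 6 or parts[0] != "arn" or parts[2] != "acm":
        raise ValueError(f"not an ACM certificate ARN: {cert_arn!r}")
    return parts[3]


def usable_by_cloudfront(cert_arn: str) -> bool:
    """True when the certificate is in the region CloudFront requires."""
    return acm_certificate_region(cert_arn) == CLOUDFRONT_CERT_REGION


# Made-up ARNs, for illustration only.
print(usable_by_cloudfront("arn:aws:acm:us-east-1:123456789012:certificate/example"))
print(usable_by_cloudfront("arn:aws:acm:us-west-2:123456789012:certificate/example"))
```

The stack itself can still be deployed in another region (us-west-2 in the test mentioned above); only the certificate is tied to us-east-1.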

I think there was a breaking change affecting static_site_construct.py in CDK 1.18.0 or later. I'll review and correct it using the latest version of CDK (1.19.0) in the next couple of days. For now, you can try installing CDK 1.17.0 via npm and test with that.

gytelek commented 4 years ago

Hi Gavin,

I managed to deploy the cdd stack; the next step is to deploy the application. Regarding the two issues I listed in the first comment: 1) As you said, it is not a bug; it is needed for CloudFront, so I changed it back to us-east-1. 2) I didn't downgrade to CDK 1.17. I simply removed the line with the invalid parameter and it worked with CDK 1.19. Thank you for your support.

gadams999 commented 4 years ago

Hi @gytelek ,

I've confirmed there is a change in CDK 1.18.0 to 1.19.0 and above. The origin_access_identity_id name changed, but there are also some new changes where scope constructs have been added. In the meantime I've committed to the master branch a change that will pin the CDK version to 1.18.0 for the Python packages.

When you do a pip install -r requirements.txt, there are errors shown about inconsistent versions of dependent packages, and a pip list will show some packages at 1.18.0, and others at 1.20.0. However, a cdk synth or cdk deploy will work.
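
To make the mixed-version situation described above easier to spot, here is a minimal sketch that groups installed aws-cdk packages by version. The package names and versions in `sample` are illustrative only; in practice the mapping could be built from `pip list` output or from `importlib.metadata`.

```python
from collections import defaultdict


def group_cdk_versions(installed: dict) -> dict:
    """Map each version to the sorted aws-cdk package names installed at it."""
    by_version = defaultdict(list)
    for name, version in installed.items():
        if name.startswith("aws-cdk."):
            by_version[version].append(name)
    return {version: sorted(names) for version, names in by_version.items()}


# Sample data for illustration; not taken from the workshop's requirements.txt.
sample = {
    "aws-cdk.core": "1.18.0",
    "aws-cdk.aws-s3": "1.18.0",
    "aws-cdk.aws-iam": "1.20.0",
    "boto3": "1.10.0",
}

groups = group_cdk_versions(sample)
if len(groups) > 1:
    print("Mixed CDK versions found:", groups)
```

More than one key in the result means the CDK packages are not all pinned to the same release.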

The Origin Access Identity is needed by CloudFront to access the S3 bucket (it is locked down by default), so if you have the time, I'd suggest:

  1. Destroy the current stack
  2. Delete the virtualenv
  3. Pull from the master branch the latest changes
  4. Clear out the cdk.out directory
  5. cdk synth to test, then cdk deploy

I'll keep this open until we can resolve the pinned versions of CDK and update the Python code to reflect the latest CDK changes.

gytelek commented 4 years ago

OK, I am doing what you suggested, but it will take some time (I do this in my free time, not as part of my job). I will get back to you once I have managed to deploy the stack again.

gytelek commented 4 years ago

Hi,

I managed to deploy the stack again; it didn't take too much time. Now I am trying to deploy the app (from a Cloud9 environment), but it fails:

(.env) ec2-user:~/environment/connected-drink-dispenser-workshop/deploy (master) $ python3 deploy_app.py
Verifying local configuration files
Reading CloudFormation stack parameters to create files for web application
Clearing S3 bucket of ALL objects
yarn install v1.21.1
[1/4] Resolving packages...
[2/4] Fetching packages...
warning sha.js@2.4.11: Invalid bin entry for "sha.js" (in "sha.js").
warning url-loader@1.1.2: Invalid bin field for "url-loader".
info fsevents@1.2.9: The platform "linux" is incompatible with this module.
info "fsevents@1.2.9" is an optional dependency and failed compatibility check. Excluding it from installation.
info fsevents@2.1.2: The platform "linux" is incompatible with this module.
info "fsevents@2.1.2" is an optional dependency and failed compatibility check. Excluding it from installation.
[3/4] Linking dependencies...
warning " > sass-loader@7.3.1" has unmet peer dependency "webpack@^3.0.0 || ^4.0.0".
warning " > vuetify-loader@1.4.3" has unmet peer dependency "webpack@^4.0.0".
[4/4] Building fresh packages...
Done in 32.78s.
yarn run v1.21.1
$ vue-cli-service build

error Command failed with signal "SIGKILL".
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
Copying Single page web application to S3: 0 files [00:00, ? files/s]
Copying Credential C formatter page to S3: 6 files [00:00, 90.80 files/s]
Copying Online documentation to S3: 446 files [00:12, 34.33 files/s]

Do you have any idea what the problem is? I am not familiar with vue-cli-service at all.

gadams999 commented 4 years ago

> I managed to deploy the stack again, it didn't take too much time. Now I am trying to deploy the app (from a Cloud9 environment) but it fails...

Let me run through the deploy script and see what's up.

gadams999 commented 4 years ago

Hi,

I was able to correctly build and deploy via Cloud9. Then I saw that I missed a push to the repo that resolves the CDK 1.20.0 changes. With that I also turned off the error checks for console.* messages (we use these in the workshop to give more details in the Browser).

The deployment steps should work fine and with fewer errors now. If you could do a git pull to update the local Cloud9 repo and then run python3 deploy_app.py again, let me know what you see.

Oh, and thank you for working on this outside your normal work hours. We appreciate the feedback and anything that helps make the process easier!

gytelek commented 4 years ago

Hi,

I didn't delete the existing stack; I just did a git pull in the Cloud9 environment and then ran python3 deploy_app.py again. I got the same error message again: vue-cli-service build error Command failed with signal "SIGKILL". Should I have destroyed and recreated the CloudFormation stack before running deploy_app?

gytelek commented 4 years ago

And regarding my free time: I do it with pleasure, I find the combination of embedded devices and AWS very interesting :-) Unfortunately I couldn't finish the workshop in Las Vegas and I would like to complete it. And I can occasionally do something in the office e.g. if I have to wait for the build to be completed ...

gadams999 commented 4 years ago

Hi,

> I didn't delete the existing stack, I made just a git pull in the Cloud9 environment and then I started python3 deploy_app.py again. I got the same error message again: vue-cli-service build error Command failed with signal "SIGKILL". Should I have destroyed and recreated the cloud formation stack before deploy_app?

Hmmm, you shouldn't have to do anything special. Could you check the contents of the src/aws-exports.js file and ensure all the values look good and are complete? I did this with a default Cloud9 instance (t2.micro, Amazon Linux). Just to ensure there isn't a mismatch someplace, could you delete the stack, recreate it from the latest master branch, and then try the deploy again? Also, I've seen comments that SIGKILL can be memory related, so you might try a t3.small instance for the additional memory.

We'll get this sorted out.

gytelek commented 4 years ago

Good news: I managed to deploy the application, and the website is now available. The problem was that my Cloud9 environment used a t3.nano, which has only 0.5 GB of memory. That was not enough for the build process, which is why the build was killed.

Now I can start the workshop again as a participant. I can hardly wait to play with the MCU :-)
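
A quick pre-flight check along these lines could catch an undersized instance before the build starts. This is only a sketch: it parses /proc/meminfo-style text, and the 900 MiB threshold is an assumption chosen because the thread reports a failure with 0.5 GB and a successful build on a t2.micro (1 GiB); it is not a documented requirement of vue-cli-service.

```python
# Hypothetical threshold; see the note above on how it was chosen.
MIN_BUILD_MIB = 900


def mem_total_mib(meminfo_text: str) -> int:
    """Parse the MemTotal line of /proc/meminfo text and return whole MiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB
            return kib // 1024
    raise ValueError("MemTotal line not found")


def enough_memory_for_build(meminfo_text: str, minimum_mib: int = MIN_BUILD_MIB) -> bool:
    """True when total memory meets the (assumed) minimum for the web build."""
    return mem_total_mib(meminfo_text) >= minimum_mib


# Illustrative input resembling a 0.5 GB instance.
print(enough_memory_for_build("MemTotal:         474000 kB\nMemFree: 100000 kB\n"))
```

On a Linux host such as a Cloud9 instance, the real input would come from reading /proc/meminfo.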

One note: I had to delete the '[3:]' in line 72 of cdk_app.py because my registered domain doesn't have www. But I think it is a problem with my registration.

Thank you for your support.

gytelek commented 4 years ago

Hi Gavin,

I am back again :-( I hoped everything would work fine, but unfortunately it does not.

I tried to create a new user: SMS verification works fine, but the site stays stuck at "Loading resources". I can see the new user in the UserTable in state CREATING, but I cannot see any new CloudFormation stack or Cloud9 environment.

gadams999 commented 4 years ago

Check the CloudWatch log for the ApiGetResources function. WARNING messages are fine (they are an artifact of waiting for the IAM user to be seen by Cloud9), but is there an ERROR message?
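
When the log is long, separating ERROR entries from the expected WARNING entries can be scripted. The sketch below assumes each entry starts with a bracketed level tag such as [WARNING] or [ERROR]; the sample lines are placeholders, not real log output.

```python
import re

# Matches a leading tag such as "[ERROR]" or "[WARNING]".
LOG_LEVEL = re.compile(r"^\[(?P<level>[A-Z]+)\]")


def lines_at_level(log_lines, level="ERROR"):
    """Return the log lines whose leading [LEVEL] tag equals `level`."""
    matches = []
    for line in log_lines:
        found = LOG_LEVEL.match(line)
        if found and found.group("level") == level:
            matches.append(line)
    return matches


# Placeholder entries for illustration.
sample = [
    "[WARNING] retrying while the IAM user propagates",
    "[ERROR] something actually failed",
    "START RequestId: placeholder",
]
for entry in lines_at_level(sample):
    print(entry)
```

The same function can be called with `level="WARNING"` to review the retry messages separately.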

gytelek commented 4 years ago

I found the following in the CloudWatch log; it seems to be relevant:

[WARNING] 2020-01-10T13:54:07.859Z 51c67b90-03df-4228-9ded-66a25b30c06c Error calling iam.get_account_password_policy() (will retry) for user gytelek, error: An error occurred (NoSuchEntity) when calling the GetAccountPasswordPolicy operation: The Password Policy with domain name 502140257954 cannot be found.

Retried many times then timed out.

The website (dispenser_app) shows the AWS IoT endpoint for my account but no AWS Console details, an empty private key file, and an empty Amazon Root CA 1 file.

gytelek commented 4 years ago

Today I deleted the stack and recreated it. There was no problem deploying the stack and the application. I assumed that before doing anything else I should log in as admin in order to create further resources (e.g. the user pool). Unfortunately I couldn't log in as admin; it got stuck at "Loading resources" (last time I forgot to say that I experienced the same problem before trying to create a new user). I can see the admin user in the UserTable, and I can see the user pool containing the admin user in Cognito. In the CloudWatch logs of the ApiGetResources Lambda I found the following error message:

[ERROR] KeyError: 'custom:dispenserId'
Traceback (most recent call last):
  File "/var/task/get_resources.py", line 52, in handler
    dispenser_id = event["requestContext"]["authorizer"]["claims"]["custom:dispenserId"]

Should the application send a dispenserId when no dispenser exists?

gadams999 commented 4 years ago

Ah, for now there is no need to login as the admin user. The intent was to add functionality where admin could manage resources within the app. Could you try registering a new user and see if the resources are created there? If not (hangs on "loading resources"), then the CloudWatch logs for that user would be of interest.

For now I'm going to add logic to the app and API calls to not attempt resource creation for the admin user at this point.
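
One way such a guard could look is sketched below: the claim is read with `.get()` so that a token without `custom:dispenserId` (as in the admin login traceback earlier in the thread) yields `None` instead of a `KeyError`. This is not the fix that was committed to the repository; the response shape and the skip behaviour are assumptions for illustration.

```python
import json


def dispenser_id_from_event(event: dict):
    """Return the custom:dispenserId claim, or None when the token lacks it."""
    claims = (
        event.get("requestContext", {})
        .get("authorizer", {})
        .get("claims", {})
    )
    return claims.get("custom:dispenserId")


def handler(event, context):
    """Hypothetical handler that skips resource creation without a dispenser."""
    dispenser_id = dispenser_id_from_event(event)
    if dispenser_id is None:
        # No dispenser is attached to this user, so there is nothing to create.
        return {"statusCode": 200, "body": json.dumps({"resources": None})}
    # Resource lookup or creation for the dispenser would go here.
    return {"statusCode": 200, "body": json.dumps({"dispenserId": dispenser_id})}
```

Calling `handler` with an event that has no claims returns the empty response rather than raising.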

gytelek commented 4 years ago

Hi Gavin,

I experienced the same as before. More details:

[WARNING] 2020-01-15T10:35:35.252Z e4a61eab-e5c5-4072-8298-17dd1d2aeec9 Error calling iam.get_account_password_policy() (will retry) for user cdduser, error: An error occurred (NoSuchEntity) when calling the GetAccountPasswordPolicy operation: The Password Policy with domain name 502140257954 cannot be found.

I can sign in as the new user, but the AWS Console details are empty (all three entries). The AWS IoT details show an endpoint, which seems to be valid.

gadams999 commented 4 years ago

That error is helpful. You shouldn't need to log in with the admin user to create resources, those are only needed for the users themselves. What errors did you see when trying to create a new user without signing in as admin first?

I see where the problem resides: the iam.get_account_password_policy() call assumes there is an account password policy in place, because I set one up years ago in my own accounts. I'll create a new issue and address the TODO in the create_resources.py code to check for an existing policy and, if one doesn't exist, create one based on the workshop's settings.

In the meantime, you can correct by logging into the console, IAM->Account Settings->Set password policy, then create a policy (the workshop uses min length=6, require symbols=false, require numbers=true, require upper=false, require lower=true, and allow users to change password=false).

I'll get a fix into the code later today that will create a policy if one doesn't exist. If one does exist, we just swap between the existing policy and the one needed to create the users with easy to enter passwords.
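
The "create a policy if one doesn't exist" part could be sketched with boto3 roughly as follows. The policy values are the ones listed above for the workshop; the function name is hypothetical, and this is not the code that was committed. It covers only the missing-policy case, not the swapping of an existing policy.

```python
# Settings quoted in this thread for the workshop's password policy.
WORKSHOP_POLICY = {
    "MinimumPasswordLength": 6,
    "RequireSymbols": False,
    "RequireNumbers": True,
    "RequireUppercaseCharacters": False,
    "RequireLowercaseCharacters": True,
    "AllowUsersToChangePassword": False,
}


def ensure_password_policy(iam_client, policy=WORKSHOP_POLICY):
    """Create the account password policy when none exists.

    Returns True if a policy was created, False if one was already present.
    """
    try:
        iam_client.get_account_password_policy()
        return False
    except iam_client.exceptions.NoSuchEntityException:
        # Same condition as the NoSuchEntity warning in the CloudWatch log.
        iam_client.update_account_password_policy(**policy)
        return True


# Usage (requires boto3 and suitable IAM permissions):
#   import boto3
#   ensure_password_policy(boto3.client("iam"))
```

Passing the client in as a parameter keeps the function easy to exercise with a stub in place of a real AWS account.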

gadams999 commented 4 years ago

Code corrected. You'll need to pull the changes then do a cdk deploy to update the stack/Lambda code.

gytelek commented 4 years ago

I deleted and deployed the stack again using the new code. It seems to work better:

[WARNING] 2020-01-16T09:41:34.540Z d54be7e6-9d48-43b6-bc5e-0b62f6130e57 Error creating Cloud9 environment (will retry) for user cdduser, error: An error occurred (NotFoundException) when calling the CreateEnvironmentEC2 operation: User arn:aws:iam::502140257954:user/cdduser does not exist.

I could log in to AWS as the new user, and I have the Cloud9 environment. I assume I can now connect the device to AWS. Do I need the new user's environment, or can I work as admin? How, and under which account, is the new user billed?

gadams999 commented 4 years ago

Glad you've got it working. The Cloud9 warning is to be expected. Once an IAM user is created, it takes a period of time (2-30 seconds) for that user to be available to the Cloud9 service. So in the create steps, we loop on that process until it succeeds.

Yes, at this point, with a user created, you can run through the workshop. Since that user and all the resources are under the account where the CloudFormation stack is created, all billing happens there. So it's good to destroy the stack when you are done with the workshop to limit billing (e.g., the storage for the Cloud9 instance, DynamoDB, Cognito, etc.).
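
The retry loop described here can be sketched generically as below. This is an illustration, not the workshop's create_resources.py code: the function name, the attempt count, and the delay are assumptions, sized so that the total wait roughly covers the 2-30 second propagation window.

```python
import time


def retry_until_ready(action, is_retryable, attempts=15, delay_seconds=2.0, sleep=time.sleep):
    """Call action() until it succeeds.

    Only errors accepted by is_retryable(error) are retried; any other error,
    or a retryable error on the final attempt, is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:
            if attempt == attempts or not is_retryable(error):
                raise
            sleep(delay_seconds)


# Illustration with a stand-in for the Cloud9 call that fails twice.
calls = []

def create_environment():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("User does not exist")
    return "environment-created"

result = retry_until_ready(
    create_environment,
    is_retryable=lambda error: "does not exist" in str(error),
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(result, "after", len(calls), "attempts")
```

Each failed attempt would correspond to one WARNING entry in the log, which is why those warnings are expected.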

For the workshop the admin user is not needed, it's an early artifact for some enhancements. To run the workshop you just need the user account created. If it's just one user, there is no programmatic way to give credits, but you can use your administrative IAM user to edit the DynamoDB table to edit the credit value.

gytelek commented 4 years ago

Hi Gavin,

It took a while because I didn't have time for it, but today I managed to finish the workshop. The dispenser works fine. I didn't have any problems in the second part (which is/was actually the original workshop). My next goal is to update the firmware over the air. Can you please tell me where I can find the code for the dispenser device? I did a short search under device_firmware/demos but I couldn't identify it.

gadams999 commented 4 years ago

Apologies for the delay, I was away from "technology" last week. Glad to see everything working!

The code base we used is located under device_firmware/vendors/espressif/boards/esp32/aws_demos/application_code. This is a reduced version of the overall aws_demos functionality, but I'd start there.

Closing out this issue since everything is working, but feel free to open others as needed.