hashicorp / packer

Packer is a tool for creating identical machine images for multiple platforms from a single source configuration.
http://www.packer.io

With Packer, received 400 bad request while uploading box to Vagrant Cloud #10537

Closed berchev closed 3 years ago

berchev commented 3 years ago

Overview of the Issue

I tried to upload a Vagrant box (generated with Packer) to Vagrant Cloud using Packer 1.6.6, but the upload keeps failing with 400 Bad Request errors:

2021/01/27 21:06:02 packer2 plugin: &{Status:400 Bad Request StatusCode:400 Proto:HTTP/1.1 ProtoMajor:1 ProtoMinor:1 Header:map[Content-Type:[application/xml] Date:[Wed, 27 Jan 2021 19:06:01 GMT] Server:[AmazonS3] X-Amz-Id-2:[H3Yo9s46lks+KFLKePM17+V84AagYCTFWVhihuLVNDCh3yk7ZRzNuZcgio9V4ecNtDZuQUZ/VRc=] X-Amz-Request-Id:[1A481C50F5906B3E]] Body:0xc00050b140 ContentLength:-1 TransferEncoding:[chunked] Close:true Uncompressed:false Trailer:map[] Request:0xc000308500 TLS:0xc000f3a000}
2021/01/27 21:06:02 packer2 plugin: bad HTTP status: 400
2021/01/27 21:06:02 packer2 plugin: Retryable error: bad HTTP status: 400
    null (vagrant-cloud): Error uploading box! Will retry in 10 seconds. Status: 400

Using Packer 1.6.5 the upload is successful:

2021/01/27 22:08:52 packer2 plugin: &{Status:200 OK StatusCode:200 Proto:HTTP/1.1 ProtoMajor:1 ProtoMinor:1 Header:map[Cache-Control:[max-age=0, private, must-revalidate] Connection:[keep-alive] Content-Type:[application/json; charset=utf-8] Date:[Wed, 27 Jan 2021 20:08:51 GMT] Etag:[W/"b7cfcdf6f3c06d8bd4c4751b474cfbec"] Referrer-Policy:[strict-origin-when-cross-origin] Server:[Cowboy] Set-Cookie:[_atlas_session_data=aHZiVHRsQWdEcFd5VFBnU0ZGVVNlYzlYVkRDNnpOUFlLQzNZN0UvOG8rOUsxcUZENVFaM0FMN0g4eGRvTUlCcysrZDhuYWtrcFVZdkRJUGFNUHBsTWc9PS0tc2NYNUwyOUZaa252YzZJUVk5ellYQT09--6cf964114ca55286653a22fc4875259bd952f080; path=/; expires=Fri, 26 Feb 2021 20:08:52 GMT; secure; HttpOnly] Strict-Transport-Security:[max-age=31536000; includeSubDomains; preload] Via:[1.1 vegur] X-Content-Type-Options:[nosniff] X-Download-Options:[noopen] X-Frame-Options:[SAMEORIGIN] X-Permitted-Cross-Domain-Policies:[none] X-Request-Id:[554874c0-fba8-4772-b956-1f0f8af23b6a] X-Runtime:[0.294514] X-Vagrantcloud-Rate-Limit:[99/100] X-Xss-Protection:[1; mode=block]] Body:0xc00063e040 ContentLength:-1 TransferEncoding:[chunked] Close:false Uncompressed:false Trailer:map[] Request:0xc000b28000 TLS:0xc000590c60}
    null (vagrant-cloud): Version successfully released and available
2021/01/27 22:08:52 [INFO] (telemetry) ending vagrant-cloud
==> Wait completed after 51 minutes 31 seconds
Build 'null' finished after 51 minutes 31 seconds.

Assumptions

This may only happen when the file is large. The upload fails for me when I try to upload a 7.5 GB box.

Uploading a small box (900 MB) with my xenial.json template works with Packer 1.6.6.

xenial.json and the successful upload log are provided as a gist in the section "Log Fragments and crash.log files" below.

Reproduction Steps

Packer version

Packer 1.6.6 Vagrant 2.2.14

Simplified Packer Buildfile

This is my packer.json template. Since it is not too long, I will paste it here:

{
   "builders":[
      {
         "type":"null",
         "communicator":"none"
      }
   ],
   "post-processors":[
      [
         {
            "type":"artifice",
            "files":[
               "ovf-files/fedora31-kubernetes-disk1.vmdk",
               "ovf-files/fedora31-kubernetes.ovf"
            ]
         },
         {
            "type":"vagrant",
            "keep_input_artifact":true,
            "output":"fedora.box",
            "provider_override":"vmware"
         },
         {
            "type":"vagrant-cloud",
            "box_tag":"{{user `box_tag`}}",
            "access_token":"{{user `cloud_token`}}",
            "version":"{{user `version`}}",
            "version_description":"{{user `version_description`}}",
            "no_release":"{{user `no_release`}}"
         }
      ]
   ],
   "variables":{
      "cloud_token":"{{env `VAGRANT_CLOUD_TOKEN`}}",
      "box_tag":"berchev/vault64",
      "version":"0.6",
      "version_description":"just a test delete it later",
      "no_release":"false"
   }
}

Operating system and Environment details

I am using macOS, but I believe this can be reproduced on any OS.

Log Fragments and crash.log files

attaching some gist files:

nywilken commented 3 years ago

Hi there @berchev, thanks for reaching out. Looking at the logs for 1.6.6, it is possible that the upload of a large file is taking much longer to complete and get persisted onto the Vagrant Cloud backend storage. You can see that the 400 errors are retryable, so it's possible that we need to retry a bit more.

There was a change in v1.6.6 to how vagrant boxes are uploaded. Instead of using the API to upload, boxes are now uploaded directly to the Vagrant Cloud backend storage which looks to be S3.

Do you still run into issues if you add "no_direct_upload": true to your configuration to disable the direct upload feature?

berchev commented 3 years ago

Hi @nywilken Thank you for the quick response!

I have just tested with the "no_direct_upload": true option set on the vagrant-cloud post-processor and can confirm that the upload is successful.
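For anyone following along, here is a sketch of how the vagrant-cloud post-processor block from the template above might look with the option added. Only the "no_direct_upload" line is new; its position among the other keys is arbitrary, and the rest is copied from the original template.

```json
{
   "type":"vagrant-cloud",
   "box_tag":"{{user `box_tag`}}",
   "access_token":"{{user `cloud_token`}}",
   "version":"{{user `version`}}",
   "version_description":"{{user `version_description`}}",
   "no_release":"{{user `no_release`}}",
   "no_direct_upload":true
}
```

With this set, the box is sent through the Vagrant Cloud API rather than directly to the backend storage.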

yusukemasuda commented 3 years ago

Hi @nywilken

Nice to meet you. I'm the person who originally raised this issue with @berchev. I also tried building my Packer template with the 'no_direct_upload' option, and both the build and the upload completed successfully. This works around the issue for now, but I hope it will be fixed.

The problem is that I cannot predict when the upload will fail. It probably depends on the box size and internet speed: sometimes it fails and sometimes it does not. So I have no choice but to set the 'no_direct_upload' option at all times, which defeats the purpose of having the option.

I would greatly appreciate it if you could give our proposal careful consideration.

@berchev Thank you very much for your kind cooperation.

nqb commented 3 years ago

Hello,

I want to mention that I reported this issue to HashiCorp support. They gave me the same solution ('no_direct_upload': true) and it works.

chrisroberts commented 3 years ago

Hi everyone. There have been a number of updates made to Vagrant Cloud which should resolve the box upload issue that was resulting in 400 errors. When the no_direct_upload value is false (which is the default), the box asset is uploaded directly to the backing asset storage. The TTL on the upload links was set too low to properly allow for retries, which was resulting in the errors. This has been resolved, and uploads directly to asset storage should be working as expected. Setting no_direct_upload to true will force the upload to be proxied through Vagrant Cloud but will result in slower uploads.

If any other issues are encountered with uploads to Vagrant Cloud, please feel free to open an issue in the hashicorp/vagrant repository or send an email to support

@nywilken If you need anything else related to this issue, just let me know :slightly_smiling_face:

SwampDragons commented 3 years ago

That's awesome, Chris -- thanks. I'll close this, and we can reopen if we see users still struggling.

nywilken commented 3 years ago

Thanks @chrisroberts

chrisroberts commented 3 years ago

Just a follow-up: we found an issue with direct uploads related to the size of the generated box asset. The modifications in #10820 resolve that issue.

ghost commented 3 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.