juju / charmstore

The charm store server.
http://gopkg.in/juju/charmstore.v5
GNU Affero General Public License v3.0

multipart upload fails sporadically #834

Open rogpeppe opened 5 years ago

rogpeppe commented 5 years ago

When uploading a multipart resource, we occasionally see a 411 response followed by the upload failing.

Here's a client-side log of the problem:

resource-multipart-upload-error-2018-09-20.txt

A representative error log line from the squid proxy is:

10.25.8.155 - - [20/Sep/2018:09:27:43 +0000] "GET http://api.jujucharms.com/v5/upload/W6NcGaoVJQPydLbX HTTP/0.0" 411 7904 "-" "-" NONE:HIER_NONE

The body of the response was: squid-error-2018-09-20.txt

jcsackett commented 5 years ago

I'm actually still seeing this when trying to push resources to the charmstore.

jc@alice ~> charm attach ~yellow/bionic/jujushell-11 limited-termserver=~/limited-termserver.tar.gz
resuming previous upload
~/limited-termserver.tar.gz                     0%      0KiB
ERROR can't upload resource: unexpected response status from server: 411 Length Required

Also, https://github.com/juju/charmstore/issues/849 seems to be a duplicate of this, so others are hitting the same problem.

jcsackett commented 5 years ago

The 411 on resume started after the upload failed the first time:

jc@alice ~> charm attach ~yellow/bionic/jujushell-11 limited-termserver=~/limited-termserver.tar.gz
~/limited-termserver.tar.gz                    29%  105.0MiB
ERROR can't upload resource: cannot upload part "e0d59aa3985a831b0506d22558b6a183be478f7d789516f8bd41e8fddcb4f5724775788b9288d775f634b9315188d400": failed to PUT object e0d59aa3985a831b-7b00dbb1739fce52 from container charmstore-blobs
caused by: failed executing the request http://10.24.0.23:8080/v1/AUTH_acf64c68bc7349c98dec46b7e0e72d9f/charmstore-blobs/e0d59aa3985a831b-7b00dbb1739fce52

jcsackett commented 5 years ago

Reattempting after clearing the charm upload cache resulted in:

ERROR can't upload resource: cannot upload part "64f0b1ee8fe1ef7a7b9676c3e11901e0bde3022f8b79205ad2f50f42254d997004f86b4191a99451babf25ad7720966c": failed to PUT object 64f0b1ee8fe1ef7a-aa00dbb1739fce52 from container charmstore-blobs
caused by: failed executing the request http://10.24.0.23:8080/v1/AUTH_acf64c68bc7349c98dec46b7e0e72d9f/charmstore-blobs/64f0b1ee8fe1ef7a-aa00dbb1739fce52
caused by: Put http://10.24.0.23:8080/v1/AUTH_acf64c68bc7349c98dec46b7e0e72d9f/charmstore-blobs/64f0b1ee8fe1ef7a-aa00dbb1739fce52: read tcp 10.25.10.39:53152->10.24.0.23:8080: read: connection reset by peer

Resuming the upload reliably triggered the 411. The 411 also occurred when re-attempting after cancelling charm attach with Ctrl+C.

After several attempts via clearing the cache and retrying, I was able to upload, so there is a (painful) workaround.

mhilton commented 5 years ago

To be clear, the 411 error here comes from the charmstore-to-Swift link. It is possible that #854 will alleviate this.

rogpeppe commented 5 years ago

The most salient error message seems to be this one:

caused by: Put http://10.24.0.23:8080/v1/AUTH_acf64c68bc7349c98dec46b7e0e72d9f/charmstore-blobs/64f0b1ee8fe1ef7a-aa00dbb1739fce52: read tcp 10.25.10.39:53152->10.24.0.23:8080: read: connection reset by peer

which indicates that there's some kind of transient error trying to upload to Swift. We should add a retry loop inside the charmstore blobstore code for this kind of situation.
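
As a rough illustration of what such a retry loop could look like, here is a minimal Go sketch. It is not the charmstore's actual blobstore code: `retryPut`, `isTransient` and `demo` are hypothetical names, and the attempt count and backoff delay are arbitrary assumptions. The simulated failure mimics the "connection reset by peer" error from the log above.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
	"time"
)

// isTransient reports whether err looks like a temporary network failure
// worth retrying: a connection reset by the peer, or a network timeout.
func isTransient(err error) bool {
	if errors.Is(err, syscall.ECONNRESET) {
		return true
	}
	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

// retryPut calls put up to attempts times. It returns immediately on
// success or on a non-transient error, and doubles the delay between
// attempts after each transient failure.
func retryPut(attempts int, delay time.Duration, put func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = put(); err == nil || !isTransient(err) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

// demo simulates a PUT that is reset twice and then succeeds.
// It returns how many times the PUT was tried and the final error.
func demo() (int, error) {
	calls := 0
	err := retryPut(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			// Simulated failure, standing in for the Swift PUT in the log.
			return &net.OpError{Op: "read", Net: "tcp", Err: syscall.ECONNRESET}
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err) // prints: 3 <nil>
}
```

One caveat with this approach: each attempt has to send the whole part again, so the `put` callback needs a request body it can rewind or recreate (for example a buffered or seekable part) rather than a one-shot stream from the client.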