deis / builder

Git server and application builder for Deis Workflow
https://deis.com
MIT License

Load test deploying apps #427

Closed: arkkanoid closed this issue 8 years ago

arkkanoid commented 8 years ago

Hi, I've run some load tests deploying apps through Git, pushing to 10 apps concurrently. I found some errors related to the SSH connection, but I'm not sure what the issue is:

[ERROR] Failed handshake: EOF
Accepted connection.
---> [ERROR] Failed handshake: EOF
Accepted connection.
---> [ERROR] Failed handshake: ssh: invalid packet length, packet too large

[ERROR] Failed git receive: Failed to run git pre-receive hook:  (signal: broken pipe)

mboersma commented 8 years ago

The only similar error I'm aware of usually derives from ELB configuration: https://deis.com/docs/workflow/managing-workflow/configuring-load-balancers/#idle-connection-timeouts

It's also possible that the load on the controller or builder caused it to miss a livenessCheck, which could cause Kubernetes to stop routing its traffic temporarily.

@arkkanoid is this with Deis Workflow v2.5.0 or earlier? Do you have a test script or steps to help reproduce this error?

arkkanoid commented 8 years ago

The issue is on version v2.4.2. I simply ran 'git push deis branch' on apps with 15MB of data code, against 10 different apps concurrently. 2 of the 10 deploys hit this error. The git push goes from an EC2 instance to the Deis ELB. The controller and the builder have 0 restarts.
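
For anyone trying to reproduce this, the concurrent pushes described above could be sketched roughly as follows. This is not the script used in the original test; it is a minimal sketch that assumes each app is a local Git checkout with a `deis` remote already configured, and that the branch is `master` (the comment only says "branch"). The names `push_app`, `push_all`, and `failed` are hypothetical helpers.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def push_app(app_dir, command):
    """Run one push command inside app_dir and capture its outcome."""
    result = subprocess.run(command, cwd=app_dir, capture_output=True, text=True)
    return {
        "app_dir": app_dir,
        "returncode": result.returncode,
        "stderr": result.stderr,
    }


def push_all(app_dirs, command=("git", "push", "deis", "master"), max_workers=10):
    """Run the same command in every directory concurrently.

    The default command and the branch name "master" are assumptions.
    Results are returned in the same order as app_dirs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda d: push_app(d, list(command)), app_dirs))


def failed(results):
    """Keep only the pushes that exited with a non-zero status."""
    return [r for r in results if r["returncode"] != 0]


if __name__ == "__main__":
    # Usage: python load_test.py path/to/app1 path/to/app2 ...
    results = push_all(sys.argv[1:])
    for r in failed(results):
        print(f"{r['app_dir']}: exit {r['returncode']}\n{r['stderr']}")
    print(f"{len(failed(results))}/{len(results)} pushes failed")
```

Git writes remote output to stderr, so the captured `stderr` of a failed push is where builder-side errors would be expected to show up.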

arkkanoid commented 8 years ago

Complete logs: https://gist.github.com/arkkanoid/c4d00395d412ff6b0b4f29984b69740c

mboersma commented 8 years ago

Just to check, did you increase the AWS ELB timeout to 1200s as recommended in our docs?
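
As an illustration of that setting, the idle timeout could be raised programmatically along these lines. This is a hedged sketch, not taken from the Deis docs: it assumes a Classic ELB, the third-party `boto3` package, and AWS credentials already configured in the environment. The helper names and the load balancer name passed in are hypothetical.

```python
def idle_timeout_attributes(seconds):
    """Build the attribute payload that sets the ELB idle connection timeout."""
    return {"ConnectionSettings": {"IdleTimeout": int(seconds)}}


def set_idle_timeout(load_balancer_name, seconds=1200):
    """Apply the idle timeout to a Classic ELB (assumption: Classic, not ALB/NLB)."""
    import boto3  # third-party; imported here so the helper above works without it

    elb = boto3.client("elb")
    return elb.modify_load_balancer_attributes(
        LoadBalancerName=load_balancer_name,
        LoadBalancerAttributes=idle_timeout_attributes(seconds),
    )
```

Calling `set_idle_timeout("my-deis-elb")` would request the 1200s value mentioned above, where `my-deis-elb` stands in for the real load balancer name.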

arkkanoid commented 8 years ago

Yes, I increased it to 2000 sec.

mboersma commented 8 years ago

@arkkanoid my apologies for not following up here--I was kind of out of ideas.

Is this still reproducible in current Deis Workflow (v2.6.0)? And do you have an example of one of these apps with 15MB of data code I could use to try to test this?

arkkanoid commented 8 years ago

Sorry, I solved it a few weeks ago. I think it was an error related to the DNS resolution service. I updated to a newer Kubernetes version and it works now.

Thanks!