timja / jenkins-gh-issues-poc-06-18

0 stars 0 forks source link

[JENKINS-7813] Archiving artifacts very slow #6114

Closed timja closed 11 years ago

timja commented 14 years ago

We are running 1.380. I can watch a 300MB tarball get transferred to the Hudson master slowly over 30+ minutes. An scp between the same machines and same files/directories takes less than 3 minutes.


Originally reported by gtuhl, imported from: Archiving artifacts very slow
  • status: Resolved
  • priority: Major
  • resolution: Fixed
  • resolved: 2013-06-05T14:42:38+00:00
  • imported: 2022/01/10
timja commented 14 years ago

davidkarlsen:

Got the same problem - it's been like that for ages...

timja commented 14 years ago

davidkarlsen:

Same type of issue here: http://issues.jenkins-ci.org/browse/JENKINS-3799

timja commented 14 years ago

jsmigel:

Same here. I can see this on 3 different master/slave combinations (linux/linux) using the POM in JENKINS-3922. It usually transfers < 100 bytes/second. I can confirm that it's calling FilePath.copyTo. To make it as simple as possible, I've been testing with an SSH slave on localhost.

I could not reproduce this on OS X 10.6 using the same setup and POM.

I am using 1.381 in all cases, but davidkarsen's right - this problem's been around way longer than that.

timja commented 13 years ago

salimfadhley:

I'm using 1.359 on Windows XP 32bit.

I also get this - typically artifacts < 5mb take about 15 minutes to archive.

timja commented 13 years ago

salimfadhley:

I'm watching the build's archive directory as the files get collected.

Typically what happens is that the file streams in very slowly (at about 3-15k per second) and then abruptly stops when 800-1500k of data has been copied.

Sometimes when I kill the process it's recorded as a "Blue" even thouh it's actually an abort.

Any suggestions as to how I can work-around this bug would be much appreciated.

Sal

timja commented 13 years ago

salimfadhley:

I got this exception... is it relevant? Note that despite being a failure this is reported as SUCCESS.

C:\hudson1\workspace\build_calyon_egg\integration_src>exit 0 Archiving artifacts
ERROR: Failed to archive artifacts: out/* hudson.util.IOException2: hudson.util.IOException2: Failed to extract c:\hudson1\workspace\build_calyon_egg/out/* at hudson.FilePath.readFromTar(FilePath.java:1577) at hudson.FilePath.copyRecursiveTo(FilePath.java:1491) at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:117) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19) at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601) at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:580) at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:558) at hudson.model.Build$RunnerImpl.post2(Build.java:158) at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:528) at hudson.model.Run.run(Run.java:1264) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:122) Caused by: java.io.IOException at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:173) at hudson.util.HeadBufferingStream.read(HeadBufferingStream.java:61) at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:221) at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141) at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:92) at org.apache.tools.tar.TarBuffer.readBlock(TarBuffer.java:257) at org.apache.tools.tar.TarBuffer.readRecord(TarBuffer.java:223) at hudson.org.apache.tools.tar.TarInputStream.read(TarInputStream.java:345) at java.io.FilterInputStream.read(FilterInputStream.java:90) at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1025) at org.apache.commons.io.IOUtils.copy(IOUtils.java:999) at hudson.util.IOUtils.copy(IOUtils.java:29) at hudson.FilePath.readFromTar(FilePath.java:1565) ... 12 more at hudson.FilePath.copyRecursiveTo(FilePath.java:1498) at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:117) at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19) at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:601) at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:580) at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:558) at hudson.model.Build$RunnerImpl.post2(Build.java:158) at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:528) at hudson.model.Run.run(Run.java:1264) at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46) at hudson.model.ResourceController.execute(ResourceController.java:88) at hudson.model.Executor.run(Executor.java:122) Caused by: java.util.concurrent.ExecutionException: hudson.remoting.RequestAbortedException: java.net.SocketTimeoutException: Read timed out at hudson.remoting.Request$1.get(Request.java:236) at hudson.remoting.Request$1.get(Request.java:172) at hudson.remoting.FutureAdapter.get(FutureAdapter.java:59) at hudson.FilePath.copyRecursiveTo(FilePath.java:1494) ... 11 more Caused by: hudson.remoting.RequestAbortedException: java.net.SocketTimeoutException: Read timed out at hudson.remoting.Request.abort(Request.java:257) at hudson.remoting.Channel.terminate(Channel.java:598) at hudson.remoting.Channel$ReaderThread.run(Channel.java:880) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:129) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) at java.io.BufferedInputStream.read(BufferedInputStream.java:317) at java.io.ObjectInputStream$PeekInputStream.read(ObjectInputStream.java:2266) at java.io.ObjectInputStream$BlockDataInputStream.read(ObjectInputStream.java:2668) at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2698) at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1648) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1323) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351) at hudson.remoting.Channel$ReaderThread.run(Channel.java:856) Recording test results
Recording fingerprints
Triggering a new build of publish_calyon_egg #2956 Notifying upstream projects of job completion Finished: SUCCESS 
timja commented 13 years ago

xan:

Same problem here with Jenkins v1.404

timja commented 12 years ago

mpontillo:

Anyone have a workaround? I was thinking of just doing a direct 'scp' to the master. FYI, I get the following stack trace if I abort the build while archiving artifacts is taking a long time:

Archiving artifacts
ERROR: Failed to archive artifacts: */.tar.gz
java.io.IOException
at hudson.remoting.FastPipedInputStream.read(FastPipedInputStream.java:175)
at hudson.util.HeadBufferingStream.read(HeadBufferingStream.java:61)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at hudson.util.HeadBufferingStream.fillSide(HeadBufferingStream.java:83)
at hudson.FilePath$TarCompression$2.extract(FilePath.java:568)
at hudson.FilePath.copyRecursiveTo(FilePath.java:1685)
at hudson.tasks.ArtifactArchiver.perform(ArtifactArchiver.java:116)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:703)
at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:678)
at hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:656)
at hudson.model.Build$RunnerImpl.post2(Build.java:162)
at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:625)
at hudson.model.Run.run(Run.java:1435)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:238)

timja commented 12 years ago

mpontillo:

My workaround: I added a conditional SFTP to the Jenkins server inside my build script (depending on the status of the Jenkins environment variables). It's a hack.. I wish this worked normally.

timja commented 11 years ago

lkeagle:

I see this same slowness when archiving artifacts just on the master. Final output of our build sequence generates ISO images with our installers and tools, and archiving these files takes ages. No master/slave setup, same hard disk...

Agonizingly slow. This would take only a minute or two if I were fingerprinting and copying the files manually.

timja commented 11 years ago

jglick:

@lkeagle’s observation suggests that JENKINS-16301 might be part of the issue (but running a profiler is the best way to be sure).

timja commented 11 years ago

scm_issue_link:

Code changed in jenkins
User: Kohsuke Kawaguchi
Path:
changelog.html
core/pom.xml
core/src/main/java/hudson/tasks/ArtifactArchiver.java
pom.xml
war/pom.xml
http://jenkins-ci.org/commit/jenkins/8a3e909d266ea59d7179fef21350e9aec1f4b245
Log:
[FIXED JENKINS-7813]

Fixed the throughput problem between master/slave communication.
This fix contains two independent problems.

One was in the remoting. During a large sustained data transfer
(such as artifact archiving and large test reports), the way we
were doing flow control and ACK-ing were penalizing us badly.
I improved the flow control algorithm in remoting 1.23, and also
increased advertised window size so that the transfer can saturate
available bandwidth even when a latency is large. (And unless
the reader side is excessivesly slow, this shouldn't increase
any memory consumption.)

The other fix was in trilead-ssh2, which is our SSH client
implementation used by ssh-slaves plugin. The buffer size for flow
control was too small. I improved the way buffering is done to reduce
the memory footprint when the reader closely follows the writer, then I
increased the advertised window size. Again, this shouldn't increase
memory consumption (in fact it'll likely actually reduce them) unless
the reader end gets abandoned.

On my simulated latency-injected network, the sustained transfer rate is
now on par with scp. We win for smaller files because of the TCP slow
start penality that scp would incur, and we lose a bit as files get
larger due to additional framing overhead.

If you have manually extracted slave.jar and placed them on slaves, you
need to update them to 2.23 to see the performance benefits.


You received this message because you are subscribed to the Google Groups "Jenkins Commits" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-commits+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

timja commented 11 years ago

dogfood:

Integrated in jenkins_main_trunk #2393
[FIXED JENKINS-7813] (Revision 8a3e909d266ea59d7179fef21350e9aec1f4b245)

Result = SUCCESS
kohsuke : 8a3e909d266ea59d7179fef21350e9aec1f4b245
Files :

timja commented 11 years ago

tomfanning:

Thank you very much! Look forward to seeing this in a release.

timja commented 11 years ago

davidkarlsen2:

Yes - hope it will be backported into the LTS real soon

timja commented 11 years ago

pancake:

FYI: updating SSH Slaves plugin to 0.23 resulted in problem described in JENKINS-17388 for us.

timja commented 11 years ago

stephenconnolly:

@pancake: you do know that the warning is just a warning and not an actual problem. It was just some debugging code and when it is printed it means that you will not see the performance improvements but otherwise everything will be fine. i.e. cosmetic

timja commented 11 years ago

pancake:

@stephenconnolly: thanks!
What do we need to upgrade to get those performance improvements? Is it Jenkins itself? Will the optimization be merged into LTS? Is there a way to find out such thing ourselves (from GitHug Commit History?) instead of bothering developers?

timja commented 11 years ago

jglick:

@tomfanning, @davidkarlsen2, @pancake: as http://jenkins-ci.org/changelog notes, these changes are in Jenkins 1.509. That will be the basis of the next LTS.

timja commented 11 years ago

emszpak:

I see no changes after upgrade to 1.515 (with slave upgraded to 2.23). 43MB WAR archives in 102 seconds vs. <2 seconds with SCP. No one confirmed on the mailing list that was able to get significant change after upgrade.

timja commented 11 years ago

jglick:

@emszpak: reclosing since the intended fixes were delivered. There may be other as-yet-undiagnosed problems slowing down your connection; someone would need to reproduce the issue, file separately (can use JIRA links), and analyze it.

timja commented 11 years ago

salsolatragus:

Not sure about the connection, but same symptoms.

timja commented 11 years ago

sodul:

I can confirm the same issue. The files take 10s to publish to our maven repo at the end of the build but before that they take over 4m to publish back to the master. I initially blamed a slow VM as the existing master but after upgrading to fresh hardware today and switching from CentOS to Ubuntu 12.04 we are still getting the issue. We are running Jenkins 1.527 with Java 1.7.0_17 (not open jdk).

timja commented 2 years ago

[Originally duplicated by: JENKINS-7921]

timja commented 2 years ago

[Originally duplicated by: JENKINS-13945]

timja commented 2 years ago

[Originally related to: JENKINS-18276]

timja commented 2 years ago

[Originally related to: JENKINS-3922]

timja commented 2 years ago

[Originally related to: JENKINS-3799]

timja commented 2 years ago

[Originally related to: JENKINS-12007]

timja commented 2 years ago

[Originally related to: JENKINS-17236]