awslabs / aws-device-farm-jenkins-plugin

Jenkins plugin for AWS Device Farm.
Apache License 2.0
89 stars 66 forks source link

Fix file descriptor leak when archiving test results #92

Closed stefanverhoeff closed 5 years ago

stefanverhoeff commented 5 years ago

The process of copying the remote test results files to Jenkins for archiving leaves a file stream open. This causes a build-up of open file descriptors on the Jenkins Master which will eventually crash a unix system depending on ulimit 'open files' setting.

This fix replaces the inline copying of files with the build-in FilePath.copyFrom() method which closes the streams correctly.

This patch is running in our Production Jenkins currently and solved our issue with build-up of open files.

To illustrate, previous we saw output like this on the Jenkins Master:

bash-4.2$ lsof 2>/dev/null|sed 's#.*/var#/var#'|grep '/var/lib/jenkins/jobs'|head
/var/lib/jenkins/jobs/dispatcher/jobs/heredriver-automation/jobs/AWS_testing_jobs/jobs/regression_android_tests/builds/.57/archive/AWS Device Farm Results/Google Pixel 128 GB-7.1.2/Street Hailing E2E/tests.e2e_tests.StreetHailingTest.streetHailingTest/Appium Java Output-00001.txt (deleted)
/var/lib/jenkins/jobs/dispatcher/jobs/heredriver-automation/jobs/AWS_testing_jobs/jobs/regression_android_tests/builds/.55/archive/AWS Device Farm Results/Google Pixel 128 GB-7.1.2/Empty Destination Manual Ride Request test/tests.ride_request_tests.EmptyDestinationManualRideRequestTest.emptyDestinationManualRideRequestTest/Appium Java XML Output-00001.xml (deleted)
/var/lib/jenkins/jobs/dispatcher/jobs/heredriver-automation/jobs/AWS_testing_jobs/jobs/regression_android_tests/builds/.57/archive/AWS Device Farm Results/Google Pixel 128 GB-7.1.2/Teardown Suite/Teardown Test/Logcat-00000.logcat (deleted)
/var/lib/jenkins/jobs/dispatcher/jobs/heredriver-automation/jobs/AWS_testing_jobs/jobs/regression_android_tests/builds/.57/archive/AWS Device Farm Results/Google Pixel 128 GB-7.1.2/Login tests/tests.LoginTest.loginPositiveFlow/TCP dump log-00006.txt (deleted)
/var/lib/jenkins/jobs/dispatcher/jobs/heredriver-automation/jobs/AWS_testing_jobs/jobs/regression_android_tests/builds/.60/archive/AWS Device Farm Results/Google Pixel 128 GB-7.1.2/Reject Manual Ride Request test/tests.ride_request_tests.RejectManualRideRequestTest.rejectManualTripTest/Logcat-00004.logcat (deleted)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

paulo-raca commented 5 years ago

This problem has been affecting me too, and the fix seems very straightforward.

Is there anything holding this?

stefanverhoeff commented 5 years ago

Yep maybe @nikhil-dabhade or @anurag1408 can help getting this merged? ;-)

nikhil-dabhade commented 5 years ago

@stefanverhoeff @paulo-raca We are testing this right now. Shooting to have this merged and released before end of next week.

nikhil-dabhade commented 5 years ago

And first and foremost, thank you for the contribution :)

stefanverhoeff commented 5 years ago

@stefanverhoeff @paulo-raca We are testing this right now. Shooting to have this merged and released before end of next week.

Great! Please have a look at PR #90 and #91 too ;-)

nikhil-dabhade commented 5 years ago

@stefanverhoeff We merged #91 . Merging this request now. Again, thank you for your contributions.

stefanverhoeff commented 5 years ago

Thanks for the follow-ups @nikhil-dabhade and @piyushag-aws :-)