Closed timja closed 6 years ago
As a workaround, I have created a Jenkins job that executes a Windows batch command on the Jenkins node where Visual Studio is installed.
The Jenkins job triggers the batch command once a day and has worked in my environment for several years now.
The batch command looks like this:
set MSPDBSRV_EXE=mspdbsrv.exe
set MSPDBSRV_PATH=C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE
set PATH=%MSPDBSRV_PATH%;%PATH%
set ORIG_BUILD_ID=%BUILD_ID%
set BUILD_ID=DoNotKillMe
echo stop mspdbsrv.exe
%MSPDBSRV_EXE% -stop
echo wait 7 sec
%windir%\system32\ping.exe -n 7 localhost> nul
echo restart mspdbsrv.exe with a shutdowntime of 25 hours
start /b %MSPDBSRV_EXE% -start -spawn -shutdowntime 90000
set BUILD_ID=%ORIG_BUILD_ID%
set ORIG_BUILD_ID=
exit 0
What the batch command does is:
stop the mspdbsrv.exe to free up resources
start mspdbsrv.exe with BUILD_ID=DoNotKillMe and a shutdowntime of 25 hours; this lets the mspdbsrv process outlive the job without getting killed, and it keeps running for 25 hours so that other build jobs can use the already running process
You may have to change the path to mspdbsrv -> set MSPDBSRV_PATH=C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE
Updating the msbuild plugin won't work in our situation. We run into this issue, but we don't have the plugin installed. Rather the issue comes for us in the Final Builder scripts we run via Jenkins that call msbuild.
Then install it. MSBuild will veto all mspdbsrv killing.
set the environment variable
_MSPDBSRV_ENDPOINT_=$JENKINS_COOKIE
(The variable starts and ends with a single '_')
This will lead to a separate instance of mspdbsrv being started.
mwinter69, thanks for the pointer.
We couldn't get it working with $JENKINS_COOKIE but managed to correct it by adding the following property via EnvInject prior to kicking off the build
_MSPDBSRV_ENDPOINT_=$BUILD_TAG
This resulted in a separate process being initiated for each build and no conflicts/error.
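For anyone launching the compiler from a script rather than through EnvInject, the same idea can be sketched in Python. This is only a minimal illustration, not something from the MSBuild plugin: the helper names env_with_endpoint and run_build are made up here, and the "local-<pid>" fallback is an assumption for runs outside Jenkins, where BUILD_TAG is not set.

```python
import os
import subprocess
import sys


def env_with_endpoint(base_env, endpoint):
    """Return a copy of base_env with _MSPDBSRV_ENDPOINT_ set to endpoint."""
    env = dict(base_env)
    env["_MSPDBSRV_ENDPOINT_"] = endpoint
    return env


def run_build(command, base_env=None):
    """Run command with a per-build mspdbsrv endpoint taken from BUILD_TAG."""
    base_env = os.environ if base_env is None else base_env
    # BUILD_TAG is provided by Jenkins; the fallback value is an assumption
    # that only matters when the script is started outside Jenkins.
    endpoint = base_env.get("BUILD_TAG", "local-%d" % os.getpid())
    return subprocess.call(command, env=env_with_endpoint(base_env, endpoint))
```

The build command is passed in as a list, so whatever the job normally calls (msbuild, devenv, a wrapper script) inherits the variable without any change to the job configuration.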
Edit: Correction due to formatting. Refer below
It is
_MSPDBSRV_ENDPOINT_
(with underlines) not MSPDBSRV_ENDPOINT.
Just realized it myself that it's a formatting issue. If you enclose the word in underlines it will get italicised and the underlines disappear.
We recently re-encountered this on our build network and I did some investigation, here's what I found:
It appears that the veto logic doesn't execute on the slave nodes. Is there something special that has to be done in order for it to be detected and executed there? I don't understand enough about how the remoting logic in Jenkins operates to know the answer to this.
Most of the other work-arounds for this are ones that we cannot easily deploy in our environment. If this is truly the issue, does anyone have an idea what it would take to fix it and how long that would take to carry out?
I spent some more time chasing code and I have a suspicion as to the cause of the issue. In ProcessTree.java, there are two different functions that appear to need information from the master and yet operate in different manners
I think that getVeto() needs to have part of it implemented more like getKillers(), so that it will go to the master for the list. It may be also that the accessor belongs in ProcessTree instead, so that it caches the data and doesn't go back to the master quite as much. Then, I think the veto logic should work properly on both a master and a slave. Unfortunately, this means a change to Jenkins core and upgrading the full instance to fix the issue instead of just a fix to the plugin itself.
Is there any workaround to this issue, because it completely breaks our usage of Jenkins?
Hi Stefan, refer my comments above. This fixed it for us. Cheers
Little side note: It might not be sufficient to just specify the _MSPDBSRV_ENDPOINT_ env variable in order to avoid conflicts. I recommend additionally setting TMP, TEMP and TEMPDIR to an isolated folder if you plan on invoking MSBUILD in parallel, as various plugins for MSBUILD as well as MSBUILD itself will place files there.
A further catch of using _MSPDBSRV_ENDPOINT_ is that serialization of parallel builds in the same working directory will now break in return, unless you make sure that the temporary files for the different architectures (e.g. the temporary program database created with the individual object files, commonly named just e.g. "Debug\vc120.pdb"; notice the lack of a prefix for the architecture) are completely isolated as well. Otherwise the different mspdbsrv instances will now collide when accessing the same file.
grillba, walteste Hi there, we've got this issue too, and we followed your suggestions to configure the master Jenkins node like this:
Configure system > Environment variables > Add new key value pair below:
KEY: _MSPDBSRV_ENDPOINT_
VALUE: $BUILD_TAG
But nothing changed; the error was still raised on the Windows slave. Could you please explain the solution in detail? Should we set this key-value pair on the slave node? Thanks in advance
@billhoo,
You need to do it at the Job level - Not the system level. Use envinject to add the environment variable
Have a look here for how to use envinject, https://wiki.jenkins.io/display/JENKINS/EnvInject+Plugin
Make sure you follow the "Inject variables as a build step" topic
Regards
Mark
Thanks for the timely reply. We've followed your guide and found that there were already 3 separate mspdbsrv.exe processes running in the background (for testing purposes, we ran 3 jobs concurrently on one Windows slave), so it seems to have worked. Unfortunately, one of our jobs still failed with the C1090 error.
This is the screenshot of EnvInject in each of our 3 Pipeline jobs configuration page,
I don't think there's anything wrong here. Am I missing something?
Thanks,
Bill.
Just in case this helps anyone, I was able to fix all problems mentioned so far in this issue and comments by following the recommendations on this blog post:
http://blog.peter-b.co.uk/2017/02/stop-mspdbsrv-from-breaking-ci-build.html
The solution involves
1. Installing the MSBuild plugin ver. 1.26 or higher in Jenkins. Setup for use on the server is optional, only needs to be installed. This stops Jenkins from killing the mspdbsrv process automatically.
2. Using the _MSPDBSRV_ENDPOINT_ environment variable as done in the comment above.
3. Spawning and killing a new specific mspdbsrv instance of the right Visual Studio version at the beginning and end of each job which uses it.
Powershell implementation of the Python solution in the blog (change VS140COMNTOOLS to the version of Visual Studio being used):
# Manually start mspdbsrv so a parallel job's instance isn't used, works because _MSPDBSRV_ENDPOINT_ is set to a unique value
# (otherwise results in "Fatal error C1090: PDB API call failed, error code '23'" when one of the builds completes).
$mspdbsrv_proc = Start-Process -FilePath "${env:VS140COMNTOOLS}\..\IDE\mspdbsrv.exe" -ArgumentList ('-start','-shutdowntime','-1') -passthru
.\{PowershellBuildScriptName}.ps1
# Manually kill mspdbsrv once the build completes using the previously saved process id
Stop-Process $mspdbsrv_proc.Id
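For comparison, the same start/build/kill sequence can be sketched in Python. This is not the code from the blog post, only a rough equivalent of the PowerShell above: the names mspdbsrv_path and dedicated_mspdbsrv are made up for this sketch, the popen parameter exists only so the logic can be exercised without Visual Studio installed, and it assumes _MSPDBSRV_ENDPOINT_ is already set to a unique value in the environment the script inherits.

```python
import contextlib
import ntpath
import subprocess


def mspdbsrv_path(comntools):
    """Locate mspdbsrv.exe relative to a VSxxxCOMNTOOLS directory (..\\IDE)."""
    return ntpath.normpath(ntpath.join(comntools, "..", "IDE", "mspdbsrv.exe"))


@contextlib.contextmanager
def dedicated_mspdbsrv(exe, popen=subprocess.Popen):
    """Start a dedicated mspdbsrv for the duration of the with-block."""
    # Same arguments as the PowerShell version: no idle shutdown (-1).
    proc = popen([exe, "-start", "-shutdowntime", "-1"])
    try:
        yield proc
    finally:
        # Stop the instance we started, even if the build step raised.
        proc.terminate()
        proc.wait()
```

A job script would wrap its build call in "with dedicated_mspdbsrv(mspdbsrv_path(os.environ['VS140COMNTOOLS'])):" so that the server is stopped whether the build succeeds or fails.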
I had the same problem with parallel builds (e.g. running job A from trunk and job A from a branch in parallel). I tried the solution with _MSPDBSRV_ENDPOINT_ set to BUILD_TAG and it worked for almost all jobs. In one situation I still got the error. So I replaced BUILD_TAG with the JOB_NAME environment variable and suddenly it was fine; for now we are out of problems. If anyone still has the problem with the ENDPOINT solution, try changing BUILD_TAG to something else. If you do not allow parallel builds of a single job, JOB_NAME should be enough; otherwise you can try a JOB_NAME + BUILD_NUMBER combination.
Maybe ENDPOINT has some restrictions, but I did not have time to look into this more deeply. What I know is that the problematic job has the longest name in my Jenkins, approx. 48 characters.
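If the length of the value really is the problem (that is only a guess, nobody in this thread has confirmed a limit), one way to sidestep it would be to hash the job name and build number into a short fixed-length value. A minimal sketch in Python; the function name short_endpoint and the default length of 16 characters are arbitrary choices for illustration:

```python
import hashlib


def short_endpoint(job_name, build_number, length=16):
    """Derive a short, fixed-length endpoint value from job name and build number."""
    key = "%s#%s" % (job_name, build_number)
    # Hex digest of the combined key, truncated to the requested length.
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:length]
```

The result is stable for a given job and build number, so all steps of one build agree on the endpoint, while two concurrent builds of the same job get different values.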
Please can anyone advise me how to set _MSPDBSRV_ENDPOINT_ with value BUILD_TAG in a pipeline declarative script?
I don’t really understand the difference between defining and injecting an environment variable. I could do:
stage('build_VisualStudio') {
environment { _MSPDBSRV_ENDPOINT_=$BUILD_TAG }
etc.
Would that be sufficient or must environment variable injection be done in a different way?
Code changed in jenkins
User: Daniel Beck
Path:
content/_data/changelogs/weekly.yml
http://jenkins-ci.org/commit/jenkins.io/0391fcb9b4c957e9e41fde03409de330a3de571d
Log:
Remove JENKINS-9104 fix from release to unblock it
Code changed in jenkins
User: Daniel Beck
Path:
content/_data/changelogs/weekly.yml
http://jenkins-ci.org/commit/jenkins.io/62409d42a5769cac66337cbd4b5df5754f0e2384
Log:
Merge pull request #1522 from daniel-beck/changelog-2.119-amended
Remove JENKINS-9104 fix from release to unblock it
Compare: https://github.com/jenkins-infra/jenkins.io/compare/58f029c79331...62409d42a576
Code changed in jenkins
User: Jesse Glick
Path:
core/src/main/java/hudson/util/ProcessTree.java
test/src/test/java/hudson/util/ProcessTreeKillerTest.java
http://jenkins-ci.org/commit/jenkins/3465da4764c322baf4fb5b90651ef6b9bcd409fb
Log:
Merge pull request #3419 from dwnusbaum/JENKINS-9104-test-fix
Fix test failure by cleaning up static state after tests
Compare: https://github.com/jenkinsci/jenkins/compare/ddbc4bbce7d3...3465da4764c3
Jenkins 2.120 contains a fix for the previous problem of the ProcessKillingVeto extension point not working on agents.
I'm occasionally getting this error with the latest versions of Jenkins and all the plugins. It started in recent months and hadn't been a problem for a year before that. The problem seems NOT to have been resolved, or has possibly re-emerged.
What can I do, is there a workaround? Sporadic build failures for no reason are super annoying.
Same error with latest Jenkins ver. 2.150.3
The error always occurs when running two jobs concurrently on the same agent with VS2015:
fatal error C1090: PDB API
billhoo, thanks for the tip! I was running VS 2017 (v141 toolset), but there were indeed two simultaneous jobs! So the workaround is to limit this agent to one job at a time. Pity, as it's a pretty powerful multicore server, but it's better than flaky builds.
vuiletgiraffe, exactly the same here: we have many different jobs which use MSVC14 as the toolchain, but now we can only perform one build at a time. It's a huge waste of machine resources ;(
Hope it can be truly solved.
Solution is still the same, before invoking `msbuild`, set the following environment variables to something unique:
_MSPDBSRV_ENDPOINT_=<something unique>
TMP=<something unique>
TEMP=$TMP
TMPDIR=$TMP
Once you have done that, you can launch as many parallel MSBuild instances as you like, even mixing different msbuild versions or whatever. They will not interfere in any way. I've been doing that on a regular basis with mixed MSVC12, MSVC14 and MSVC15 toolchains on the same machine, and haven't had any issues since.
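As a rough illustration of "something unique" for all four variables, here is a minimal Python sketch. It is an assumption-laden example, not an official recipe: the name isolated_msbuild_env, the "msbuild-" directory prefix and the use of a random UUID for the endpoint are all choices made for this sketch.

```python
import contextlib
import os
import shutil
import tempfile
import uuid


@contextlib.contextmanager
def isolated_msbuild_env(base_env=None):
    """Yield an environment with a unique mspdbsrv endpoint and private temp dir."""
    base_env = os.environ if base_env is None else base_env
    tmp = tempfile.mkdtemp(prefix="msbuild-")
    try:
        env = dict(base_env)
        # Random value, so every invocation gets its own mspdbsrv instance.
        env["_MSPDBSRV_ENDPOINT_"] = uuid.uuid4().hex
        # Point all three temp variables at the same private directory.
        for name in ("TMP", "TEMP", "TMPDIR"):
            env[name] = tmp
        yield env
    finally:
        # Remove the private temp directory once the build has finished.
        shutil.rmtree(tmp, ignore_errors=True)
```

Inside the with-block the yielded dictionary can be passed as env= to subprocess.call together with whatever msbuild command line the job uses.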
The "official" fix for this problem (trying not to kill the job scheduler) is plain wrong, and causes massive issues. Mostly because MSBuild itself isn't exactly stable either when using the same job server for multiple parallel builds. And if the builds are using different toolchains, a crash is ensured.
I used ext3h's solution:
we solved it like this in a jenkins github multi-branch setup with jenkinsfiles:
bat """
mkdir tmp
set _MSPDBSRV_ENDPOINT_=${BUILD_TAG}
set TMP=${Workspace}\\tmp
set TEMP=${Workspace}\\tmp
set TMPDIR=${Workspace}\\tmp
build.bat
"""
[Originally duplicated by: JENKINS-24753]
[Originally related to: JENKINS-3105]
I run into errors when using a customized build system which uses Visual Studio's devenv.exe under the hood to compile VisualStudio 2005 projects (with VC++ compiler). When starting two parallel builds with Jenkins (on different code base) the second job will always fail with "Fatal error C1090: PDB API call failed, error code '23' : '(" in exactly the same second the first job finishes processing. Running both jobs outside Jenkins does not produce the error.
This has also been reported for builds executed by MSBuild on the Jenkins user mailing list [1].
I analysed this issue thoroughly and can track the problem down to the usage of mspdbsrv.exe. This program is automatically spawned when building a Visual Studio project. All Visual Studio instances normally share one common pdb-server which shuts itself down after an idle period (the default is 10 minutes). "It ensures access to .pdb files is properly serialized in parallel builds when multiple instances of the compiler try to access the same .pdb file" [2].
I assume that Jenkins cleans up its build environment when an automatically started job finishes (as described at http://wiki.jenkins-ci.org/display/JENKINS/Aborting+a+build). I checked mspdbsrv.exe with ProcessExplorer and the process indeed has a variable JENKINS_COOKIE/HUDSON_COOKIE set in its environment if started through Jenkins. Killing mspdbsrv.exe while projects are still connected will break compilation.
Jenkins mustn't kill mspdbsrv.exe to be able to build more than one Visual Studio project at the same time.
–
[1] http://jenkins.361315.n4.nabble.com/MSBuild-fatal-errors-when-build-triggered-by-timer-td385181.html
[2] http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/b1d1bceb-06b6-47ef-a0ea-23ea752e0c4f/
Originally reported by gordin, imported from: Visual studio builds started by Jenkins fail with "Fatal error C1090" because mspdbsrv.exe gets killed