Closed davidjward30 closed 6 years ago
Hm. I'm not sure what could cause such an error. My guess is that starting up the out-of-process analyzer service takes more time than expected. When the client (the NsDepCop MsBuild task) receives such an error it automatically tries to reconnect a couple of times, waiting more and more between retries, but the longest time it waits is 5 sec. After that it gives up and quits with an exception.
Unfortunately the time to wait between retries is not configurable. The retry intervals are defined in source code, here: https://github.com/realvizu/NsDepCop/blob/master/source/NsDepCop.MsBuildTask/DependencyAnalyzerClient.cs
Let me create a special build for you that reads the retry intervals from config so you can tweak it to see if that solves the problem. I'll let you know when I'm done.
In the meantime if you're willing to experiment yourself then you can download the source and create your own build of the tool with longer retry intervals. Here's how to build the tool: https://github.com/realvizu/NsDepCop/blob/master/Contribute.md
If raising the retry intervals proves to solve the problem then I'll release the build that makes the retry intervals configurable.
Please try this build: https://ci.appveyor.com/project/realvizu/nsdepcop/build/1.7.2.171/artifacts Download the updated nuget package (1.7.2-beta1), put it on a local nuget feed, and update your failing projects to use this version. Please note that it will appear as a prerelease package.
You can try it without modifying your config.nsdepcop files, because the default behavior is to retry the failed call 4 times with these wait intervals: 100ms, 500ms, 1000ms, 5000ms. This should be enough time for the analyzer service to spin up. (The previous version, 1.7.1, had a bug and did not perform the last call after the 5000ms wait time. This is now fixed.)
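To illustrate the retry behavior described above, here is a minimal Python sketch (not the actual C# code from DependencyAnalyzerClient.cs). The function name `call_with_retries`, the use of `ConnectionError` as the failure signal, and the injectable `sleep` parameter are assumptions made for the example. The point it shows is that the call is attempted once more after the final wait, which is the step 1.7.1 reportedly skipped.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

# Default wait intervals quoted in the thread, in milliseconds.
DEFAULT_RETRY_WAITS_MS = (100, 500, 1000, 5000)


def call_with_retries(
    call: Callable[[], T],
    waits_ms: Sequence[int] = DEFAULT_RETRY_WAITS_MS,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `call` once, then once more after each wait interval.

    Hypothetical helper: ConnectionError stands in for whatever error
    the real client gets when the service is not listening yet.
    """
    last_error = None
    # One initial attempt plus one retry per wait interval, so the call
    # is also made after the last (longest) wait.
    for attempt in range(len(waits_ms) + 1):
        try:
            return call()
        except ConnectionError as error:
            last_error = error
            if attempt < len(waits_ms):
                sleep(waits_ms[attempt] / 1000.0)
    raise RuntimeError("Unable to communicate with the service") from last_error
```

With the default intervals this makes at most five calls in total: the initial one and four retries.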
If the builds still fail then try to set higher wait intervals by modifying your config.nsdepcop files. Add the AnalyzerServiceCallRetryTimeSpans attribute to the root element. The value should be a comma-separated list of wait times between retries (in milliseconds).
E.g. this config waits 100ms, then 1sec, then 10sec:
<NsDepCopConfig AnalyzerServiceCallRetryTimeSpans="100,1000,10000">
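For readers who want to sanity-check a value before committing it, the following Python sketch shows how such an attribute could be read and validated. It is only an illustration: the function name `read_retry_waits_ms` and the fallback to the default intervals are assumptions, and the thread does not show how NsDepCop itself parses the attribute.

```python
import xml.etree.ElementTree as ET


def read_retry_waits_ms(config_xml, default=(100, 500, 1000, 5000)):
    """Return the retry wait times (ms) from a config.nsdepcop document.

    Hypothetical helper; falls back to `default` when the attribute
    is missing or empty.
    """
    root = ET.fromstring(config_xml)
    raw = root.get("AnalyzerServiceCallRetryTimeSpans")
    if raw is None or not raw.strip():
        return list(default)
    # Comma-separated list of millisecond values, e.g. "100,1000,10000".
    waits = [int(part.strip()) for part in raw.split(",")]
    if any(wait < 0 for wait in waits):
        raise ValueError("wait times must not be negative")
    return waits


example = '<NsDepCopConfig AnalyzerServiceCallRetryTimeSpans="100,1000,10000" />'
waits = read_retry_waits_ms(example)
```

Here `waits` holds the three intervals from the example config, in milliseconds.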
Thanks very much. I will try that build. The other related issue we've seen is the inability to delete a working folder on the server. This would suggest a process that has outlived its msbuild task. A typical example of the error looks like this:
Clean build enabled: removing old files from D:\Agent01\work\52769be47d601787
[16:40:47]Failed to delete file: D:\Agent01\work\52769be47d601787\xxx\Packages\NsDepCop.1.7.1\tools\NsDepCop.Core.dll
[16:40:47]Failed to delete file: D:\Agent01\work\52769be47d601787\xxx\Packages\NsDepCop.1.7.1\tools\NsDepCop.MsBuildTask.dll
Some process clearly has the assemblies locked. Next time it occurs I can check what that process is.
Regarding the inability to delete working folders: NsDepCop runs in its own process (NsDepCop.ServiceHost.exe) and shuts down when its parent process (msbuild.exe) shuts down. So if NsDepCop is still running at file deletion time then either its parent process hasn't exited yet or the shutdown of NsDepCop has started but hasn't finished yet. So you should try the following:
Make sure that msbuild.exe really shuts down after the build. There's a feature in msbuild called "node reuse" that does just this: keeps the msbuild.exe process(es) running which in turn keeps NsDepCop process(es) running. So make sure that msbuild node reuse is turned off. It's a bit confusing because there's an msbuild command line switch and also an environment variable that controls this behavior. (https://github.com/Microsoft/msbuild/wiki/MSBuild-Tips-&-Tricks)
If it's not msbuild.exe that keeps NsDepCop running then it can also be a race condition between the NsDepCop process shutting down and the file deletion process. Try to insert some delay (a couple of seconds) after msbuild has finished and before the deletions, to give NsDepCop time to finish shutting down.
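The two suggestions above could be scripted on a build agent roughly as follows. This is a sketch under assumptions, not something from the NsDepCop docs: the helper names are made up, the wait times are arbitrary, and it relies on the MSBuild `/nodeReuse:false` switch and the `MSBUILDDISABLENODEREUSE` environment variable, which I believe are the two controls the linked wiki page refers to. Please check that page for your MSBuild version.

```python
import os
import shutil
import time


def msbuild_command(project_path):
    """Build an msbuild command line with node reuse switched off."""
    # /nodeReuse:false asks MSBuild not to keep worker nodes alive
    # after the build finishes.
    return ["msbuild", project_path, "/nodeReuse:false"]


def msbuild_environment(base_env=None):
    """Copy the environment and disable node reuse via the env variable."""
    env = dict(os.environ if base_env is None else base_env)
    env["MSBUILDDISABLENODEREUSE"] = "1"
    return env


def delete_with_retries(path, waits_s=(1, 2, 5), sleep=time.sleep):
    """Delete a folder, waiting between attempts while files are locked.

    Returns True once the folder is gone, False if it is still locked
    after the last wait. The wait times are illustrative only.
    """
    for attempt in range(len(waits_s) + 1):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            return True  # already gone, nothing to do
        except OSError:
            # Most likely a locked file; give the process time to exit.
            if attempt == len(waits_s):
                return False
            sleep(waits_s[attempt])
    return False
```

The command and environment would be passed to whatever launches msbuild on the agent (for example `subprocess.run(msbuild_command(path), env=msbuild_environment())`), and `delete_with_retries` would replace a plain delete of the working folder.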
One more thing: both the NsDepCop exceptions ("Unable to communicate with NsDepCop service") and this file locking problem could be caused by the build server being unusually slow (when starting up and shutting down processes). Could this be the case? Is it maybe low on resources (especially CPU or disk)? So it might also be worth trying to give the build machine some more resources and see if that solves both NsDepCop problems.
Quick update:
This issue came up in my current project too. When NsDepCop analysis takes more than about 4-5 sec it logs a RemotingException and quits with an error result.
I've upgraded NsDepCop to 1.7.2-beta1 and set the following retry time spans in the solution-level config.nsdepcop file:
AnalyzerServiceCallRetryTimeSpans="100,1000,5000,10000,30000"
No errors since then (so far).
Note to self: check to make sure that no resources are leaked during the analyzer service communication.
@davidjward30 Could you give me an update on the status of this? Did the new config (AnalyzerServiceCallRetryTimeSpans) solve the problem?
@realvizu I think that the stability is much improved in 1.7.2-beta1. We're sticking with this version for now.
We have run into another issue which may be present in both versions. I'll report separately.
Thanks.
David.
Getting this with version 1.7.1 on my colleague's machine - do you know when 1.7.2 is going to be released?
We tried dropping it to 1.7.0 and it started working ¯_(ツ)_/¯ ...
I was planning to skip 1.7.2 and make 1.8.0 the next release (to save time on testing). But 1.8.0 is taking longer than I expected so probably I should reconsider giving 1.7.2 a proper testing and release it. I'll decide soon.
By going back to 1.7.0 you can avoid this problem but at the cost of a much slower analysis. The new feature of 1.7.1 was to run NsDepCop in its own dedicated process to avoid the cost incurred by MsBuild creating the analyzer repeatedly for each project. But it introduced the problem that MsBuild must wait for this dedicated NsDepCop process to spin up and start listening on a named pipe. And if it takes longer than expected then it fails. 1.7.2 does nothing else but makes the wait time configurable so instead of failing it can wait longer.
Thanks, we'll stick with 1.7.0 for the time being, and you can bring in 1.8.0 at your leisure!
Please try this new version: https://www.nuget.org/packages/NsDepCop/1.8.0-beta1
Closing it with v1.8.0 release. Please reopen if the problem persists.
We frequently get build failures on our CI build agents:
error NSDEPCOPEX: Exception during NsDepCopTask execution: System.Exception: Unable to communicate with NsDepCop service. Exception: System.Runtime.Remoting.RemotingException: Failed to connect to an IPC Port: The system cannot find the file specified.
Running the build again usually succeeds.