brunotag closed this 4 years ago
@brunotag So you're saying this is attributable to the version you pushed? Do we need to roll it back, or do you have an idea of how to fix it?
Also @brunotag, can you run in verbose mode and send me the output?
I can't replicate it locally (so I can't easily send the verbose output); it only happens on certain servers, after some time.
Windows Server 2012 R2, IIS 8.5.
We could turn on verbose mode on such servers but that might take some time.
It seems to affect the "ASP.NET Applications" category only, so I suspect it is related, somehow, to trying to apply perf counters to AppDomain (instances) that don't exist anymore.
So perhaps if we add some controls to catch such an exception and remove instances that don't exist when that happens?
Yes, the exception is thrown here https://github.com/newrelic/nri-perfmon/blob/master/nri-perfmon/Plugin.cs#L510, and caught here https://github.com/newrelic/nri-perfmon/blob/master/nri-perfmon/Plugin.cs#L522, and the problem is that it clutters the logs.
I can't tell the type of the exception from the logs but I suspect it is an InvalidOperationException thrown by this piece of code from the .NET Framework https://referencesource.microsoft.com/#System/services/monitoring/system/diagnosticts/PerformanceCounterLib.cs,1569 . Again I can't reproduce it :-(
I thought about doing exactly what you are suggesting, but I am not sure where to do it: the code where the exception is thrown loops over the performance counters and "gets" the instance from each performance counter. I don't understand why / where / how it gets instances that don't exist anymore.
@brunotag I think I have fixed this issue. The code had no place where it would remove stale instances of counters, so to resolve this I made that message a VERBOSE one (so it is not normally seen) and now remove the offending counter from future executions. The whole list of counters is repopulated every time anyway, so if the instance reappears it will show up again. I also trimmed some of the code in the Populate method. Commit: https://github.com/newrelic/nri-perfmon/commit/a07c698fe24a0aa93f62ddde4ecba729255f779d
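For anyone skimming the thread, the idea behind the fix can be sketched roughly as below. This is an illustration in Python, not the actual C# from Plugin.cs: `StaleInstanceError`, `poll_counters`, and `read_value` are hypothetical names, and the real exception type was never confirmed in this thread.

```python
import logging

logger = logging.getLogger("perfmon-sketch")


class StaleInstanceError(Exception):
    """Hypothetical stand-in for the error raised when a counter's instance is gone."""


def poll_counters(counters, read_value):
    """Read each counter once and drop the ones whose instance no longer exists.

    counters:   list of hashable counter identifiers (assumed shape).
    read_value: callable that returns the current value for a counter,
                or raises StaleInstanceError if its instance has disappeared.
    Returns (samples, surviving), where surviving is the list to poll next time.
    """
    samples = {}
    surviving = []
    for counter in counters:
        try:
            samples[counter] = read_value(counter)
        except StaleInstanceError as exc:
            # Logged at debug ("verbose") level so normal logs stay uncluttered.
            logger.debug("Dropping stale counter %s: %s", counter, exc)
            continue
        surviving.append(counter)
    return samples, surviving


if __name__ == "__main__":
    def fake_read(counter):
        # Simulate an AppDomain instance that has gone away.
        if counter[2] == "recycled-app":
            raise StaleInstanceError("instance does not exist")
        return 12.0

    counters = [
        ("ASP.NET Applications", "Requests/Sec", "live-app"),
        ("ASP.NET Applications", "Requests/Sec", "recycled-app"),
    ]
    samples, counters = poll_counters(counters, fake_read)
    print(samples, counters)
```

Because the full counter list is rebuilt on each repopulation, a dropped counter comes back on its own once its instance exists again, so nothing extra is needed to re-add it.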
Here's the release, if you want to try it out. Let me know and hopefully we can close this out. https://github.com/newrelic/nri-perfmon/releases/tag/0.5.2-alpha
@brunotag ever have a chance to look at my fix? Can we close this out?
@sschwartzman I just got the news that the fix seems to work, in the 0.5.2-alpha version, so I think we can close the issue :)
https://github.com/newrelic/nri-perfmon/pull/17/commits/b01decc1d17770153bbb8dfe44da2e58e89dbe75 seems to cause exceptions for the "ASP.NET Applications" category :-(
Only version https://github.com/newrelic/nri-perfmon/releases/tag/0.5.1 seems to be affected.