Closed chapmandu closed 1 year ago
Yeah, there's a perf overhead; but I don't see 400+ percent, I run like 20% which isn't great but enough to not really be noticeable for my dev work.
I have a test suite putting out numbers over 4 runs like
plain with debugger running (vscode not attached)
270,128ms 333,610ms
286,091ms 324,133ms
276,942ms 322,505ms
275,889ms 327,482ms
------ -------
avg 277,262.5ms 326,932.5ms overhead 18%
I wonder what the difference in our setups is. My test suite is slow all on its own because it's entirely single threaded -- maybe you are highly concurrent and so are experiencing heavy contention on some ConcurrentMaps that doesn't happen over here.
Our test suite makes (synchronous) requests to an endpoint for each test package. So lots of small requests rather than one long one.
If you get a chance, can you take a flight recording, with a focus on method profiling and GC behavior, and report any oddities? Maybe it's stuck on a lock somewhere, or it's behaving badly against your GC configuration. I can't currently reproduce the magnitude of overhead you are experiencing.
I've been trying this out as well and was am super excited but noticed a big performance degradation. A request that would take 4 seconds i now taking 30 seconds in a 4 core VM.
Is there anything I can provide to help with debugging this issue?
@Symbianx can you take a perf trace via JMC or a similar tool? I am curious where it's spending its time. What platform are you on -- you guys both on Windows?
I really should have given these details in my initial post:
Ubuntu Linux Lucee 5.3.10.97 Commandbox Docker Container
Neat, that's almost my setup, too. That rules out the "it's some weird Windows phenomenon" theory.
You might find something interesting with JMC
I get a good connection to it spawning Java with
-Dcom.sun.management.jmxremote.port=<some-port>
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
I don't have any good theories on a particular culprit at the moment. Do the workloads involve spawning thousands of threads which are all expected to complete very quickly?
It will probably be a few days before I get to tinker with it again, but my application doesn't use any cfthread or async functions. FWIW I use the cfwheels framework. I'd be interested to hear what setup @Symbianx uses, perhaps there is a common factor?
In our case, we're using the commandbox docker image based on alpine. This runs together with coldbox as the framework.
~The host it runs on was already running with few resources so I'm wondering if this could be a garbage collection loop due to a memory leak on luceedebug.~ I freed some resources and the problem still happens.
I'm gonna try to get something out of JMC
Alright, I bring some more info! I ran JMC and one of the things it mentions is that 812000 exceptions were thrown in a 1 min period, during the execution of the 30s requests.
Looks like the majority of the exceptions were thrown by DebugFrame luceedebug.coreinject.DebugManager.pushCfFrame_worker(PageContext, String, int)
. Please see the attached screenshot.
Due to the nature of the software, I prefer to not share the full JMC recording file but let me know if there's more info you would like me to share.
I think the one having "session scope is not enabled" can be addressed. The one having the null
message, is there any additional info there, like a file or line it is always thrown from?
The session scope exceptions probably worked around w/
- this.session = getScopeOrNull(() -> pageContext.sessionScope());
+ this.session = null;
Indeed, you are correct about the session scope is not enabled
one, was able to track it down to https://github.com/softwareCobbler/luceedebug/blob/master/luceedebug/src/main/java/luceedebug/coreinject/DebugFrame.java#L97
Was trying to do the same backtrack to the null
one but it also points towards the same line in DebugFrame.java
. This is my first time using JMC though, I could be doing something wrong.
Strong candidates are tryGetFrameName
and all the similar invocations of getScopeOrNull
. I don't think too much else can throw (so frequently) from this area.
Alright, changing this line to null
as you mentioned fixed the slow performance and the debugger still works. 🎉
Should I make a PR to set it to null or will this cause issues with something else?
Oh that's good to hear; thanks for looking into it!
The only behavioral side effect of setting this to null is that session
won't be inspectable in the debugger, which if your session scope is disabled, is to be expected.
I'd accept a PR, but I think the more general behavior would have to be:
session
oncesession
(on the assumption that it's not going to magically get enabled later)I found a method to determine wether session management is enabled or not. (Check #39) However I'm not able to test it since we have it disabled.
@Symbianx I ran it over here, and continue to have session scope in debug output, so it seems like it does what it says. I wonder if your PR solves @chapmandu's slowness, too?
I'll try to reinstall and test today.. interestingly I also have sessions disabled. Good work!
I've run our test suite with the new jar & without the jar installed. There is now about a 5% overhead which I am very happy to live with.. Well done!!
It appears that having the luceedebug.jar installed has a significant performance overhead.
The results below is a snippet of some integration test output, with and without the luceedebug.jar loaded via java args.
Note that I don't have the step debugger running in vscode, it's simply installed, and running my test suite.
I'm happy to live with a bit of overhead, but my test suite's execution time goes from 6 mins to 28 mins.
I concede that the "listings" controller does a fair bit of work, but the "mfadevices" controller really does not. Interestingly, the memory and CPU don't really run as high as I would have expected in relation to the response time.
Without luceedebug.jar loaded
With luceedebug.jar loaded
I'm happy to help wherever I can.