Closed chmorgan closed 2 months ago
@whmountains thought we could track the performance stuff here.
@whmountains it would be neat if we had these profiling calls built into the library, I'm having to add them here but if we call these functions rarely the overhead of the profiling calls (using time.time() difference here) should be very low.
@whmountains, I'm seeing things take much less time here, M3 MacBook:
/Users/kicad/remoteroot/workspace/build-macos-kicad-release/kicad-mac-builder/build/kicad/src/kicad/pcbnew/action_plugin.cpp(163): assert "PgmOrNull()" failed in register_action().
starting jlcpcbtools standalone mode...
duration 1.013922929763794
cmorgan@MacBook-Pro plugins %
Opening 'select part' is ~1 second.
Sorting takes a handful of seconds.
Thanks for checking into that @chmorgan !
I wonder what could be the source of the speed difference? The only thing I can think of is that your M3 MacBook is definitely faster than my 2020 MBP. Still, a factor of 10 seems like a lot.
Are we talking about the same function? The slowness I am observing is with populate_footprint_list
in mainwindow.py, not populate_part_list
in partselector.py. I'm guessing you already know that but I just wanted to check since you mentioned opening 'select part' which is not something I even benchmarked. I'm sort of grasping at straws here. The big slowdown I saw here was upon closing 'select part'.
diff --git a/mainwindow.py b/mainwindow.py
index 9cfa916..2261925 100644
--- a/mainwindow.py
+++ b/mainwindow.py
@@ -559,6 +559,7 @@ class JLCPCBTools(wx.Dialog):
def populate_footprint_list(self, *_):
"""Populate/Refresh list of footprints."""
+ start = time.time()
if not self.store:
self.init_store()
self.footprint_list.DeleteAllItems()
@@ -626,6 +627,8 @@ class JLCPCBTools(wx.Dialog):
break
self.footprint_list.AppendItem(part)
+ end = time.time()
+ print(f"duration {end - start}");
def OnSortFootprintList(self, e):
"""Set order_by to the clicked column and trigger list refresh."""
Looks like we are benchmarking the same thing with the same technique.
Funny thing happened while testing today. While testing #452 I noticed it was a lot faster and added the timing code. Sure enough populate_footprint_list
was taking exactly 2.00s
. Every single time. (I formatted it to two decimal places so there could have been some slight variation beyond that.)
So I went back to the latest from the official repo, and am seeing over 14s and a spinning beachball again:
starting jlcpcbtools standalone mode...
duration 14.247678995132446
2024-04-25 10:25:55.205 Python[32583:7910929] This application is trying to draw a very large combo box, 30 points tall. Vertically resizable combo boxes are not supported, but it happens that 10.4 and previous drew something that looked kind of sort of okay. The art in 10.5 does not break up in a way that supports that drawing. This application should be revised to stop using large combo boxes. This warning will appear once per app launch.
duration 14.376977920532227
There must be some hidden variable which is causing the huge speed difference.
For consistency's sake, let's define a standard way of testing. I propose the following:
git status
shows no modified files like settings.json
.git apply ../add_timing.patch
rm -r <plugin_dir>/jlcpcb
rm -r <workdir>/jlcpcb
C1002
from the list, search for it if neccesary, and click "select part"After following those steps many times trying out different configurations, I have some observations:
0.0025s
. I can quit and relaunch as many times as I want but this first call is consistently 1-3ms. I suspect this is more related to the standalone mode stubbing than I also sporadically observed:
ANALYZE
on the parts db.jlcpcb
directory but not the global databases.I could not reproduce either of these reliably. But I can say anecdotally that the 13-15s performance band is what I was stuck in on my last real-world project using this tool. Was quite annoying for sure!
In summary I think we can say the following:
populate_footprint_list
is highly variable.I'm currently suspecting disk cache. It makes sense that the DB would be cached after the first query, and a 10x speedup in table scan speed would not be unexpected if the database file happened to be cached in RAM. I have 32Gb of RAM on my MBP so a lot of caching is certainly happening.
Update: I now am quite certain that this is related to the database file getting cached in RAM.
Steps to reproduce on MacOS:
sudo purge
in the terminal. You should see the "cached files" value go down.If disk cache is indeed the issue, it will be hard to reproduce this consistently because it is up to the OS to decide when to cache and when not to cache. On my system, and probably yours as well @chmorgan, the OS has figured out that the parts DB is a hot file and aggressively caches it. On the system of a new user, or on any system with a lot of other apps open, it would be much less likely to get cached.
Something in populate_footprint_list
is forcing a full table scan, but our high-end computers are stepping in and sparing us the inefficiency. I really hope we can figure out the proper optimizations such that it's not necessary to read through 4Gb of data every time a part is selected. And hopefully not at startup either. 🤪
Is this also fixed in #503 ?
@Bouni it sounds like the startup issue is resolved with #503 yes. I'll close it as we know there are downsides to full keyword search performance in particular situations.