It is expected, although we can probably filter out a bunch of artifacts (related to #3). The idea is that when we try to gather metadata for a particular version of a package, we check whether we already have any artifact for it cached (note the cache_mode=OnlyIfCached). This only queries the disk cache for existence and should be fairly quick. The metadata for all artifacts of a version "should" be the same, so this seems like a nice optimization.
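Very roughly, and with all names invented for illustration (this is not rip's actual cache API), the check amounts to something like:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical: where a cached artifact would live on disk.
fn artifact_cache_path(cache_dir: &Path, name: &str, version: &str, filename: &str) -> PathBuf {
    cache_dir.join(name).join(version).join(filename)
}

/// Before hitting the network, see if *any* artifact for this version is
/// already on disk; since the metadata should be identical across artifacts,
/// one cached wheel is enough.
fn find_cached_artifact<'a>(
    cache_dir: &Path,
    name: &str,
    version: &str,
    filenames: &'a [String],
) -> Option<&'a str> {
    filenames
        .iter()
        .map(|f| f.as_str())
        .find(|f| artifact_cache_path(cache_dir, name, version, f).exists())
}
```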
Looking at your logging statements, it appears to take about 5 ms (08.319171 - 08.313789) to check the cache, which seems reasonable. We could probably also do concurrent requests, but I'm not sure that would be much faster.
I also noticed you're running a debug build; for measuring performance it's worth running the binary in release mode:
cargo r --release -- PyWavelets
With #42 we now only consider wheels that are relevant to the installed Python interpreter.
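Conceptually (with invented types, not the actual implementation from #42), the filtering amounts to matching each wheel's tag triple against the tags the running interpreter supports:

```rust
/// Illustrative only: tags parsed from a wheel filename.
#[derive(Debug)]
struct WheelTags {
    python: String,   // e.g. "cp38"
    abi: String,      // e.g. "cp38"
    platform: String, // e.g. "manylinux2014_x86_64"
}

/// Keep only wheels whose tag triple the current interpreter supports.
fn relevant_wheels<'a>(
    wheels: &'a [WheelTags],
    supported: &[(String, String, String)],
) -> Vec<&'a WheelTags> {
    wheels
        .iter()
        .filter(|w| {
            supported
                .iter()
                .any(|(py, abi, plat)| *py == w.python && *abi == w.abi && *plat == w.platform)
        })
        .collect()
}
```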
Oh wow, that was fast! I was going to write a quick script to see how much real-world performance impact it had.
While 5 ms isn't much on its own, you're potentially doing it many thousands of times during a complex backtrack.
That's true! I imagine that in the future we could simply parallelize these things. Closing for now since #42 has been merged.
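For reference, a hypothetical sketch of parallelizing those per-candidate cache checks, e.g. with rayon (`is_cached` and the path list are stand-ins):

```rust
use rayon::prelude::*;
use std::path::PathBuf;

// In the real code this would be the OnlyIfCached lookup; here it's a
// plain existence check on a hypothetical cache path.
fn is_cached(path: &PathBuf) -> bool {
    path.exists()
}

// rayon splits the slice across its thread pool and preserves result order.
fn cached_flags(paths: &[PathBuf]) -> Vec<bool> {
    paths.par_iter().map(is_cached).collect()
}
```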
This is possibly a misunderstanding on my side, but on Linux with Python 3.8, when I run:
I get a lot of output showing it collecting wheels for macOS and Windows, and for Python versions I'm not running:
e.g.
Is this expected? It appears to slow rip down when the wheels aren't cached and it has to do a lot of backtracking. I noticed it particularly when trying this spec on Python 3.8 (Windows or Linux):