Closed — diegodoimo closed this pull request 2 years ago.
Merging #72 (17c7f24) into main (f290feb) will decrease coverage by 1.81%. The diff coverage is 76.19%.
```diff
@@            Coverage Diff             @@
##             main      #72      +/-   ##
==========================================
- Coverage   80.61%   78.79%   -1.82%
==========================================
  Files          10       10
  Lines        1145     1160      +15
==========================================
- Hits          923      914       -9
- Misses        222      246      +24
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| dadapy/density_estimation.py | 76.29% <75.00%> (-0.40%) | :arrow_down: |
| dadapy/_utils/density_estimation.py | 54.83% <76.47%> (-34.96%) | :arrow_down: |
Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Powered by Codecov. Last update f290feb...17c7f24.
Thanks @diegodoimo, this is a change that @imacocco and I had in mind for a long time. Apart from the small code improvements suggested, I have the following curiosity: for small datasets, do we see a near-zero improvement, or do we actually see worse results with the full Cython implementation?
It is better in any case. This can be easily tested, as I left the flag `optimized` to select the original implementation when set to `False`. (I kept the flag because I didn't want to remove a large piece of the code until we are sure the optimized version is fine.)
I added a faster version of PAk, vectorizing all the for loops and adding a Cython version that optimizes the likelihood over the full dataset rather than one point at a time. Unfortunately, the performance improvement is appreciable only for relatively large data sizes:

- 15% faster for 90k points
- 50% faster for 250k points
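As an illustration of the kind of vectorization described above (this is a minimal sketch, not the actual dadapy code; the array names and the simplified per-point log-likelihood are assumptions for demonstration), a per-point Python loop can often be replaced by a single NumPy expression over the full dataset:

```python
import numpy as np


def log_likelihood_loop(k, volumes):
    # Naive version: evaluate a simplified per-point log-likelihood,
    # k_i * log(k_i / V_i), one point at a time in a Python loop.
    out = np.empty(len(k))
    for i in range(len(k)):
        out[i] = k[i] * np.log(k[i] / volumes[i])
    return out


def log_likelihood_vectorized(k, volumes):
    # Vectorized version: the same computation expressed as one NumPy
    # operation over all points, avoiding Python-level loop overhead.
    return k * np.log(k / volumes)


# Hypothetical inputs: neighbor counts k_i and shell volumes V_i per point.
rng = np.random.default_rng(0)
k = rng.integers(1, 30, size=100_000).astype(float)
volumes = rng.uniform(0.1, 2.0, size=100_000)

assert np.allclose(log_likelihood_loop(k, volumes),
                   log_likelihood_vectorized(k, volumes))
```

As with the benchmarks quoted in the PR, the speedup from this kind of rewrite only becomes noticeable once the dataset is large enough to amortize the fixed per-call overhead.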