j-c-c closed this issue 1 year ago.
Was able to reproduce the failing test on decaf. Here is the list of particle centers found by apple_picker.process_micrograph_centers() that do not match the reference list in the test:
(1321.0, 361.0)
(2793.0, 1239.0)
(1741.0, 1391.0)
(2703.0, 1737.0)
(861.0, 1839.0)
(965.0, 2305.0)
(749.0, 2473.0)
(699.0, 2627.0)
(2153.0, 2741.0)
(599.0, 2943.0)
(1247.0, 3039.0)
(2037.0, 3281.0)
(2501.0, 3493.0)
Each of these is off by two pixels (from a reference center) in either the x or y coordinate.
At this point, we may want to consider replacing these (hardcoded) tests with simulated data (put a couple of particle projections onto a micrograph and add noise). This way we can control the noise level, etc.
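A starting point for that simulated data could be a sketch along the following lines (the Gaussian blobs stand in for particle projections; the micrograph size, centers, and noise scale are all invented for illustration, not taken from the real pipeline):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sketch: paste small Gaussian "particles" at known centers
# onto a blank micrograph, then add Gaussian noise we control.
mic = np.zeros((512, 512))
y, x = np.mgrid[-8:9, -8:9]
particle = np.exp(-(x**2 + y**2) / (2 * 4.0**2))  # peak value 1.0 at center

centers = [(100, 150), (300, 320), (400, 80)]  # ground-truth centers
for cy, cx in centers:
    mic[cy - 8:cy + 9, cx - 8:cx + 9] += particle

mic += rng.normal(scale=0.1, size=mic.shape)  # noise level under our control
```

With ground truth in hand, the test could assert recovered centers against `centers` directly instead of hardcoded reference values.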
It appears the culprit here is a change to np.argsort() that came with the release of NumPy 1.25. The divergent behavior occurs only on Linux, in this line of code:
idx = np.argsort(-np.reshape(score, (np.prod(score.shape)), "F"))
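For context, here is a hedged sketch of that same flatten-and-rank step with an explicit stable kind (the score matrix below is made up for illustration, not the picker's real scores):

```python
import numpy as np

score = np.array([[3, 1],
                  [3, 2]])  # hypothetical 2-D score matrix with a tie

# Flatten in Fortran (column-major) order, then rank descending.
flat = np.reshape(score, np.prod(score.shape), order="F")  # [3, 3, 1, 2]
idx = np.argsort(-flat, kind="stable")  # stable: ties keep original order

print(idx)  # [0 1 3 2]: the two tied 3s stay in their original order
```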
So far I have confirmed that the divergent behavior only occurs for the default sorting kind, i.e. kind='quicksort'; the other three sorting methods ('mergesort', 'stable', 'heapsort') maintain consistent behavior between NumPy versions.
A solution here might be to switch to a more stable sorting method. I will confirm whether the other sorting methods produce reasonable results for test_apple.py.
Good sleuthing! The question remains why quicksort changes behavior here (and only on Linux). From what I understand, all sorting algorithms are deterministic, so they should all give the same result. Maybe a bug has been introduced in NumPy?
I don't think all sorting algorithms are deterministic; that is why they mention stable ones (mainly with regard to what happens on ties).
My understanding was that stability just meant that all stable algorithms broke ties the same way. If we stick to the same algorithm and run it multiple times, we expect it to give the same result each time, even with ties, correct?
They (np) claim this, but it has not been my personal experience straddling different versions of code...
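To pin down the claim: within a single NumPy build, repeated calls on the same input do return identical results; it is only across versions, platforms, or SIMD dispatch paths that the tie order can change. A quick check of the within-build case:

```python
import numpy as np

arr = np.array([2, 2, 1, 1, 2, 1])  # deliberate ties

# Same build, same input -> same output, even for quicksort on ties.
runs = [np.argsort(arr, kind="quicksort") for _ in range(5)]
all_same = all(np.array_equal(runs[0], r) for r in runs[1:])
print(all_same)  # True
```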
Ok got it.
It appears the AVX-512 argsort handles ties in a different way:
Linux w/ AVX-512:
>>> arr = np.array([2,2,2,2,2,1,1,1,1,1,3,3,2,3,1])
>>> np.argsort(arr)
array([ 6, 7, 5, 8, 9, 14, 3, 0, 4, 1, 2, 12, 11, 10, 13])
>>> np.__version__
'1.25.0'
>>> arr = np.array([2,2,2,2,2,1,1,1,1,1,3,3,2,3,1])
>>> np.argsort(arr)
array([ 5, 6, 7, 8, 9, 14, 0, 1, 2, 3, 4, 12, 10, 11, 13])
>>> np.__version__
'1.23.5'
Mac OSX:
>>> arr = np.array([2,2,2,2,2,1,1,1,1,1,3,3,2,3,1])
>>> np.argsort(arr)
array([ 5, 6, 7, 8, 9, 14, 0, 1, 2, 3, 4, 12, 10, 11, 13])
>>> np.__version__
'1.25.0'
>>> arr = np.array([2,2,2,2,2,1,1,1,1,1,3,3,2,3,1])
>>> np.argsort(arr)
array([ 5, 6, 7, 8, 9, 14, 0, 1, 2, 3, 4, 12, 10, 11, 13])
>>> np.__version__
'1.23.5'
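For comparison, kind='stable' on the same array reproduces the pre-1.25 ordering, since stable sorts guarantee that equal elements keep their original relative order regardless of version, platform, or SIMD dispatch:

```python
import numpy as np

arr = np.array([2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 3, 3, 2, 3, 1])

# Stable sort: ties keep their original relative order on every platform.
idx = np.argsort(arr, kind="stable")
print(idx)  # [ 5  6  7  8  9 14  0  1  2  3  4 12 10 11 13]
```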
Ok, this definitely comes down to the optimized argsort handling ties in a different manner. What's happening for this test is that the parameters tau1 and tau2 are set such that the number of particle windows used for SVM training falls in the middle of a set of argsort ties, meaning a few training windows will differ between the optimized and non-optimized versions.
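That cutoff effect can be shown with a toy example (the scores and the cutoff k below are invented; k plays the role of the tau1/tau2-derived window count):

```python
import numpy as np

scores = np.array([5, 5, 5, 3, 3, 3, 1])  # ties straddling the cutoff
k = 4  # hypothetical cutoff, like the tau-derived training window count

order_a = np.argsort(-scores, kind="stable")  # ties in original order
order_b = np.array([2, 1, 0, 5, 4, 3, 6])     # another valid sort order

# Both orders sort the scores identically...
assert np.array_equal(scores[order_a], scores[order_b])
# ...but the top-k window sets differ, because k lands inside a tie group.
print(sorted(order_a[:k].tolist()), sorted(order_b[:k].tolist()))
# [0, 1, 2, 3] [0, 1, 2, 5]
```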
I see two possible solutions:
1) Use a different sorting method, i.e. a stable sort. This would give the same result on all platforms.
2) Change tau1 and tau2 for this test so that the windows are chosen from a set that is not cut off in the middle of a group of ties.
Either way, I'll have to change the reference centers in the test, since slightly different particle centers are found.
A side note: I'm slightly suspicious of the score variable fed into the argsort being integers. It is calculated by the query_score method, which computes cross-correlations over query images/reference windows and returns a count of correlations above a certain threshold. Many of these counts are ties.
An alternative solution that doesn't involve changing the reference centers is altering the test to check that centers_found are within a few pixels of the reference centers.
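That check could look something like this hypothetical helper (centers_match is not an existing ASPIRE function; the 2-pixel tolerance comes from the offsets observed above):

```python
import numpy as np

def centers_match(found, reference, tol=2.0):
    """Check each found center is within tol pixels (per axis) of some
    reference center. Hypothetical helper, not part of the ASPIRE API."""
    found = np.asarray(found, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Pairwise Chebyshev distances between found and reference centers.
    d = np.abs(found[:, None, :] - reference[None, :, :]).max(axis=-1)
    return bool(np.all(d.min(axis=1) <= tol))

print(centers_match([(1321.0, 361.0)], [(1323.0, 361.0)]))  # True: off by 2 px
```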
I think maybe we can try doing both 1 and 2 together and hope for the best until we have the new MicrographSource.
Fixed by #965.
test_apple.py is failing for the test testPickCenters. I am currently attempting to reproduce the failing test locally.