Closed by d33bs 3 months ago
Thinking on this more and exploring a bit, outlining thoughts and findings so far below. This appeared again in #181, so I'm focusing on figuring out more of the reasons why this might occur.
When this happens, the following appear to be consistent patterns:
python==3.12 (work in #174)
duckdb==0.10.1
pyarrow==15.0.2
As a quick check, I tried verifying that PyArrow sorting works the way it should. It does appear to properly sort all values by all columns when implemented the way it is in the CytoTable tests. See here for code demonstrating this.
I think there are several other possible explanations for what's occurring, which I'll work through one at a time to verify what's happening.
From #174:
Because of the importance of this issue, adding that we need example cases where the fix has been validated with larger-than-testing datasets.