apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0

Reduce test duplication in tests for data page statistics #11000

Open alamb opened 1 month ago

alamb commented 1 month ago

Follow on to https://github.com/apache/datafusion/pull/10982

@tshauck noted there is non-trivial repetition in some tests.

@tmi says:

@tshauck this is one of the things that were bugging me a bit, the other one being https://github.com/apache/datafusion/pull/10982/files#diff-7110f4709c105a18ef74a212396444d62052179a735d148fb62470a8b157fb40R749-R763 -- both are very repetitive

however, I didn't want to get overeager, only to realize later that the abstraction chosen was not the right one. Perhaps the best way forward would be to address https://github.com/apache/datafusion/issues/10952 first (which may also have its own "float16"-like tricky case), and then get the correct macros for testing, index handling, etc. Speaking of that, would you like to take that one? I'd be happy to review then

_Originally posted by @tmi in https://github.com/apache/datafusion/pull/10982#discussion_r1645504130_

alamb commented 2 weeks ago

I propose we consolidate / review the tests when we move this code upstream: https://github.com/apache/arrow-rs/issues/4328