Open qqibrow opened 5 months ago
There's already precedent for adding end to end integration tests for Prestissimo. I think it makes sense for there to be end to end integration tests for this in Presto, just as we have end to end integration tests that test specific functions (even though they have micro coverage in Velox as well).
@tdcmeehan thanks! could you share issues or PR example where to put such test? thanks!
Here are some examples of end to end tests which were added recently:
IEEE functions: https://github.com/prestodb/presto/pull/22469 CTAS tess: https://github.com/prestodb/presto/pull/20936
CC: @majetideepak @aditi-pandit @yingsu00
@qqibrow Hi Lu, the Presto native tests are at presto-native-execution/src/test/ directory as @tdcmeehan pointed out. Thanks @tdcmeehan ! Some other examples are like TestPrestoNativeAggregations, TestPrestoNativeWindowQueries, etc.
But these tests need to build a native query runner and run the SQL queries from end to end. For our purpose to test the Velox Parquet reader, we don't want to run the SQL queries from end to end, but just reuse Presto existing unit tests.
I think it's ok to put the tests in com/facebook/presto/hive/parquet as in your original PR [Testing] [Don't review] Velox parquet test use presto as the tests are in com/facebook/presto/hive/parquet/ParquetTester.java, or put it also in presto-native-execution/src/test/ . I don't have very strong opinion on whether to choose one over the other. @aditi-pandit What's your recommendation?
But I also noticed the Parquet code and tests are spreaded in both presto-hive and presto-parquet. I don't know why this is the case, but we can do some refactoring to merge them into once place later. @zhenxiao Do you know why? Do you have any recommendation where we should put the native Parquet reader tests?
@yingsu00 : Adding the tests in com/facebook/presto/hive/parquet as in the PR https://github.com/prestodb/presto/pull/21444/files#top looks good. I just had a quick look at private static void assertNativeReaderFileContents(..) function. That approach should be fine.
thank you @yingsu00 the placement of Parquet tests is a legacy issue. Initially, the old parquet reader(which has been totally removed) is part of presto-hive, so all testcases are in presto-hive. After we developed a new parquet reader(probably in 2016 or 2017?), we move parquet related code to presto-parquet, and added parquet tests there. But there are still some legacy dependency, make it challenging to separate parquet from hive cleanly. It is a good idea to merge all of them into one place, say, presto-parquet. I think the presto parquet testcases are extensive and have good coverage. Re-use them for velox would be awesome
We’ve developed a test solution to test velox velox native parquet reader using presto unit tests. we want to discuss whether we can add that in CI in presto project. There is already a discussion thread in velox https://github.com/facebookincubator/velox/discussions/9392. According to Masha, prestodb seems a better place for this test.
Expected Behavior or Use Case
Test coverage in velox parquet is not enough. Testing velox parquet using presto unit test is the most practical approach for now.
Presto Component, Service, or Connector
presto-parquet unit test
Possible Implementation
Check high level implementation and draft PR here https://github.com/facebookincubator/velox/discussions/9392
Example Screenshots (if appropriate):
Context
For more details please check https://github.com/facebookincubator/velox/discussions/9392.