Open sujeshchirackkal opened 8 years ago
If we treat each row of the input data as a test case, this becomes more complicated, because the generator would then need to produce data covering every filter and group scenario. Is that the kind of generator we are looking for, or are we focusing only on generating text files that conform to the input Thrift schema?
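To make the "row as a test case" option concrete, here is a minimal sketch of what scenario coverage could look like. It is only an illustration: the column names and values in `SCENARIO_VALUES` are hypothetical, not taken from any actual input schema, and `generate_rows` / `to_delimited_lines` are made-up helper names. It emits one row per combination of values that the filters and group keys would need to see.

```python
import itertools

# Hypothetical columns (assumption, not from the real schema) and the values
# each filter/group branch would need to encounter at least once.
SCENARIO_VALUES = {
    "country": ["US", "IN"],           # group-by key
    "status": ["active", "inactive"],  # filter column
    "amount": [0, 100],                # boundary values for a numeric filter
}


def generate_rows(scenario_values):
    """Yield one row (as a dict) per combination of the listed column values."""
    columns = list(scenario_values)
    value_lists = [scenario_values[col] for col in columns]
    for combo in itertools.product(*value_lists):
        yield dict(zip(columns, combo))


def to_delimited_lines(rows, delimiter="\t"):
    """Render rows as delimited text lines, one line per row, no header."""
    return [delimiter.join(str(value) for value in row.values()) for row in rows]


if __name__ == "__main__":
    for line in to_delimited_lines(generate_rows(SCENARIO_VALUES)):
        print(line)
```

The row count grows multiplicatively with each column added, which is the complication mentioned above; a generator that only needs schema-conforming files could skip the cross product and emit arbitrary valid values instead.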
As part of prep, we already do a similar step: we create Parquet files laid out as if they were in a partitioned table. Could that be considered here as well?
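For reference, a rough sketch of the partitioned layout, assuming the common Hive-style `key=value` directory convention. To keep it dependency-free it writes CSV files as a stand-in for Parquet; a real version would swap in a Parquet writer. The function name `write_partitioned` and the file name `part-00000.csv` are hypothetical.

```python
import csv
import os
from collections import defaultdict


def write_partitioned(rows, base_dir, partition_cols, filename="part-00000.csv"):
    """Write rows into key=value subdirectories, one file per partition.

    CSV is used here only as a stand-in for Parquet (assumption for the sketch).
    Returns the sorted list of file paths that were written.
    """
    # Group rows by the values of their partition columns.
    groups = defaultdict(list)
    for row in rows:
        key = tuple((col, str(row[col])) for col in partition_cols)
        groups[key].append(row)

    written = []
    for key, part_rows in groups.items():
        part_dir = os.path.join(base_dir, *[f"{col}={val}" for col, val in key])
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, filename)
        # Partition values are encoded in the path, so omit them from the file.
        data_cols = [col for col in part_rows[0] if col not in partition_cols]
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=data_cols, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(part_rows)
        written.append(path)
    return sorted(written)
```

Calling `write_partitioned(rows, "out", ["dt"])` on rows that carry a `dt` field would produce directories such as `out/dt=2017-01-01/`, each holding one data file.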
This is not a high priority, but it would be good to have a custom test data generator (we could also explore other open source repos that can be reused) to make testing easier for users.