capitalone / DataProfiler

What's in your data? Extract schema, statistics and entities from datasets
https://capitalone.github.io/DataProfiler
Apache License 2.0

Feature: added parquet sampling #1070

Closed menglinw closed 10 months ago

menglinw commented 11 months ago

Developed a parquet sampling function in data_utils.py; added a sample_nrows argument to the ParquetData class; added test_len_sampled_data to test_parquet_data.py.
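For readers without the diff, here is a minimal sketch of one way row sampling over a parquet file could be planned. It is not the code from this PR. Only `sample_nrows` comes from the description above; the helper `plan_row_sample` and its `row_group_sizes` and `seed` parameters are hypothetical names chosen for illustration. The sketch assumes the number of rows in each row group is already known from the file metadata, and it only decides which rows to read.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def plan_row_sample(row_group_sizes, sample_nrows, seed=None):
    """Pick rows to read, grouped by row group (hypothetical helper).

    Returns a dict mapping row-group index to a sorted list of row
    offsets local to that row group.
    """
    total_rows = sum(row_group_sizes)
    # Asking for more rows than exist falls back to selecting every row.
    n = min(sample_nrows, total_rows)
    rng = random.Random(seed)
    # Sample global row positions without replacement, in file order.
    picked = sorted(rng.sample(range(total_rows), n))
    # ends[i] is the exclusive global end position of row group i.
    ends = list(accumulate(row_group_sizes))
    plan = {}
    for pos in picked:
        # First row group whose end position lies beyond pos.
        group = bisect_right(ends, pos)
        start = ends[group] - row_group_sizes[group]
        plan.setdefault(group, []).append(pos - start)
    return plan


# Example: three row groups of 3, 2 and 5 rows; sample 4 rows in total.
plan = plan_row_sample([3, 2, 5], sample_nrows=4, seed=0)
```

With a plan like this, a reader only needs to load the row groups that appear as keys and keep the listed offsets, which avoids reading the whole file when `sample_nrows` is small.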

CLAassistant commented 11 months ago

CLA assistant check
All committers have signed the CLA.

taylorfturner commented 11 months ago

@menglinw two things: