Remove references to confusing methods in PySpark profiling example

whylabs / whylogs

An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈

Apache License 2.0

2.65k stars 121 forks

Description

The PySpark profiling example notebook opens with the collect_column_profile_views method, which returns a dictionary that is not easy to write to WhyLabs. We should start with the most commonly applicable method instead. collect_column_profile_views is a more advanced use case: it requires the integration to manage column profiles itself rather than using the dataset profile or result set as a wrapper.
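For context, here is a minimal sketch of the two entry points being contrasted. It assumes both functions are importable from whylogs.api.pyspark.experimental, as in the current example notebook. The helper name profile_spark_dataframe is hypothetical and exists only for illustration. Writing to WhyLabs is deliberately left out.

```python
def profile_spark_dataframe(spark_df):
    """Hypothetical helper: profile a Spark DataFrame both ways and return both results."""
    # Imported lazily so this sketch can be loaded without Spark or whylogs installed.
    # Assumption: both functions live in whylogs.api.pyspark.experimental.
    from whylogs.api.pyspark.experimental import (
        collect_column_profile_views,
        collect_dataset_profile_view,
    )

    # Advanced path: a plain dict keyed by column name. The caller has to
    # manage these per-column views before anything can be written out.
    column_views = collect_column_profile_views(spark_df)

    # Common path: a single dataset-level profile view that wraps all columns.
    dataset_view = collect_dataset_profile_view(input_df=spark_df)

    return column_views, dataset_view
```

Leading the notebook with the second call would give readers one object to pass along, with the dictionary-returning call moved to a later, clearly marked advanced section.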

PySpark Integration

[ ] I have reviewed the Guidelines for Contributing and the Code of Conduct.

Remove references to confusing methods in PySpark profiling example #1533
