whylabs / whylogs

An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
https://whylogs.readthedocs.io/
Apache License 2.0

Remove references to confusing methods in PySpark profiling example #1533

Closed jamie256 closed 1 month ago

jamie256 commented 5 months ago

Description

The PySpark profiling example notebook uses the collect_column_profile_views method as its first example. That method returns a dictionary, which is not easy to then write to WhyLabs. We should start with the most commonly applicable method instead. collect_column_profile_views is a more advanced use case: it requires the integration to manage column profiles itself rather than using the dataset profile or result set as a wrapper.
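To illustrate the difference, here is a minimal sketch of the two entry points. It assumes that `whylogs.api.pyspark.experimental` exposes `collect_dataset_profile_view` alongside `collect_column_profile_views`, and that a profile view can be uploaded with `.writer("whylabs").write()` once WhyLabs credentials are configured in the environment. The helper names `profile_dataset`, `profile_columns`, and `upload_to_whylabs` are hypothetical and exist only for this example. Exact signatures may differ between whylogs versions.

```python
from typing import Any, Dict


def profile_dataset(spark_df: Any) -> Any:
    """Common case: one profile view covering the whole Spark DataFrame."""
    # Imported lazily so this module loads even where pyspark/whylogs
    # are not installed (assumption: whylogs was installed with Spark support).
    from whylogs.api.pyspark.experimental import collect_dataset_profile_view

    return collect_dataset_profile_view(input_df=spark_df)


def profile_columns(spark_df: Any) -> Dict[str, Any]:
    """Advanced case: a plain dict keyed by column name.

    The caller has to manage the per-column views itself; there is no
    dataset-level wrapper to hand to a writer.
    """
    from whylogs.api.pyspark.experimental import collect_column_profile_views

    return collect_column_profile_views(spark_df)


def upload_to_whylabs(profile_view: Any) -> None:
    """Hypothetical helper: write a dataset-level profile view to WhyLabs.

    Assumes the WhyLabs org ID, dataset ID, and API key are already
    configured through environment variables.
    """
    profile_view.writer("whylabs").write()
```

With a Spark DataFrame `df`, the common path would then be `upload_to_whylabs(profile_dataset(df))`. The dictionary returned by `profile_columns(df)` cannot be passed to the same helper, which is the source of the confusion described above.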

PySpark Integration

github-actions[bot] commented 2 months ago

This issue is stale. Remove stale label or it will be closed next week.