whylabs / whylogs

An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
https://whylogs.readthedocs.io/
Apache License 2.0

Multi org process logger #1575

Closed naddeoa closed 1 week ago

naddeoa commented 1 week ago

We need to support multi-org workflows. Currently, the process rolling logger stores all of its profiles keyed off of the dataset id, assuming that the global WhyLabs API key applies to all of them, more or less. Technically, it's already possible to create custom writers, which are the only things that know about the org, but the logger would end up rolling data for different orgs into the same profiles if they have the same dataset id.
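To make the collision concrete, here is a minimal sketch. It is not whylogs code: `profiles_by_dataset` and `log_row` are hypothetical stand-ins for a profile cache keyed only by dataset id, and plain lists of rows stand in for profiles.

```python
from collections import defaultdict

# Hypothetical stand-in for a profile cache keyed by dataset id only.
# This is NOT the whylogs API; a list of rows stands in for a profile.
profiles_by_dataset = defaultdict(list)


def log_row(org_id, dataset_id, row):
    # org_id is deliberately ignored, mirroring a cache that only
    # knows about the dataset id.
    profiles_by_dataset[dataset_id].append(row)


# Two different orgs that happen to use the same dataset id.
log_row("org-a", "model-1", {"x": 1})
log_row("org-b", "model-1", {"x": 2})

# Both orgs' rows were rolled into a single "profile".
merged = profiles_by_dataset["model-1"]
```

After these two calls the cache holds a single entry, so a writer for either org would receive data from both.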

This change allows you to explicitly pass an org id into the logger, which is then used for the cached profiles when logging. That, in combination with a custom writer factory, gets you to multi-org logging.
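The idea can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `SketchLogger`, `SketchWriter`, and the `writer_factory` signature are hypothetical names, and the `(org_id, dataset_id)` tuple is just one plausible way to key the cache.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, float]
CacheKey = Tuple[Optional[str], str]  # (org_id, dataset_id), an assumed key shape


@dataclass
class SketchWriter:
    """Hypothetical writer; only the writer knows which org it targets."""

    org_id: Optional[str]
    dataset_id: str
    written: List[List[Row]] = field(default_factory=list)

    def write(self, rows: List[Row]) -> None:
        # A real writer would upload a profile; here we just record the rows.
        self.written.append(list(rows))


WriterFactory = Callable[[Optional[str], str], SketchWriter]


class SketchLogger:
    """Hypothetical logger that caches rows per (org_id, dataset_id)."""

    def __init__(self, writer_factory: WriterFactory) -> None:
        self._writer_factory = writer_factory
        self._cache: Dict[CacheKey, List[Row]] = {}

    def log(self, row: Row, dataset_id: str, org_id: Optional[str] = None) -> None:
        # Including org_id in the key keeps orgs apart even when
        # they share a dataset id.
        self._cache.setdefault((org_id, dataset_id), []).append(row)

    def flush(self) -> List[SketchWriter]:
        # Build one writer per cached (org_id, dataset_id) pair.
        writers = []
        for (org_id, dataset_id), rows in self._cache.items():
            writer = self._writer_factory(org_id, dataset_id)
            writer.write(rows)
            writers.append(writer)
        self._cache.clear()
        return writers


logger = SketchLogger(writer_factory=SketchWriter)
logger.log({"x": 1.0}, dataset_id="model-1", org_id="org-a")
logger.log({"x": 2.0}, dataset_id="model-1", org_id="org-b")
writers = logger.flush()
```

Here the two orgs share the dataset id `model-1` but end up with separate cache entries, and the factory is called once per entry so each writer can use org-specific credentials.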

This also updates some of the types for the most recent pyright version, even though whylogs doesn't actually depend on that yet.

Unfortunately, this does end up being a breaking change, though all of these logger and actor types are still in the experimental namespace.