aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
https://sagemaker-examples.readthedocs.io
Apache License 2.0

SageMaker-ModelMonitoring - how does it work with non-Amazon algorithms? Also needs better documentation #971

Open sermolin opened 4 years ago

sermolin commented 4 years ago

The notebook seems to use a pre-trained model from https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_applying_machine_learning/xgboost_customer_churn/xgboost_customer_churn.ipynb. The notebook should refer to the data schema from the above example when discussing generated traffic and suggested constraints.

Cell "Deploy the model to Amazon SageMaker": this requires more explanation and a documentation reference to https://sagemaker.readthedocs.io/en/stable/model.html

DataCaptureConfig documentation is needed. I could not find a detailed description of each parameter and its acceptable values. For example, where is sampling_percentage defined?
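To make the question concrete, here is a minimal sketch of the data-capture settings as I understand them from the lower-level CreateEndpointConfig request, where the SDK's sampling_percentage appears to correspond to the InitialSamplingPercentage field (an integer from 0 to 100). The helper name build_data_capture_config, the bucket name, and the validation rules are hypothetical and for illustration only; they are not part of the SageMaker SDK, and the field names should be checked against the official API reference.

```python
# Hypothetical helper (not part of the SageMaker SDK): builds a dict shaped
# like the DataCaptureConfig structure of a CreateEndpointConfig request.
# Field names reflect my reading of the API and should be verified.

VALID_CAPTURE_MODES = ("Input", "Output")


def build_data_capture_config(destination_s3_uri,
                              sampling_percentage=100,
                              capture_modes=VALID_CAPTURE_MODES,
                              enable_capture=True):
    """Return a data-capture configuration dict after basic validation."""
    # Assumption: the sampling percentage is an integer in [0, 100].
    if isinstance(sampling_percentage, bool) or not isinstance(sampling_percentage, int):
        raise ValueError("sampling_percentage must be an integer")
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    if not destination_s3_uri.startswith("s3://"):
        raise ValueError("destination_s3_uri must start with 's3://'")
    for mode in capture_modes:
        if mode not in VALID_CAPTURE_MODES:
            raise ValueError(f"unknown capture mode: {mode!r}")

    return {
        "EnableCapture": enable_capture,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": destination_s3_uri,
        "CaptureOptions": [{"CaptureMode": mode} for mode in capture_modes],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder, not a real bucket.
    print(build_data_capture_config("s3://example-bucket/datacapture",
                                    sampling_percentage=50))
```

If this mapping is right, the notebook could document each parameter by pointing at the corresponding field in the API reference.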

Cell Create a baselining job with training dataset.

Cell "Explore the generated constraints and statistics": baseline_statistics() seems to apply only to built-in algorithms with pre-built containers. It leverages Deequ, computes KLL sketches, etc. Please provide an example of how the statistics work with non-Amazon algorithms, such as an open-source XGBoost.

Cell "Create a Schedule": how can schedule_cron_expression be run every few minutes for development purposes? Currently, Amazon SageMaker only supports hourly integer rates between 1 hour and 24 hours: https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-schedule-expression.html
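For illustration, the hourly-rate restriction can be sketched as a small expression builder. This assumes the `cron(0 [00-23]/[01-24] ? * * *)` pattern described on the linked documentation page; the function name hourly_rate_cron is hypothetical and not part of the SageMaker SDK. It also shows why a "every few minutes" schedule cannot be expressed: the minute field is fixed at 0.

```python
# Hypothetical helper (not part of the SageMaker SDK): builds a Model Monitor
# schedule expression under the documented constraint that only integer
# hourly rates between 1 and 24 hours are supported.

def hourly_rate_cron(interval_hours, start_hour=0):
    """Return a cron expression that fires every `interval_hours` hours."""
    if not isinstance(interval_hours, int) or not 1 <= interval_hours <= 24:
        raise ValueError("interval_hours must be an integer from 1 to 24")
    if not isinstance(start_hour, int) or not 0 <= start_hour <= 23:
        raise ValueError("start_hour must be an integer from 0 to 23")

    if interval_hours == 1 and start_hour == 0:
        # Every hour, on the hour.
        return "cron(0 * ? * * *)"
    if interval_hours == 24:
        # Once a day at start_hour.
        return f"cron(0 {start_hour} ? * * *)"
    # Every N hours, beginning at start_hour; the minute field stays 0.
    return f"cron(0 {start_hour}/{interval_hours} ? * * *)"


if __name__ == "__main__":
    print(hourly_rate_cron(1))      # cron(0 * ? * * *)
    print(hourly_rate_cron(6))      # cron(0 0/6 ? * * *)
    print(hourly_rate_cron(24, 9))  # cron(0 9 ? * * *)
```

A sub-hourly interval has no valid input to this builder, which is exactly the limitation that makes development iterations slow.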

clausagerskov commented 7 months ago

nothing?