Documentation for DataCaptureConfig is needed. I could not find a detailed description of each parameter and its acceptable values. For example, where is sampling_percentage defined?
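As a sketch of what such documentation could show: as far as I can tell, sampling_percentage is a constructor argument of DataCaptureConfig in the SageMaker Python SDK, and it maps to InitialSamplingPercentage (an integer from 0 to 100) in the DataCaptureConfig structure of the CreateEndpointConfig API. The helper below, build_data_capture_config, is hypothetical and is not part of any SDK; it only assembles that request structure in plain Python. The bucket name is a placeholder.

```python
# Hypothetical helper (not part of the SageMaker SDK): assembles the
# DataCaptureConfig structure used by the CreateEndpointConfig API.
ALLOWED_CAPTURE_MODES = {"Input", "Output"}


def build_data_capture_config(destination_s3_uri, sampling_percentage,
                              capture_modes=("Input", "Output")):
    """Return a dict shaped like the API's DataCaptureConfig structure."""
    # The API documents InitialSamplingPercentage as an integer in [0, 100].
    if not isinstance(sampling_percentage, int) or not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be an integer from 0 to 100")

    unknown_modes = set(capture_modes) - ALLOWED_CAPTURE_MODES
    if unknown_modes:
        raise ValueError(f"unsupported capture modes: {sorted(unknown_modes)}")

    return {
        "EnableCapture": True,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": destination_s3_uri,
        "CaptureOptions": [{"CaptureMode": mode} for mode in capture_modes],
    }


# Placeholder bucket; capture every request and response.
config = build_data_capture_config("s3://example-bucket/datacapture", 100)
print(config)
```

A table in the notebook that lists each parameter next to its API field and valid range would answer the question directly.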
Cell Create a baselining job with training dataset.
How does suggest_baseline() generate constraints? Which parameters and rules does it use?
This appears to be a Spark job (judging by the output log). What resources does the suggest_baseline() Spark job consume, and how much do they cost?
Cell Explore the generated constraints and statistics
baseline_statistics() seems to apply only to built-in algorithms with pre-built containers. It leverages Deequ, computes KLL sketches, etc. Please provide an example of how the statistics work with non-Amazon algorithms, such as open-source XGBoost.
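To make the request concrete, here is a toy sketch of the kind of walkthrough that would help. It is not the Deequ implementation and it makes no claim about how suggest_baseline() actually works. It assumes only that the baseline is computed from the dataset rather than from the model, which is what the suggest_baseline() inputs (a dataset location and format) suggest. The sample rows, column names, and both function names are made up, and the output field names are loosely modeled on what the notebook prints, so they may not match the real files exactly.

```python
import csv
import io
import statistics

# Made-up sample rows for illustration; the last row has a missing value.
SAMPLE_CSV = """label,feature_a,feature_b
0,128,265.1
1,107,161.6
0,137,
"""


def profile_columns(csv_text):
    """Compute simple per-column statistics from CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    profile = {}
    for name in rows[0].keys():
        # Treat empty or absent cells as missing values.
        values = [float(row[name]) for row in rows if row[name] not in ("", None)]
        profile[name] = {
            "num_present": len(values),
            "num_missing": len(rows) - len(values),
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
    return profile


def suggest_constraints(profile):
    """Derive two toy constraints per column from the profile."""
    constraints = []
    for name, stats in profile.items():
        total = stats["num_present"] + stats["num_missing"]
        constraints.append({
            "name": name,
            "completeness": stats["num_present"] / total,
            "is_non_negative": stats["min"] >= 0,
        })
    return constraints


profile = profile_columns(SAMPLE_CSV)
constraints = suggest_constraints(profile)
for constraint in constraints:
    print(constraint)
```

Nothing in this sketch depends on which algorithm produced the model. If the real job behaves the same way, the notebook should say so explicitly.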
The notebook seems to use a pre-trained model from https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_applying_machine_learning/xgboost_customer_churn/xgboost_customer_churn.ipynb. The notebook should refer to the data schema from the above example when discussing generated traffic and suggested constraints.
Cell Deploy the model to Amazon SageMaker. This requires more explanation and a documentation reference to https://sagemaker.readthedocs.io/en/stable/model.html
Cell Create a Schedule: How can schedule_cron_expression be set to run every few minutes for development purposes? Currently, Amazon SageMaker supports only hourly integer rates between 1 hour and 24 hours: https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-schedule-expression.html
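For reference, the expression forms I understand that page to allow are hourly, `cron(0 * ? * * *)`, and every N hours from a starting hour, `cron(0 [00-23]/[01-24] ? * * *)`. The plain-Python sketch below builds them and shows where a sub-hourly request fails. The function names are hypothetical; the SDK appears to offer similar builders in CronExpressionGenerator, which the notebook could reference.

```python
# Hypothetical helpers (not SDK functions) that build the schedule
# expressions described on the Model Monitor schedule-expression page.

def hourly_expression():
    """Run at minute 0 of every hour."""
    return "cron(0 * ? * * *)"


def every_n_hours_expression(hour_interval, starting_hour=0):
    """Run every `hour_interval` hours, beginning at `starting_hour`."""
    if not 1 <= hour_interval <= 24:
        raise ValueError("hour_interval must be between 1 and 24")
    if not 0 <= starting_hour <= 23:
        raise ValueError("starting_hour must be between 0 and 23")
    return f"cron(0 {starting_hour}/{hour_interval} ? * * *)"


def expression_for_minutes(interval_minutes):
    """Map a desired interval in minutes to an expression, if one exists."""
    # Only whole-hour intervals can be expressed; anything else is rejected.
    if interval_minutes < 60 or interval_minutes % 60 != 0:
        raise ValueError("only whole-hour intervals (1 to 24 hours) are supported")
    return every_n_hours_expression(interval_minutes // 60)


print(hourly_expression())
print(every_n_hours_expression(6))
print(expression_for_minutes(120))
```

A call such as expression_for_minutes(5) raises ValueError here, which is the gap this item asks about: there is no documented way to test a schedule on a minutes-level cadence.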