aws / sagemaker-python-sdk

A library for training and deploying machine learning models on Amazon SageMaker
https://sagemaker.readthedocs.io/
Apache License 2.0
2.09k stars 1.13k forks source link

Use of boolean hyperparameters: recommended use of `type=bool` is not okay #4247

Open rq-aszasz opened 10 months ago

rq-aszasz commented 10 months ago

What did you find confusing? Please describe.

doc/overview.rst states:

Note that SageMaker doesn't support argparse actions. If you want to use, for example, boolean hyperparameters, you need to specify type as bool in your script and provide an explicit True or False value for this hyperparameter when you create your estimator.

In accordance with Python 3's argparse documentation, the bool() converter will convert any non empty string to a True value:

The bool() function is not recommended as a type converter. All it does is convert empty strings to False and non-empty strings to True. This is usually not what is desired.

If one parses as per your recommended way, regardless of the default

parser.add_argument('street', type=bool, default=...)

it becomes impossible to set an explicit False configuration from the caller. This is dangerous and misleading.

Describe how documentation can be improved Do not say this:

If you want to use, for example, boolean hyperparameters, you need to specify type as bool in your script and provide an explicit True or False value for this hyperparameter when you create your estimator.

The False scenario will not work, it will set the parameter to True.

Additional context

JoshCole-DTA commented 10 months ago

I agree with @rq-aszasz , additionally, the base estimator has the variable container_arguments, why can't this be used in the same way as processor jobs to provide arguments as a list?

david-waterworth commented 6 months ago

IMO there needs to be substantially better reviews of enhancements added to the various containers before pull requests are merged. As @JoshCole-DTA pointed out we have functionality of base classes that isn't available in derived classes, and the same occurs in reverse - Processor is a classic example where some framework processors support script dirs (i.e. multiple files) and requirements.txt, some only support a single script file and no requirements.txt etc. This really should not pass review.

Plus containers are updated with newer versions of the framework but as far as I can see the documentation saying what images are available is often not updated at the same time (i.e. this doesn't have the latest Triton inference versions - https://github.com/aws/deep-learning-containers/blob/master/available_images.md).