aws / amazon-sagemaker-examples

Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
https://sagemaker-examples.readthedocs.io
Apache License 2.0

Using TensorBoard for visualization with Ray in SageMaker RL #1488

Open jcur opened 4 years ago

jcur commented 4 years ago

My customer needs to use TensorBoard with Ray in SageMaker RL. Could you please share an example of how to set up TensorBoard when running an RL simulation with Ray in SageMaker RL, so that we can visualize the evolution of the RL simulations in real time?

annaluo676 commented 4 years ago

The easiest way would be to set upload_dir = destination_s3_path in the experiment config, and then run

AWS_REGION=your_region tensorboard --logdir s3://destination_s3_path --host localhost --port 6006

By default, Ray updates the TensorBoard event file every 5 minutes.
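
The command above can also be assembled from Python, for example to launch TensorBoard from a notebook cell. This is only a minimal sketch: the helper name tensorboard_command and the bucket and region values are hypothetical placeholders, and it assumes the tensorboard executable is installed and that AWS credentials are already configured.

```python
import os
import shlex


def tensorboard_command(s3_logdir, region, host="localhost", port=6006):
    """Return (argv, env) for running TensorBoard against an S3 log directory.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if not s3_logdir.startswith("s3://"):
        raise ValueError("expected an s3:// URI, got %r" % s3_logdir)
    # Copy the current environment and add the region, as in the shell command.
    env = dict(os.environ, AWS_REGION=region)
    argv = [
        "tensorboard",
        "--logdir", s3_logdir,
        "--host", host,
        "--port", str(port),
    ]
    return argv, env


if __name__ == "__main__":
    # Placeholder bucket and region; replace with your own values.
    argv, env = tensorboard_command("s3://my-bucket/ray-results", "us-west-2")
    print("AWS_REGION=" + env["AWS_REGION"], " ".join(shlex.quote(a) for a in argv))
```

The returned pair can be passed to subprocess.Popen(argv, env=env) to start TensorBoard in the background.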

henryyuanheng-wang commented 4 years ago

RLEstimator doesn't seem to accept an upload_dir argument for me. Should we pass tensorboard_output_config instead?

Also, is there any way to configure the logging frequency?

Thanks!

annaluo676 commented 4 years ago

Hi @henryyuanheng-wang, upload_dir is a Ray Tune argument, so it goes in the Ray experiment config rather than being passed to RLEstimator. A sample configuration would be something like

{
    "experiment_name": "training",
    "run": "PPO",
    "env": "your_env",
    "stop": {
        "training_iteration": 500
    },
    "upload_dir": "destination_s3_path"
}

More details can be found in this notebook example if you search for "Tensorboard".
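
As a minimal sketch of the sample configuration above, the dictionary can be built by a small function so that the environment name and S3 destination are supplied in one place. The function name build_experiment_config and the example values are hypothetical, and how the dictionary is handed to Ray Tune depends on your training script, which is not shown here.

```python
def build_experiment_config(env_name, upload_dir, iterations=500):
    """Build an experiment config dict with an S3 upload destination.

    Hypothetical helper mirroring the keys of the sample configuration.
    """
    return {
        "experiment_name": "training",
        "run": "PPO",
        "env": env_name,
        # Stop after a fixed number of training iterations.
        "stop": {"training_iteration": iterations},
        # Destination for the results that TensorBoard reads.
        "upload_dir": upload_dir,
    }


if __name__ == "__main__":
    # Placeholder values; replace with your environment and S3 path.
    config = build_experiment_config("your_env", "destination_s3_path")
    print(config["upload_dir"])
```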

annaluo676 commented 4 years ago

Can you elaborate a bit on the logging frequency, especially on whether you mean the frequency on Ray's side or on SageMaker's side?