MantisAI / sagemaker-test

0 stars 0 forks source link

Things to do #1

Open ivyleavedtoadflax opened 4 years ago

ivyleavedtoadflax commented 4 years ago
ivyleavedtoadflax commented 4 years ago

Note to be able to restart after an interrupted spot instance training, you must check to see whether a checkpoint exists or not. We must also send the final data to the model, we can't rely on a final processing step (e.g. tokenisation, etc.) to be done in the container, it must receive the final data.

ivyleavedtoadflax commented 4 years ago

Need to pass environment variables to sagemaker containers to define MLFlow server