aws-samples / amazon-a2i-sample-jupyter-notebooks

Sample Jupyter Notebooks for Amazon Augmented AI (A2I)
https://aws.amazon.com/augmented-ai/
Apache License 2.0
69 stars 64 forks

Question about the way to use new data for incremental training #25

Open tkashi opened 4 years ago

tkashi commented 4 years ago

Hi there,

I really appreciate the notebook "Amazon A2I with Amazon SageMaker for object detection and model retraining"; it helped me understand how to use A2I and incremental training.

I have a question about how to retrain with newly labeled data. The current code uses only the new dataset, but the incremental training guide says that we should use "an expanded dataset". Does this mean we have to use a dataset that includes both the data from the first training run and the data obtained from A2I? Or can we use only the new dataset, supplied as an augmented manifest file?

At least in my case, when I retrained my model with only the new dataset in RecordIO format, the model accuracy dropped sharply.

I hope you can modify the current code to use "an expanded dataset", which would help us better understand how to use A2I and incremental training.

Thanks,

michaelhsieh42 commented 3 years ago

Hello @tkashi, thanks for your feedback, and sorry for the delayed response.

Our object detection algorithm does not require the original (first training) data for incremental training. You may include all, a subset, or none of the original training data in the incremental training job.

What is likely to have more impact on performance is the choice of hyperparameters, most critically the learning rate. The previous model has already reached a fairly stable state, and by the end of the first training run the learning rate had most likely been reduced from its initial value, so you need a smaller starting learning rate for incremental training. If the learning rate is too large, the optimizer might jump out of the optimum it had reached. I would suggest running a hyperparameter tuning job to find the new best hyperparameters.

Additionally, I suggest mixing some, if not all, of the old validation data with the new A2I-labeled data to make sure the model does not drift away from your first model toward the new data.
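To make the two suggestions above concrete, here is a minimal sketch in plain Python. It is not code from the notebook. The helper names `decayed_learning_rate` and `mix_manifests` are hypothetical, and the parameters `lr_scheduler_step` and `lr_scheduler_factor` are simply named after the step-decay hyperparameters of the built-in algorithm and treated as ordinary inputs. The sketch assumes a step schedule in which the rate is multiplied by the factor at each listed epoch, and it assumes the datasets are available as lists of JSON Lines manifest records. The value it computes is only a starting point for the tuning job, not a recommended setting.

```python
import random


def decayed_learning_rate(initial_lr, lr_scheduler_step, lr_scheduler_factor, epochs):
    """Estimate the learning rate in effect at the end of the first training run.

    Assumption: lr_scheduler_step is a comma-separated string of epochs
    (for example "10,20") at which the rate is multiplied by
    lr_scheduler_factor, and only steps before the final epoch count.
    """
    steps = [int(s) for s in lr_scheduler_step.split(",") if s.strip()]
    reductions = sum(1 for s in steps if s < epochs)
    return initial_lr * lr_scheduler_factor ** reductions


def mix_manifests(original_lines, a2i_lines, keep_fraction, seed=0):
    """Combine every A2I-labeled record with a random subset of the original records.

    keep_fraction=0.0 uses only the new data; keep_fraction=1.0 gives the
    fully expanded dataset.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_keep = round(len(original_lines) * keep_fraction)
    mixed = rng.sample(list(original_lines), n_keep) + list(a2i_lines)
    rng.shuffle(mixed)  # avoid grouping old and new records together
    return mixed


# Hypothetical example values, for illustration only.
start_lr = decayed_learning_rate(0.001, "10,20", 0.1, epochs=30)
original = ['{"source-ref": "s3://bucket/old-%d.jpg"}' % i for i in range(10)]
new_from_a2i = ['{"source-ref": "s3://bucket/new-%d.jpg"}' % i for i in range(3)]
validation_mix = mix_manifests(original, new_from_a2i, keep_fraction=0.5)
```

The same `mix_manifests` helper can be applied to the training data with different values of `keep_fraction` to compare a new-data-only run against a partially or fully expanded dataset.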

Thanks, Michael Hsieh