predict on test for metrics and unlabeled images only

abfleishman commented 6 years ago

When there are a lot of images in the blob storage, training is fast but prediction is slow. It would be nice to have two options: one to predict only on unlabeled images, which would improve speed and allow faster iteration, and one to compute metrics only on the test set instead of predicting on the rest of the labeled images.

olgaliak commented 6 years ago

Hi @abfleishman! Could you please clarify the scenario a bit?

Predict on test set "for metrics only": I assume the purpose is just to evaluate how good the current model is, correct?
Predict on unlabeled images "to improve speed": I did not quite get how this would improve speed. I assume there are usually thousands of unlabeled images, right?

abfleishman commented 6 years ago

As I understand it now, when a new model is trained, it predicts on all of the images in the blob storage that it is pointed at. I have been starting with maybe 1000 images, and training and prediction go very quickly. Then I have been adding more images, let's say another 1000, so there are 2000 images in blob storage. If we are only using the workflow to generate new training data, we do not need to predict on the first 1000 already-labeled images, since we (hopefully) do not need to review them again, and we can save time by predicting only on the new 1000 images. This gets more pronounced when the numbers are larger, of course: let's say 10,000 images that have been labeled and 2000 new unlabeled images. Does that make sense?
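
A minimal sketch of the filtering described above, not the project's actual code. It assumes the image names in blob storage, the already-labeled names, and the test-set names are each available as plain lists. The function name `select_prediction_targets` and its parameters are hypothetical and only illustrate the idea.

```python
def select_prediction_targets(all_images, labeled_images, test_images):
    """Return (unlabeled, metrics) image lists.

    Hypothetical helper: `unlabeled` holds images that still need
    predictions for review, `metrics` holds test-set images to predict
    on for evaluation only. Labeled images outside the test set are skipped.
    """
    labeled = set(labeled_images)
    test = set(test_images)
    # Keep the original ordering of all_images in both outputs.
    unlabeled = [name for name in all_images if name not in labeled]
    metrics = [name for name in all_images if name in test]
    return unlabeled, metrics


# Toy example: 10 labeled images (2 of them held out as a test set)
# plus 2 newly added unlabeled images.
all_images = [f"img_{i}.jpg" for i in range(12)]
labeled_images = all_images[:10]
test_images = all_images[8:10]

unlabeled, metrics = select_prediction_targets(
    all_images, labeled_images, test_images
)
print(len(unlabeled), len(metrics))
```

With the numbers from the comment above (10,000 labeled and 2000 new images), this would mean predicting on 2000 images plus the test set rather than on all 12,000.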

olgaliak / active-learning-detect

predict on test for metrics and unlabeled images only #17