Closed by gngdb 9 years ago
This might be more important than I originally thought. The images are not reliably cropped; some plankton overlap the edges. Found an example by accident: train/hydromedusae_sideview_big/100058.jpg
When working on this, it's probably worth going through the images to find those that overlap the edges, to get an idea of how much random crop to apply.
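A rough way to find the overlapping images automatically, rather than by eye. This is only a sketch: it assumes grayscale images with a near-white background and darker plankton, and the function name `touches_border` and the threshold `background_min=250` are made up for illustration and would need tuning against the real data.

```python
import numpy as np


def touches_border(img, background_min=250, margin=1):
    """Return True if any pixel within `margin` of the image edge is darker
    than `background_min`, i.e. the plankton probably overlaps the edge.

    Assumes a light background (values >= background_min) and darker plankton.
    """
    if margin < 1:
        raise ValueError("margin must be at least 1")
    img = np.asarray(img)
    # Boolean mask that is True only on the border strip.
    border = np.ones(img.shape, dtype=bool)
    border[margin:-margin, margin:-margin] = False
    return bool((img[border] < background_min).any())
```

Running this over the training set and counting hits per class would give a first estimate of how common edge overlap is, which could then inform the crop size.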
We should make sure we DON'T crop any members of the tunicate class, because there is a subclass called "tunicate_partial" where all the members look like cropped versions of the "tunicate salp" class. If we were to crop the tunicate salp training data, the cropped images would look like tunicate partial examples while still carrying the tunicate salp label, which would only cause confusion between the two classes. Similarly, we need to be careful with the "hydromedusae_partial_dark" and "siphonophore_partial" classes.
Apply random cropping within the image, as in the Baidu paper.
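Something like the following could combine the random crop with the class exclusion discussed above. It is a minimal sketch, not the Baidu paper's exact procedure: the names `random_crop`, `augment` and `NO_CROP_PREFIXES`, the default `crop_fraction=0.9`, and matching classes by label prefix are all assumptions. Only the tunicate classes are skipped here; whether hydromedusae and siphonophore classes should be skipped too is left open.

```python
import numpy as np

# Assumption: class labels are the training directory names, and any label
# starting with one of these prefixes must never be cropped.
NO_CROP_PREFIXES = ("tunicate",)


def random_crop(img, crop_fraction, rng):
    """Return a randomly positioned window covering `crop_fraction` of each side."""
    if not 0 < crop_fraction <= 1:
        raise ValueError("crop_fraction must be in (0, 1]")
    img = np.asarray(img)
    h, w = img.shape[:2]
    ch = max(1, int(round(h * crop_fraction)))
    cw = max(1, int(round(w * crop_fraction)))
    # Pick the top-left corner so the window stays fully inside the image.
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return img[top:top + ch, left:left + cw]


def augment(img, label, crop_fraction=0.9, rng=None):
    """Randomly crop `img` unless its class is on the no-crop list."""
    rng = np.random.default_rng() if rng is None else rng
    if label.startswith(NO_CROP_PREFIXES):
        return np.asarray(img)
    return random_crop(img, crop_fraction, rng)
```

Cropped outputs would still need resizing back to the network input size, which is not shown here.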