Closed: ghost closed this issue 7 years ago
Are you using the mjsynth data as it is originally instrumented? If not, your labels or input data may somehow be corrupted. Verify that by semi-manually stepping through your data with singleton minibatches to view the images and print the labels.
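A minimal sketch of that semi-manual check (the `pairs` iterable and `charset` list are placeholders for however your own pipeline stores examples — they are not names from this repo):

```python
import numpy as np

def decode_label(label, charset):
    """Map a sequence of class indices back to a readable string."""
    return ''.join(charset[c] for c in label)

def inspect(pairs, charset, limit=5):
    """Step through (image, label) pairs one at a time, printing the
    decoded label and basic image stats so you can eyeball each example."""
    texts = []
    for i, (image, label) in enumerate(pairs):
        if i >= limit:
            break
        text = decode_label(label, charset)
        texts.append(text)
        print(i, repr(text), image.shape, image.min(), image.max())
    return texts

# Toy data standing in for real singleton minibatches:
charset = ['a', 'b', 'c']
pairs = [(np.zeros((32, 100), np.float32), [0, 2, 1])]
inspect(pairs, charset)
```

If the printed label doesn't match the character you see in the image (e.g. via `matplotlib.pyplot.imshow`), the label map or input pipeline is the problem, not the model.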
If those are correct, you may need to try adjusting your step size. You may even want to elide the decay (and/or view it in TensorBoard) to make sure the step size isn't shrinking too fast.
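One quick way to see whether the decay is shrinking the step size too fast, without waiting on TensorBoard, is to compute the schedule by hand. A sketch assuming staircase exponential decay (the hyperparameter values below are illustrative, not this repo's defaults):

```python
def decayed_lr(base_lr, decay_rate, decay_steps, step):
    """Staircase exponential decay:
    lr = base_lr * decay_rate ** (step // decay_steps),
    mirroring tf.train.exponential_decay with staircase=True."""
    return base_lr * decay_rate ** (step // decay_steps)

# Illustrative values only -- substitute your own flags:
for step in (0, 20000, 60000, 120000):
    print(step, decayed_lr(1e-4, 0.9, 20000, step))
```

If the printed rate has already collapsed toward zero by the step where your loss plateaus, try training with a fixed constant rate to confirm the schedule is the issue.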
The architecture and training regime coded here usually work sufficiently well for the original training data, but may require different approaches (e.g., incremental layer training as in the original deep VGG net) for drastically different types of data (more classes, different noise patterns).
@weinman Thank you very much for your guidance. I am using this project to train on Chinese characters (a charset of about 8,000 characters). Maybe it is harder to recognize Chinese characters with this model.
I'm confused: why doesn't the loss go down? Can anyone explain? I am a beginner.