yaofuzhou opened 6 months ago
Let me know when you want to go over my already implemented modifications (that do not work yet).
I suggest putting it in a feature branch in your GitHub fork of Tesseract, so other people can see it.
I reformatted your comment.
CC @bertsky,
Maybe you can help @yaofuzhou with this new feature.
I just pushed my own unfinished efforts: https://github.com/stweil/tesseract/tree/dropout.
[Edited]
This is my implementation of the dropout feature so far: https://github.com/yaofuzhou/tesseract. I have gone over @stweil's code, and it seems we are approaching it in a very similar way.
There are aspects of @stweil's code that I can learn from; I will try to incorporate them into my code and give full credit to @stweil in the process.
My original description remains the same, namely:

My code compiles but does not run. Specifically, the `lstmtraining` and `tesseract` binaries fail with

```
dyld[2292]: symbol not found in flat namespace '__ZN9tesseract7Network11DeSerializeEPNS_5TFileE'
[1] 2292 abort ./lstmtraining
dyld[2292]: symbol not found in flat namespace '__ZN9tesseract7Network11DeSerializeEPNS_5TFileE'
[1] 2329 abort ./tesseract
```

respectively. The missing symbol demangles to `tesseract::Network::DeSerialize(tesseract::TFile*)`, which usually means a method is referenced (e.g., through a vtable) but its definition was never compiled and linked in, so I am probably missing something elsewhere in the Tesseract codebase. I searched for `convolve` and `maxpool` to see where these parallel components show up, but have not found the solution. This is probably where I need help the most.
I need to implement a flag/switch somewhere so that the dropout mechanism is only activated during training (when running the `lstmtraining` binary) and not during normal usage (when running the `tesseract` binary).
Ideally, I also need a mechanism to adjust the `dropout_rate` of each dropout layer when the `lstmtraining` binary continues from a checkpoint, as it may be desirable to turn dropout off once the training converges to a good finish. A sketch covering both points follows below.
Your Feature Request
I am trying to implement dropout layers for Tesseract. For now, the hope is to add something like `Dr0.2` to the VGSLSpecs syntax. I have implemented some of the code, but have encountered a few issues, and I figure this may be the place for discussion.
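To make the proposed syntax concrete, here is a minimal sketch of how a `Dr<rate>` token could be recognized while scanning a VGSL-style spec string. This is not Tesseract's actual `NetworkBuilder` code; the function and its structure are hypothetical:

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical parser fragment: recognize a "Dr<rate>" token in a
// VGSL-style spec, e.g. "[1,36,0,1 Ct3,3,16 Mp3,3 Dr0.2 ...]".
// Returns true and advances *str past the token on success.
static bool ParseDropout(const char** str, float* dropout_rate) {
  if (strncmp(*str, "Dr", 2) != 0) return false;
  char* end = nullptr;
  float rate = strtof(*str + 2, &end);
  // Require a numeric rate in [0, 1).
  if (end == *str + 2 || rate < 0.0f || rate >= 1.0f) return false;
  *dropout_rate = rate;
  *str = end;  // Consume "Dr" and the numeric rate.
  return true;
}

int main() {
  const char* spec = "Dr0.2";
  float rate = 0.0f;
  if (ParseDropout(&spec, &rate))
    printf("parsed dropout rate: %g\n", rate);  // prints 0.2
}
```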
This is not surprising, as I am sure there are additional and essential modifications needed in other parts of the codebase.
It is obvious that I need to be able to disable the dropout feature for the deployed `.traineddata` models, for which I may need to further modify `network.cpp`. I need to ask the community about the best practice for adding a new flag or switch for this purpose.

Ideally, when continuing training from a checkpoint, I want to be able to adjust the dropout rate(s), including setting them to 0 (perhaps when the training is converging). There is probably more than one way to do it, but I want to ask the community for the best practice. One possible way to keep the rate adjustable across checkpoints is sketched below.
Let me know when you want to go over my already implemented modifications (that do not work yet).