Open emedvedev opened 7 years ago
The old weights don't work anymore because of the version change. Could you upload weights for the new model if you have trained it? I don't have a machine powerful enough to train it myself.
I'm using another dataset to train the model, so I can't help here, unfortunately, but my suggestion would be to use Google's ML Engine: https://cloud.google.com/ml-engine/. They give $300 in credits for a trial run, which is more than enough for training (mine cost around $20 on the BASIC_GPU instance, and the dataset was way bigger than the one in the example here).
Another suggestion, if you decide to go the Cloud ML route, would be converting your dataset into one large TFRecords file instead of thousands of individual small images. Otherwise I/O will become a very critical bottleneck for you.
Here's a gist on how to generate the TFRecords file: https://gist.github.com/emedvedev/dd056666337b54c13176da93d5b987b7
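To make the "one large file" idea concrete, here is a minimal pure-Python sketch of the record framing that a TFRecords file uses: each record is a little-endian length, a masked CRC32C of that length, the payload, and a masked CRC32C of the payload. This is not the code from the gist. The function names (`frame_record`, `read_records`) are made up for illustration, and the payloads here are raw bytes, whereas a real training pipeline would normally store serialized `tf.train.Example` messages written with TensorFlow's own writer.

```python
import struct


def _make_crc32c_table():
    # Lookup table for CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC_TABLE = _make_crc32c_table()


def crc32c(data):
    """Plain CRC-32C of a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc32c(data):
    """CRC-32C with the rotate-and-add mask applied by the TFRecord format."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def frame_record(payload):
    """Wrap one payload (hypothetical helper, not part of any library)."""
    header = struct.pack("<Q", len(payload))
    return (
        header
        + struct.pack("<I", masked_crc32c(header))
        + payload
        + struct.pack("<I", masked_crc32c(payload))
    )


def read_records(blob):
    """Yield payloads back out of concatenated records, checking both CRCs."""
    offset = 0
    while offset < len(blob):
        header = blob[offset:offset + 8]
        (length,) = struct.unpack("<Q", header)
        (length_crc,) = struct.unpack_from("<I", blob, offset + 8)
        if length_crc != masked_crc32c(header):
            raise ValueError("corrupted length at offset %d" % offset)
        start = offset + 12
        payload = blob[start:start + length]
        (data_crc,) = struct.unpack_from("<I", blob, start + length)
        if data_crc != masked_crc32c(payload):
            raise ValueError("corrupted payload at offset %d" % offset)
        yield payload
        offset = start + length + 4


# Many small "images" end up as one contiguous blob that is read sequentially.
images = [b"fake-png-bytes-1", b"fake-png-bytes-2"]
blob = b"".join(frame_record(image) for image in images)
restored = list(read_records(blob))
```

Reading the blob is a single sequential pass, which is why this layout avoids the per-file overhead of opening thousands of small images.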
You'll also have to modify `src/data_util/data_gen.py` to read from this file though, so it might be too much work (I did have to make quite a lot of changes to the tooling around this model in my fork), but it does make training significantly faster.
That can be done. Do you have any idea roughly how much time it will require (with BASIC_GPU)?
A couple of hours to a couple of days, depending on how comfortable you are with TensorFlow. :) I'll update my fork today and document all the changes, so maybe you will be able to just use it without changing the code too much.
I was talking about training time. Well, if your work is going to save me the time of coding the data conversion, I am waiting desperately ;-)
@arpitkh96 I've moved my fork to https://github.com/emedvedev/attention-ocr and changed the interface quite a bit. It's also bundled into a package now.
I didn't have the time to update the README and clean up, but here are some brief instructions:
If you want the CLI, clone the repo and run `pip install .` inside it. Then you'll be able to use the `aocr ...` command. Otherwise you can run `python -m aocr` inside the repo dir instead.
For dataset generation, you'll need a .txt file with annotations in the format of `image/path.png yourimagetext`, as described in the README. To merge the files and the annotations into a .tfrecords file you can use `aocr dataset`:
aocr dataset datasets/annotations-training.txt datasets/training.tfrecords
aocr dataset datasets/annotations-testing.txt datasets/testing.tfrecords
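If you want to sanity-check an annotations file before converting it, something like the sketch below may help. It is not taken from the fork: `parse_annotations` is a hypothetical helper, and it assumes that the image path contains no whitespace and that everything after the first run of whitespace is the label.

```python
def parse_annotations(lines):
    """Yield (image_path, label) pairs from annotation lines.

    Assumed format (one sample per line): image/path.png yourimagetext
    """
    for line_number, raw_line in enumerate(lines, start=1):
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(
                "line %d: expected '<path> <label>', got %r" % (line_number, line)
            )
        yield parts[0], parts[1]


# Example with in-memory lines; a real file object can be passed the same way.
sample = ["images/0001.png hello\n", "\n", "images/0002.png two words\n"]
pairs = list(parse_annotations(sample))
```

Passing an open file (`with open(path) as f: ...`) works as well, since the function only iterates over lines.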
To train, use `aocr train`:
aocr train datasets/training.tfrecords
To test, use `aocr test`:
aocr test datasets/testing.tfrecords
The fork is still a little screwed up (the logs don't make too much sense during the training stage, for instance), but the basics are working, and converting to tfrecords makes it much faster.
The TensorFlow API has changed again; I'm updating the code to reflect the recent changes.