A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
Apache License 2.0
processing of the pre-training dataset IIT CDIP 1.0 #82
Could you please provide the code used to process the pre-training dataset IIT CDIP 1.0? I am trying to retrain the weights for use with a new encoder. Any help from the developers would be greatly appreciated.