Open brucegarro opened 5 years ago
Hi @brucegarro, I see there's a file specification here. You can read these files in Python as raw bytes and unpack the binary fields with the struct library. In this project, I do this here, which hopefully is a decent example. Let me know if that helps -- I can see about implementing it here if it doesn't. Best, Lucas
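For anyone landing here later, a minimal sketch of the struct approach described above. The layout below is hypothetical and chosen only for illustration (a header with a sample count and a dimensionality, followed by fixed-size records); it is not the actual MPF layout, so the field order, sizes, and byte order should be taken from the file specification. `read_records` and `HEADER` are made-up names.

```python
import io
import struct

# Hypothetical layout, for illustration only (NOT the real MPF layout):
#   header: uint32 sample_count, uint32 dimensionality (little-endian)
#   record: 2-byte label, then `dimensionality` unsigned bytes
HEADER = struct.Struct("<II")


def read_records(stream):
    """Yield (label_bytes, feature_tuple) pairs from a binary stream."""
    header_bytes = stream.read(HEADER.size)
    if len(header_bytes) != HEADER.size:
        raise ValueError("truncated header")
    sample_count, dim = HEADER.unpack(header_bytes)

    # Build the per-record format once the dimensionality is known.
    record = struct.Struct(f"<2s{dim}B")
    for _ in range(sample_count):
        chunk = stream.read(record.size)
        if len(chunk) != record.size:
            raise ValueError("truncated record")
        label, *features = record.unpack(chunk)
        yield label, tuple(features)


# Build a tiny in-memory "file" with two records so the sketch runs
# without any dataset on disk.
payload = (
    HEADER.pack(2, 3)
    + struct.pack("<2s3B", b"\xb0\xa1", 1, 2, 3)
    + struct.pack("<2s3B", b"\xb0\xa2", 4, 5, 6)
)
records = list(read_records(io.BytesIO(payload)))
```

With a real file, the same function would take `open(path, "rb")` instead of the `io.BytesIO` object, once the format strings are adjusted to match the specification.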
Thank you for your response @lucaskjaero :pray:
Hello @lucaskjaero, I have a project similar to yours where I've implemented some Chinese character recognition models using the CASIA data sets. Like you, I've used the CASIA competition GNT files, but I believe it should be easier to build performant models on the HWDB1.X and OLHWDB1.X data sets because they are five times larger. Unfortunately, those data sets use a different file format, MPF. Do you have any idea how to process these files using Python?
Datasets: http://www.nlpr.ia.ac.cn/databases/handwriting/Download.html
My Project: https://github.com/brucegarro/chinese-character-recognition