The accuracy is poor (under 1% correct on geography). This could be for a number of reasons, listed in order of likelihood:
1. The corpus and ground truth files do not have text that accurately matches the labels
2. The matching algorithm needs to be fine-tuned or replaced
3. We should use segmentation to de-noise
4. Our OCR model needs to be retrained or replaced
geography acc: 1/120 = 0.83%
geography no match: 45/120 = 37.50%
geography wrong: 74/120 = 61.67%
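
For reference, here is a minimal sketch of how a three-way breakdown like the one above could be tallied. It is not the code that produced these numbers. The function names `tally_outcomes` and `format_report` are hypothetical, and it assumes that a prediction of `None` means the matcher returned no match and that any other prediction is compared to the ground truth by exact equality.

```python
from collections import Counter


def tally_outcomes(predictions, truths):
    """Count correct, unmatched, and wrong predictions.

    Assumption: a prediction of None means "no match"; any other value
    is compared to the ground-truth label by exact equality.
    """
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    counts = Counter()
    for predicted, expected in zip(predictions, truths):
        if predicted is None:
            counts["no match"] += 1
        elif predicted == expected:
            counts["acc"] += 1
        else:
            counts["wrong"] += 1
    return counts


def format_report(label, counts, total):
    """Render one line per outcome, with the percentage rounded to 2 places."""
    if total <= 0:
        raise ValueError("total must be positive")
    lines = []
    for key in ("acc", "no match", "wrong"):
        n = counts[key]
        lines.append(f"{label} {key}: {n}/{total} = {100 * n / total:.2f}%")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up toy data, only to show the call pattern.
    predictions = ["Paris", None, "Rome", "Oslo"]
    truths = ["Paris", "Berlin", "Madrid", "Oslo"]
    counts = tally_outcomes(predictions, truths)
    print(format_report("geography", counts, len(truths)))
```

Keeping "no match" separate from "wrong" helps with the list above: a high "wrong" share means the matcher is confidently returning the wrong label, which points at the matching step or the ground truth text rather than at missing OCR output.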