Sorry if this is out of place, but I just stumbled across an oddity. It appears that the Google-digitized non-English editions have habitual OCR problems, which show up in the boilerplate Google inserted.
For instance, Googling: "carcfully scannod" site:archive.org
turns up 46,900 results, most of which are scanned from texts in languages that use diacritics. That can't be a coincidence. I'm wondering if it can be put to use for quality improvement. Might they just need a fresh run through OCR with more modern software?
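To make the idea concrete, here is a rough sketch of how the boilerplate could serve as a quality probe: since the intended wording is known, a near-miss match suggests the OCR garbled it. This assumes the clean phrase is "carefully scanned" (inferred from the garbled search string above); the function names and the 0.8 threshold are hypothetical choices of mine, not anything Open Library or archive.org actually uses.

```python
import string
from difflib import SequenceMatcher

# Assumption: the clean boilerplate phrase behind "carcfully scannod".
CLEAN_PHRASE = "carefully scanned"


def boilerplate_ocr_score(text, phrase=CLEAN_PHRASE):
    """Return the best similarity (0.0 to 1.0) between `phrase` and any
    run of the same number of words in `text`."""
    # Lowercase and strip surrounding punctuation so "scanned," still matches.
    words = [w.strip(string.punctuation) for w in text.lower().split()]
    n = len(phrase.split())
    best = 0.0
    for i in range(len(words) - n + 1):
        window = " ".join(words[i:i + n])
        best = max(best, SequenceMatcher(None, phrase, window).ratio())
    return best


def looks_garbled(text, threshold=0.8):
    """True when the boilerplate is present but imperfectly recognized.

    The threshold is an arbitrary starting point and would need tuning
    against real OCR output.
    """
    score = boilerplate_ocr_score(text)
    return threshold <= score < 1.0
```

Run over the first page or so of each item's OCR text, something like this could produce a candidate list for re-OCR. A clean match or no match at all is not flagged, so it only catches items where the boilerplate itself was misread.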
Original text from https://github.com/internetarchive/openlibrary/issues/810
More discussion in the thread.