danvk / oldnyc

Mapping photos of Old New York
Apache License 2.0
288 stars, 130 forks

Do some minor cleanup to OCR text #50

Closed: danvk closed this 9 years ago

danvk commented 9 years ago

Fixes #39

This:

I dropped every OCR line that is within a small edit distance of one of the known attribution lines. Attribution lines tend to contain many character recognition errors, so exact matching would miss most of them; some leeway is important.
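As a rough illustration of the approach (not the code in this PR), the filtering could look like the sketch below. The attribution strings in `KNOWN_ATTRIBUTIONS`, the `MAX_DISTANCE` threshold, and the function names are all hypothetical and chosen for the example only; the actual list and cutoff are whatever the repository uses.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


# Hypothetical attribution lines, for illustration only.
KNOWN_ATTRIBUTIONS = [
    "CREDIT LINE IMPERATIVE",
    "NO REPRODUCTIONS",
]

# Assumed threshold: how many character errors to tolerate.
MAX_DISTANCE = 3


def is_attribution(line: str) -> bool:
    """True if the line is within MAX_DISTANCE edits of a known attribution."""
    normalized = line.strip()
    return any(
        edit_distance(normalized, known) <= MAX_DISTANCE
        for known in KNOWN_ATTRIBUTIONS
    )


def drop_attribution_lines(text: str) -> str:
    """Return the OCR text with near-matches of attribution lines removed."""
    kept = [line for line in text.split("\n") if not is_attribution(line)]
    return "\n".join(kept)
```

With these assumed values, a misread line such as `CREDIT LlNE IMPERATlVE` (two `I`s recognized as `l`) is two edits away from a known attribution and is dropped, while ordinary caption text is far from every known line and is kept. A fixed threshold is the simplest choice; a cutoff relative to line length would be an alternative if the attribution lines vary a lot in length.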

The corresponding data update is https://github.com/oldnyc/oldnyc.github.io/commit/65640f3bc3a31e269f4a37ca4c59e1dd9f71c5c5