Open bertsky opened 5 years ago
A workaround is to insert a limit on the current number of alignments as a second (non-terminal) case in `backtraceFrom()`, for example:

```python
elif len(alignments) > 100 * max(f.shape):
    return  # enough is enough
```

But surely there must be a cleaner way…
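To illustrate the workaround, here is a minimal, self-contained sketch of a capped backtrace. This is not the library's actual `backtraceFrom()` code; it assumes simple LCS-style scoring and my own function names, but shows the same idea of aborting enumeration once the number of collected alignments exceeds a cap:

```python
def lcs_matrix(a, b):
    """Fill a Needleman-Wunsch-style score matrix (LCS scoring)."""
    n, m = len(a), len(b)
    f = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                f[i][j] = f[i - 1][j - 1] + 1
            else:
                f[i][j] = max(f[i - 1][j], f[i][j - 1])
    return f

def backtrace_capped(a, b, limit):
    """Enumerate co-optimal alignments as tuples of (a_char, b_char)
    pairs ('-' marks a gap), but stop expanding once the number of
    alignments collected exceeds `limit` (the proposed workaround).
    Hypothetical sketch, not the alignment library's implementation."""
    f = lcs_matrix(a, b)
    alignments = []

    def go(i, j, path):
        if len(alignments) > limit:
            return  # enough is enough
        if i == 0 and j == 0:
            alignments.append(tuple(reversed(path)))
            return
        # follow every predecessor cell that could have produced f[i][j]
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] \
                and f[i][j] == f[i - 1][j - 1] + 1:
            go(i - 1, j - 1, path + [(a[i - 1], b[j - 1])])
        if i > 0 and f[i][j] == f[i - 1][j]:
            go(i - 1, j, path + [(a[i - 1], '-')])
        if j > 0 and f[i][j] == f[i][j - 1]:
            go(j and i, j - 1, path + [('-', b[j - 1])]) if False else \
            go(i, j - 1, path + [('-', b[j - 1])])

    go(len(a), len(b), [])
    return alignments
```

For identical inputs this yields a single alignment, while for two completely dissimilar strings the cap kicks in and the recursion stops early instead of exhausting memory.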
When globally aligning sequences that deviate strongly, combinatorial explosion can quickly lead to excessive runtime memory consumption in the current implementation. And it is not always easy to detect those cases by score heuristics in a prior `backtrace=False` pass. I believe safeguards should be added. (The limit could be based on stack depth or number of alternatives, for example.)
Example: I am trying to align OCRed images of German Fraktur script with their corresponding ground-truth text. Sometimes the OCR fails miserably, like so:

```
Mitreden andrer 274. Günſtiger Eindruck der Staatsrathsſitzungen 274.   (original line)
*0obe-ondrer '? '-änſiger Eindrue der Torerotheflgg,,                   (OCR result)
```

In this case, using `StrictGlobalSequenceAligner` tries to take more than 20 GB RSS (at which point I quit).
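As noted above, a prior scoring pass does not reliably catch these cases, but a cheap normalized-score guard can at least skip the worst lines before attempting a full backtrace. A minimal sketch, using my own LCS-based similarity rather than the library's `backtrace=False` API; the 0.5 threshold is an arbitrary assumption:

```python
def should_backtrace(a, b, min_similarity=0.5):
    """Cheap pre-check: compute only the alignment score (no
    backtrace, O(m) memory via rolling rows) and skip enumeration
    when the sequences are too dissimilar. Hypothetical sketch."""
    if not a or not b:
        return False
    n, m = len(a), len(b)
    prev = [0] * (m + 1)
    for i in range(1, n + 1):
        cur = [0] * (m + 1)
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
            else:
                cur[j] = max(prev[j], cur[j - 1])
        prev = cur
    score = prev[m]  # LCS length as a crude similarity score
    return score / max(n, m) >= min_similarity
```

This only mitigates the problem: garbled OCR lines can still share enough characters with the ground truth to pass the threshold, which is why a hard limit inside the backtrace itself seems necessary.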