Brief Description
Files that pass validation but have an encoding that lxml cannot parse can get stuck in an endless loop between the clean and lakify stages.
Severity
Medium - it wastes resources and the file is never processed, but we think it's a very rare condition.
Steps to Reproduce
If a file is valid according to the validator but has an encoding that lxml cannot parse, it can get stuck in an endless loop:
refresh/reload (download to source)
validate (found to be valid)
clean (copy valid, bypassing encoding/lxml check)
lakify (lxml parsing error, sent back to clean by the db.sendLakifyErrorToClean function)
Expected Results/Behaviour
Once lakify sends the file back, the intention was that the clean stage would treat the document as invalid and run the copy invalid routine on it. That routine should handle the encoding error, so lakify would succeed on the next pass.
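The intended routing could be sketched roughly as below. This is only an illustration, not the pipeline's actual code: the `Doc` class, the `lakify_error` flag, and `choose_clean_routine` are hypothetical names, and it assumes the fact that lakify sent the file back is recorded somewhere the clean stage can read.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    valid: bool                 # validator result (val.valid in this report)
    lakify_error: bool = False  # hypothetical flag: lakify sent this doc back


def choose_clean_routine(doc: Doc) -> str:
    """Pick the clean routine, treating a lakify failure as 'invalid'."""
    if doc.valid and not doc.lakify_error:
        return "copy_valid"
    # Invalid docs, and valid docs that lakify rejected, take the
    # copy invalid path so the encoding problem gets handled.
    return "copy_invalid"
```

With a check like this, a valid file takes the copy valid path the first time and the copy invalid path only after lakify has rejected it.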
Actual Results/Behaviour
However, the clean stage only checks val.valid = true to decide between the copy valid and copy invalid routines, so this never happens. Because the file still validates, clean runs copy valid again, lakify fails again, and the file cycles between clean and lakify in an everlasting loop.
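A toy simulation of the cycle, under stated assumptions, may make the difference clearer. Nothing here is the real pipeline: `run_pipeline` and its variables are hypothetical, the copy invalid routine is assumed to repair the encoding, and the loop is capped at `max_passes` so the example terminates.

```python
def run_pipeline(respect_lakify_error: bool, max_passes: int = 5) -> int:
    """Return the pass on which lakify succeeds, or -1 if the cap is hit."""
    valid = True            # the validator accepts the file
    lakify_error = False    # set when lakify sends the file back to clean
    encoding_fixed = False  # assumed to be set only by copy invalid

    for passes in range(1, max_passes + 1):
        # clean stage: the current behaviour looks at `valid` only
        use_copy_valid = valid and not (respect_lakify_error and lakify_error)
        if not use_copy_valid:
            encoding_fixed = True  # copy invalid (assumed to fix the encoding)

        # lakify stage: parsing only works once the encoding is fixed
        if encoding_fixed:
            return passes
        lakify_error = True  # parsing error, file goes back to clean

    return -1


current = run_pipeline(respect_lakify_error=False)   # never succeeds
intended = run_pipeline(respect_lakify_error=True)   # succeeds on pass 2
```

In this sketch the current behaviour hits the cap without ever succeeding, while the intended behaviour succeeds on the second pass.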
(Worked out with @akmiller01)