tonyelewis closed 7 years ago
Closing this after confirming with Jon that cca8fb43e712e710c8eaee4553fa723e137d1c2a has now fixed the problem on the original file too.
More info on what was going wrong:
The error was occurring after CRH had chosen the hits and was then resolving the boundaries.
In the problem case, two hits had correctly been allowed together (i.e. been deemed non-conflicting) but only because short segments had been removed for falling below the `--min-seg-length` threshold. The later, boundary-resolving code is given those short segments (so it can do more informative things, e.g. in HTML output) but it wasn't properly handling the `--min-seg-length` issue and so was getting confused by them.
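
To make the situation described above concrete, here is a minimal Python sketch of how two hits can be non-conflicting only once short segments are dropped. It is not the cath-resolve-hits implementation: the function names, the inclusive `(start, stop)` segment representation, the example coordinates and the threshold value of 7 are all assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a segment is an inclusive (start, stop) pair
# of residue indices, and a hit is a list of such segments.
Segment = Tuple[int, int]


def drop_short_segments(segments: List[Segment], min_seg_length: int) -> List[Segment]:
    """Keep only the segments that are at least min_seg_length residues long."""
    return [(start, stop) for start, stop in segments
            if stop - start + 1 >= min_seg_length]


def segments_overlap(a: Segment, b: Segment) -> bool:
    """Return True if two inclusive segments share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def hits_conflict(hit_a: List[Segment], hit_b: List[Segment], min_seg_length: int) -> bool:
    """Compare two hits after ignoring any segments shorter than min_seg_length."""
    kept_a = drop_short_segments(hit_a, min_seg_length)
    kept_b = drop_short_segments(hit_b, min_seg_length)
    return any(segments_overlap(x, y) for x in kept_a for y in kept_b)


# Illustrative data: hit_a has a short trailing segment (5 residues)
# that sits inside hit_b's single segment.
hit_a = [(10, 60), (95, 99)]
hit_b = [(90, 150)]

# With an (assumed) threshold of 7 the short segment is ignored,
# so the two hits are deemed non-conflicting.
print(hits_conflict(hit_a, hit_b, min_seg_length=7))

# With no effective threshold the short segment is kept and the hits clash.
print(hits_conflict(hit_a, hit_b, min_seg_length=1))
```

In this sketch, any later step that receives the full `hit_a` (short segment included) and assumes the chosen hits never overlap would see `(95, 99)` sitting inside `(90, 150)`. That is the kind of inconsistency the boundary-resolving code needed to handle.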
Jon has found (what looks very much to be) an error with `cath-resolve-hits`. I've been able to reduce the input data required to reproduce the error message from Jon's ~4.8Gb file (!) down to the attached ~4.1Kb file.

The error can be reproduced with:
...which gives an error:
jon_problem.20170308.hmmsearch.txt