Closed ebolyen closed 6 years ago
open to any ideas, but a very specific seed also doesn't sound very robust
I think having a unit test that uses a seed would work here with some test sequences that are all the same length except for one that's shorter. A seed could be chosen that causes the global minimum to never be included in the subsample, which should verify that the bug is fixed. Are there any issues that would preclude this kind of unit test?
@jairideout, no I think that should work, it'll just be a slightly magical test and will need some comments, but I think it's the best option we have.
oh or if the random function used by the subsample changes, the test might start randomly failing, but once again, there's really not much else that can be done
Sounds good, probably worth adding the test. Let me know when that's ready and I can review!
fixes #64
There isn't a good way to test this as the global min may show up in the random sample (open to any ideas, but a very specific seed also doesn't sound very robust), so we should probably hold off on merging until we can reproduce the original issue and verify this fixes it.
That being said, from the diff it should be pretty clear that
min_seq_len
was calculated on each read and now it is only calculated from the sample the boxplots use.