qiime2 / q2-demux

BSD 3-Clause "New" or "Revised" License
0 stars 20 forks source link

BUG: make min-seq length local to sampling #65

Closed ebolyen closed 6 years ago

ebolyen commented 7 years ago

fixes #64

There isn't a good way to test this as the global min may show up in the random sample (open to any ideas, but a very specific seed also doesn't sound very robust), so we should probably hold off on merging until we can reproduce the original issue and verify this fixes it.

That being said, from the diff it should be pretty clear that min_seq_len was calculated on each read and now it is only calculated from the sample the boxplots use.

jairideout commented 7 years ago

open to any ideas, but a very specific seed also doesn't sound very robust

I think having a unit test that uses a seed would work here with some test sequences that are all the same length except for one that's shorter. A seed could be chosen that causes the global minimum to never be included in the subsample, which should verify that the bug is fixed. Are there any issues that would preclude this kind of unit test?

ebolyen commented 7 years ago

@jairideout, no I think that should work, it'll just be a slightly magical test and will need some comments, but I think it's the best option we have.

ebolyen commented 7 years ago

oh or if the random function used by the subsample changes, the test might start randomly failing, but once again, there's really not much else that can be done

jairideout commented 7 years ago

Sounds good, probably worth adding the test. Let me know when that's ready and I can review!