Closed leoisl closed 5 years ago
I think using Mocking is a great idea! Although the bug that you are talking of is not really a bug because the protein sequence generated will never be lowercase in the current implementation!
So, what I would suggest is creating a PR for adding the ability to generate lowercase protein sequences as well and adding your Mock tests as part of that. I have created a new issue #39 to track this.
PS Even if you are not a part of Hackseq, you can still contribute :)
Hello,
Cheers! Thanks for the answer! I think the current implementation can generate lowercase protein sequences! It was added in commit https://github.com/Lab41/hypothesis-bio/commit/27f821bd6eba0b03f4dd235a7261c47b5e292cfb
Aha, completely missed that.
Yup, a PR would be great then.
Description
Hello. I am interested in this project, but can't attend the hackseq, so I am unsure if I can contribute.
I tried to improve unit tests, and found a test that actually exposes a bug. It is hard to test this code, and although the functions you are implementing can generate a diverse range of objects (e.g. very diverse fasta/protein sequences), the tests are biased towards the minimal generated values (but maybe I misunderstood the tests, sorry if this is the case). Instead of letting hypothesis draw the random elements for us, which is what makes some tests hard I think, we can mock the
draw
function and fix the element we want to draw, and this might or might not help testing (the mocking can also inspect the variables passed to e.g. hypothesis'stext()
ordraw()
, making sure the arguments are correct, e.g. when you want a random dna sequence with no gaps, you calldna(allow_gaps=False)
, then you can check if thealphabet
given totext()
indeed has no gaps).Going more direct to the point: this commit adds two more tests to
test_protein.py
: https://github.com/leoisl/hypothesis-bio/commit/28e5316f0ade1e5f242c77f2e4387e0fc34c370d .The first test: https://github.com/leoisl/hypothesis-bio/blob/28e5316f0ade1e5f242c77f2e4387e0fc34c370d/tests/test_protein.py#L60-L70 fixes the drawn protein to
ACD
and transform it toAlaCysAsp
and everything works as expected.However, the second test: https://github.com/leoisl/hypothesis-bio/blob/28e5316f0ade1e5f242c77f2e4387e0fc34c370d/tests/test_protein.py#L73-L80 fixes the drawn protein to
acd
(since we can generate lowercase protein sequences), and tries to transform it to the 3-letter format, but fails becauseutilities.protein_1to3
does not contain lowercase entries.If you think this could be useful to the other tests (this should be interpreted just as an example that it could), I can make a PR, otherwise I will close this issue.