sourmash-bio / sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.
http://sourmash.readthedocs.io/en/latest/
Other
473 stars 80 forks source link

refactor duplicate code in test_sourmash.py #2917

Open ctb opened 9 months ago

ctb commented 9 months ago

per @LilyAnderssonLee review

Refactoring duplicate code into helper functions to avoid redundancy and improve maintainability.

ctb commented 9 months ago

Have been thinking about what to do here 😅 .

I am the primary architect (instigator? culprit?) of the Python tests, and the code duplication is a direct result of my testing philosophy, which is that tests should be simple, "flat" (not call a lot of test-specific functions), have low code complexity (if and for statements deprecated), and otherwise not be viewed as code to be maintained.

This has led to a lot of ugliness in the tests, but at the same time, I spend very little time in the test code. If a test starts failing due to a significant behavior change in the primary code, I advocate simply deleting the test (with a note in the commit message, of course). But this happens fairly rarely. So, in general, I do not feel like the current tests need a lot of love.

On the flip side, the tests are ugly, and have a lot of duplicated code, and can be hard to understand in their bulk. I am wondering if there is an automated code refactoring procedure that we could point at them, e.g. something like copilot, that would discover widely shared code blocks and/or take care of some of the legacy things like decorators being used rather than test fixtures? @mr-eyes any thoughts here?

mr-eyes commented 9 months ago

I have yet to try copilot in writing tests, but overall, it's super challenging for the copilot (at least for now) to understand the testing context and generate the unit tests like the ones on sourmash. However, copilot might be able to modernize some tests. TLDR I don't recommend using copilot in rewriting/generating tests specifically for sourmash, it's too complicated.