bcbi / DeIdentification.jl

A Julia package for de-identifying CSV data sets
https://bcbi.github.io/DeIdentification.jl/latest/
MIT License
2 stars 2 forks source link

Code coverage is not great #34

Closed DilumAluthge closed 5 years ago

DilumAluthge commented 5 years ago

Currently the code coverage is around 54%, which is really low.

Ideally, we would have 100% code coverage.

At the very least, I think we need at least 90% code coverage, and then we see if 100% is feasible.

DilumAluthge commented 5 years ago

@ibacher Can I assign this to you?

ibacher commented 5 years ago

Yes, I’m going to take a look at this anyways! It’s probably a sizeable project, though, since the code isn’t well-structured for unit tests.

DilumAluthge commented 5 years ago

How about writing some end-to-end integration tests? That would probably cover most of the code, and then you’d only need to write a few unit tests.

ibacher commented 5 years ago

We basically have that already. I’m not quite sure why the coverage is so low. It may have something to do with Julia aggressively optimising things.

DilumAluthge commented 5 years ago

Yep. When you’re running the tests on Travis, you should turn off inlining. Also you should run the tests two ways, one with compiled modules enabled and one with compiled modules disabled.

Those changes usually fix the coverage for me.

Here’s an example of a Travis configuration: https://github.com/bcbi/PredictMDAPI.jl/blob/master/.travis.yml

And the corresponding Travis script: https://github.com/bcbi/PredictMDAPI.jl/blob/master/ci/travis/script.sh

ibacher commented 5 years ago

Ok, I've got a handle on this now. The code coverage dropped severely when a series of interactive prompts to build config files was added to the repo. Being interactive, its a bit hard to test automatically, but, in any case, another feature I was working on (build a config from a CSV file) should be trivially automatable and should cover the majority of the same code. This will probably get us to ~80%, which I expect is acceptable given, ultimately, how trivial the core of this package is.

ibacher commented 5 years ago

Coverage is up to 74%, and I think that might be the best we can get it to without a substantial work. The bulk of the remaining uncovered code is directly driven by terminal input & output and the need for that code to be 100% correct is minimal (it's output is a YAML config file meant to be hand-edited afterwards).