askimed / nf-test

Simple test framework for Nextflow pipelines
https://www.nf-test.com
MIT License

Code coverage #11

Open mgonzalezporta opened 1 year ago

mgonzalezporta commented 1 year ago

Hi there,

Is it possible to obtain a code coverage report for Nextflow scripts using nf-test? If not, would you have any alternative suggestions?

Many thanks in advance!

edmundmiller commented 1 year ago

https://github.com/pcolby/tap-summary
https://github.com/nf-core/methylseq/blob/5893122512add5350e4d9e3235b8e2fb500e5679/.github/workflows/ci.yml#L70-L76
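
Roughly, the idea those two links combine to is the following (a sketch only; the exact nf-test option and the tap-summary action inputs should be checked against their docs):

    # generate a TAP-format report while running the test suite
    nf-test test --tap test.tap
    # then, in CI, summarise test.tap in the job output, e.g. with the
    # pcolby/tap-summary GitHub Action as in the linked methylseq workflow step

Note this reports which tests ran and passed, rather than true line-level code coverage.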

ivopieniak commented 1 year ago

Hi @mgonzalezporta,

Is it possible to obtain a code coverage report for Nextflow scripts using nf-test? If not, would you have any alternative suggestions?

  1. Are you interested in getting a report of the executed tests and how they performed, per @Emiller88's suggestion?
  2. If not, then I would assume the feature you want is some sort of code coverage tool, like the ones used with Pytest? Something that would measure the number of code lines that were executed?

ivopieniak commented 1 year ago

After some deeper thinking, @aaron-fishman-achillestx, @byb121 and I came to the conclusion that implementing a code coverage tool should first involve a discussion on the paradigm and the level at which coverage is measured. Here are a few questions that we think would help in developing the code coverage feature:

  1. What level of granularity should be considered the lowest when measuring coverage with nf-test? Would that be a workflow, a process, or a function?
  2. How do we want to measure coverage? Would that be a simple measurement of executed lines, or would we want to measure branch coverage as well? If it is a measurement of executed lines, how should we approach shell blocks containing code from a different programming language?
  3. How useful would it actually be to know the code coverage of Nextflow unit tests? Would it be the percentage of the whole NF codebase exercised by the test files? From my understanding, most of the time a process will execute one command with a specific purpose, without additional branching logic, so the mere existence of a test will de facto guarantee 100% code coverage of the file - will code coverage be useful at that point?

I believe that a proper scope should be defined around what users would want out of this feature before we can progress with code coverage. I would also assume that this could be implemented either as a plugin or as core functionality - @lukfor?

edmundmiller commented 1 year ago

I was thinking this would make sense as a plugin.

how should we approach shell blocks containing code from a different programming language

I think even for just an if statement in a script: directive, it would be nice to know whether it was covered or not.

How useful would it actually be to know the code coverage of Nextflow unit tests?

This is a good point. Probably not a huge priority IMO, but at a minimum telling you how many .nf files are covered would be nice just to see if you're missing anything.
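
In the meantime, a rough stop-gap for that minimal check could look something like the sketch below (illustrative only; it assumes the convention that each script has a sibling test file named <script>.test, e.g. main.nf -> main.nf.test, so adjust the lookup if your tests live under a separate tests/ directory):

    # list .nf files that have no matching .nf.test file (illustrative only)
    find . -name '*.nf' -not -path './work/*' | while read -r f; do
        [ -e "${f}.test" ] || echo "no test file for: ${f}"
    done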

zachcp commented 1 year ago

+1 for this.

mcallaway commented 11 months ago

I'm also interested in this. My interpretation of the request (to clarify, in relation to this comment) is as follows.

Given a project that implements a set of X functions, Y processes, and Z workflows, it would be nice to run something like:

nf-test test coverage

And see a report that A/X functions, B/Y processes, and C/Z workflows are run by tests, together with a summary of their pass/fail status.

kenibrewer commented 4 months ago

I'll add a +1 for the coverage feature. Knowing whether or not we'll have the equivalent of coverage run -m pytest was one of the first questions my team had when I started talking up nf-test. I think the coverage tools from the Python ecosystem could be a good model for how we should think about implementing coverage here. I also think it's worth approaching this from the perspective of what minimum viable product (MVP) would begin giving value to our users.
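
For reference, the Python-ecosystem workflow I mean as the model looks roughly like this (coverage.py commands, shown for comparison only; nothing equivalent exists in nf-test today):

    coverage run -m pytest            # run the test suite while recording executed lines
    coverage report --fail-under=80   # per-file summary; non-zero exit below 80% for CI gating
    coverage html                     # optional browsable per-line HTML report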

What level of granularity should be considered the lowest when measuring coverage with nf-test? Would that be a workflow, a process, or a function?

Coverage should be measured at multiple levels so that CI can fail a PR for reducing coverage at the file and overall level. For an MVP, we should measure coverage at the level of each *.nf file and for the whole repo.

How do we want to measure coverage? Would that be a simple measurement of executed lines, or would we want to measure branch coverage as well?

An MVP that only measures lines executed would already be very valuable. A fast-follow addition that adds branch coverage would also be very valuable but that could wait.

If it is a measurement of executed lines, how should we approach shell blocks containing code from a different programming language?

For an MVP, we should focus solely on code written in Nextflow/Groovy. The only other language potentially worth considering, in my view, is Bash, but my cursory search didn't reveal a mature Bash test coverage tool that we could tap into.

How useful would it actually be to know the code coverage of Nextflow unit tests? Would it be the percentage of the whole NF codebase exercised by the test files? From my understanding, most of the time a process will execute one command with a specific purpose, without additional branching logic, so the mere existence of a test will de facto guarantee 100% code coverage of the file - will code coverage be useful at that point?

I think this is an extremely useful feature. Although the simplest use case of a Nextflow process is very straightforward, as you describe, mature Nextflow pipelines can grow substantially in complexity. For example, nf-core/rnaseq supports multiple aligners, and branching logic is sometimes needed in downstream processes based on which upstream process was used.
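
To make that concrete, here is a purely illustrative snippet (not actual nf-core/rnaseq code; params.aligner is assumed to be defined elsewhere) of the kind of downstream branching that line/branch coverage would surface:

    // illustrative only: the script block branches on which aligner was used upstream
    process SUMMARIZE_ALIGNMENT {
        input:
        path aln

        output:
        path 'summary.txt'

        script:
        if( params.aligner == 'star_salmon' )
            """
            echo "salmon-based summary for ${aln}" > summary.txt
            """
        else
            """
            echo "featureCounts-based summary for ${aln}" > summary.txt
            """
    }

A test suite that only ever runs the default aligner would leave the else branch unexercised, which is exactly what coverage reporting would flag.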

All in all, I think this is an extremely valuable feature, and one that is likely to drive greater adoption of nf-test by developers and organizations by providing a clear, measurable metric with which they can assess the quality of their testing.

lukbut commented 3 months ago

We'd be very interested in this at Genomics England, as we have minimum coverage thresholds which we must adhere to in order to use the code in production, since our bioinformatics workflow is an accredited medical device. Any suggestions on how we can calculate coverage in the meantime? @edmundmiller I spotted your links in the second comment above, but I wasn't sure what to make of them.

kenibrewer commented 3 weeks ago

@lukfor I noticed in your nf-test pre-print (awesome read by the way) that there is a discussion of calculating test coverage. I couldn't find any documentation or details about how that was done. Is there perhaps some way already in place to accomplish that?