host_filter task unit tests (auto-generated)

mlin commented 4 years ago

This uses a script that runs the workflow to completion on given test inputs, records all the intermediate inputs and outputs, and generates a pytest case for each task. Each case runs its task individually and compares the actual and expected outputs. A bunch of pytest fixtures minimize the boilerplate required in each test case. The script's functionality is pretty interesting and might eventually make its way into a top-tier miniwdl tool (it needs further generalization first).
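As a rough illustration of the record-then-generate idea described above (not the actual script from this PR), the following sketch renders one pytest source file per recorded task call. The record layout, the `render_test_case` and `generate_cases` helpers, and the `run_task` fixture name are all assumptions made for the example. The input and output values are made up.

```python
import json
from textwrap import dedent


def render_test_case(task_name: str, inputs: dict, expected: dict) -> str:
    """Render pytest source for one recorded task call (hypothetical template)."""
    return dedent(f'''\
        import json

        def test_{task_name}(run_task):  # run_task: hypothetical fixture that runs one task
            inputs = json.loads({json.dumps(inputs)!r})
            expected = json.loads({json.dumps(expected)!r})
            actual = run_task("{task_name}", inputs)
            assert actual == expected
        ''')


def generate_cases(recorded_calls: list) -> dict:
    """Map a test file name to generated source, one file per recorded task."""
    return {
        f"test_{call['task']}.py": render_test_case(
            call["task"], call["inputs"], call["outputs"]
        )
        for call in recorded_calls
    }


# Illustrative record of one task call captured from a full workflow run.
# The inputs and outputs here are invented for the sketch.
recorded = [
    {
        "task": "RunTrimmomatic",
        "inputs": {"adapter_fasta": "adapters.fasta"},
        "outputs": {"read_count": 100},
    }
]
cases = generate_cases(recorded)
print(cases["test_RunTrimmomatic.py"])
```

In this sketch the expected outputs are embedded as JSON in the generated file, so each case can run on its own without re-running the whole workflow. A real generator would also need to handle file-valued inputs and outputs, which plain equality does not cover.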

The current host_filter cases are derived from one of the synthetic inputs from the benchmarking paper ("bench3"). More can be slotted in later. Current warts:

6/9 cases currently succeed in GitHub Actions; the remaining 3 are pending public access to idseq-database S3 assets.
The lack of world-readability for the GitHub Packages docker registry (even for public repos) will remain an annoying speed bump for use elsewhere.
A woeful hack in RunTrimmomatic was needed to get it to use an adapter_fasta not sourced from S3 (in addition to https://github.com/chanzuckerberg/idseq-dag/pull/308 and motivating https://github.com/chanzuckerberg/idseq-dag/pull/314).

mlin commented 4 years ago

To see what's going on it's probably easier to browse the tree than the PR diff: https://github.com/chanzuckerberg/idseq-workflows/tree/mlin-generate-task-tests/tests/host_filter/tasks

mlin commented 4 years ago

Now with README: https://github.com/chanzuckerberg/idseq-workflows/tree/mlin-generate-task-tests/tests

Marked 3 cases as expected fail (xfail) while we work through how to adapt the S3 interactions.

mlin commented 4 years ago

@kislyuk @katrinakalantar @morsecodist any feedback on whether this seems like a feasible skeleton for adding more test cases and assertions? (Start from the README and the task cases directory, as the diff view isn't navigable.)

kislyuk commented 4 years ago

@mlin the test structure and fixtures/helpers look good. Do you have any guidance for finishing test coverage for all the steps?

katrinakalantar commented 4 years ago

I agree that the structure looks good!

host_filter task unit tests (auto-generated) #10