broadinstitute / viral-classify

viral-ngs: taxonomic identification, classification, and filtering steps
Other
1 stars 0 forks source link

Restructure project towards a traditional python package #12

Open yesimon opened 4 years ago

yesimon commented 4 years ago

Eventually make all executables subcommands of classify. Disperse functionality from top-level executable scripts like metagenomics.py and taxon_filter.py into the classify library.

Restructure tests to remove unit/ folder and place unit integration tests together for each file, with pytest marks to distinguish between unit and integration tests.

dpark01 commented 4 years ago

You know I was just thinking about this recently and it warrants a conversation among us. I haven't had a chance to dive into this but I'm certain that the right place to start with this restructure is not viral-classify, but viral-core.

My thought is for each modular github repo:

Then viral-classify, -assemble, -phylo, etc, will list viral-core as a formal dependency, in both conda world and pypi world.

viral-pipelines remains nothing more than WDLs

In this world, viral-core has to be restructured first.

Thoughts?

yesimon commented 4 years ago

Agreed. I'm not sure if we should actually upload pypi packages because they have many other dependencies, but my effort here is just to restructure it to be more similar to a traditional pip-installable python package, where we can use setup.py directives to install the executable scripts.

I think restructuring viral-x can be done independently from restructuring viral-core, especially when restructuring within the package as I'm doing here, or removing the tool classes because we can always assume conda dependencies are available.

tomkinsc commented 4 years ago

Agreed, and I like your plan, @dpark01. We should probably all talk about making broadinstitute/viral-ngs read-only at some point as well.

dpark01 commented 4 years ago

I took a high level skim here (haven't really dove into the details) and I really like the overall shape of this PR. Two big picture things at the moment:

  1. I'd want to verify that various wdls in viral-pipelines still work as before if we manually set their docker strings to quay.io/broadinstitute/viral-classify:sy-restructure-classify (and if not, how can we tweak the docker build / path / whatever to remain backwards compatible).
  2. Can you read through DEVELOPMENT_NOTES.md and include any necessary changes there in this PR?