dnanexus-archive / viral-ngs

viral-ngs
6 stars 6 forks source link

Change Novoalign source to use conda package and separate novoalign.lic file #26

Closed tomkinsc closed 8 years ago

tomkinsc commented 8 years ago

Novoalign is available as a conda package, so we no longer need to unpack it as a custom tarball and set NOVOALIGN_PATH. The DNAnexus wrappers can omit setting NOVOALIGN_PATH, which will cause viral-ngs to install the conda version of novoalign (or the conda package can be installed in advance). By default, the conda package runs in single-threaded mode until a novoalign.lic file is provided in the same directory as the novoalign binary (in this case <conda-env>/bin). The novoalign.lic file from the old tarball can be reused, but we would ideally like to be able to specify it as an input for the applets.

As of this commit the novoalign.lic file can be copied to the conda environment in two ways:

yifei-men commented 8 years ago

Thanks for the update @tomkinsc

Clarifying, Novoalign requires a commercial license for multi-threaded operation (which is why the novoalign.lic file is need right?)

If that's the case, I'll plan to expose the novoalign.lic file as an input that the users of the public assembly workflows can provide (instead of the current Novoalign tarball), and then export NOVOALIGN_LICENSE_PATH to point to this provided file for viral-ngs to handle.

tomkinsc commented 8 years ago

Exactly, the paid license buys multi-threading, but the binary is the same for non-commercial or commercial use. Your plan sounds great, thanks!

yifei-men commented 8 years ago

the DNAnexus wrappers can omit setting NOVOALIGN_PATH, which will cause viral-ngs to install the conda version of novoalign (or the conda package can be installed in advance).

@tomkinsc, a couple of clarifying questions while I take a dig at this:

  1. How do we trigger viral-ngs installation of the Novocraft suite? In particular, our current approaches is to make a viral-ngs resource tarball using this script... Does the call to py.test test/unit/test_tools.py trigger installation if we don't unpack a Novocraft tarball and don't specify the NOVOALIGN_PATH?
  2. Do we need to specify any environment variables (besides NOVOALIGN_LICENSE_PATH) when triggering viral-ngs commands that under the hood uses Novoalign? (This is situations where we unpack the giant viral-ngs tarball constructed from above and not installing tools from scratch)... Should we set NOVOALIGN_PATH to the path that viral-ngs automatically installed Novocraft to (if so, do you know the path offhand)?
  3. Should the novoalign.lic file be mandatory, or does viral-ngs tolerate non-multi-threaded novoalign if the license is not provided?
dpark01 commented 8 years ago

Hi,

  1. Your current approach with py.test should work just fine, no env variables required at this time. This will result in the binary being pulled from bioconda.
  2. Just set the license path at runtime, that's it!
  3. It should be optional.
yifei-men commented 8 years ago

Thanks Danny!

yifei-men commented 8 years ago

Addressed by #30.

Ideally, we want to test this using a license file to make sure that the wrapper does things correctly and Novoalign will be able to multi-thread when a license file is given.

@tomkinsc or @dpark01 would you happen to have a license file that we can use to test? You can use any of the assembly workflows in this folder which would accept a novocraft_license file.

yifei-men commented 8 years ago

Closing with confirmation with @tomkinsc