mhoban / rainbow_bridge

GNU General Public License v3.0

running insect after initial run through to LCA taxonomy does not use cached blast and lulu #39

Closed cajwalsh closed 7 months ago

cajwalsh commented 11 months ago

When I run eDNAFlow, I run the LCA taxonomy assignment with 4 different sets of parameter values on 4 separate runs. After the first run, everything except the LCA process is found in the cache, so only the LCA process with the new parameters is run. When I tried a fifth run using insect, with otherwise identical code, it used the cache for everything up to blast and lulu, then tried to run both of those again.

Two workarounds for this:

  1. If you know in advance that this will happen, use --skip-blast and --skip-lulu.
  2. If you have already tried the normal way, find your old blast results file (either still symlinked in your results folder or in the work directory) and do a --standalone-taxonomy run using those blast results and the cached zotu_table in the results directory, which shouldn't have changed.
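
As a rough illustration of the first workaround, the resumed command might be assembled like this. It is a sketch only: `-resume` is Nextflow's standard option for reusing cached work, `--skip-blast` and `--skip-lulu` are the flags named above, the `mhoban/rainbow_bridge` name is taken from the repository header (a local checkout would be launched differently), and `OTHER_ARGS` is a hypothetical placeholder for whatever options the earlier runs used, including the insect settings. The script prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical placeholder: the same options passed on the earlier runs,
# plus whatever switches the run over to insect.
OTHER_ARGS="${OTHER_ARGS:-}"

# -resume tells Nextflow to reuse cached results where it can;
# --skip-blast and --skip-lulu are the workaround flags from this issue.
CMD="nextflow run mhoban/rainbow_bridge -resume --skip-blast --skip-lulu"

# Append the remaining options only when some were supplied.
if [ -n "$OTHER_ARGS" ]; then
  CMD="$CMD $OTHER_ARGS"
fi

# Print the command for review rather than executing it.
echo "$CMD"
```

Printing first makes it easy to confirm that the remaining options match the earlier runs before anything is launched.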

A few other minor things I noticed during this process:

mhoban commented 8 months ago

Still looking into this, but noting that the comments about needing to pass run type and the readme typo were broken off into #43 and #44, and fixed.

mhoban commented 8 months ago

@cajwalsh and @vwishingrad see #45 for a question about output folder/process numbering

mhoban commented 7 months ago

Thanks to Nextflow's ability to spit out a visual representation of its dependency graph, I think I have resolved this. There were some weirdly circular dependencies, which should no longer exist. The fix is in fd36970.
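
For anyone wanting to inspect the dependency graph themselves, Nextflow can write one out with its `-with-dag` option. A minimal sketch follows. The output filename `dag.html` is an arbitrary choice, the pipeline's usual options would still need to be appended, and this is not necessarily the exact invocation used for the fix.

```shell
#!/bin/sh
# Arbitrary output name; Nextflow picks the format from the extension
# (image formats such as .png rely on Graphviz being installed).
DAG_FILE="dag.html"

# -with-dag asks Nextflow to write the workflow's dependency graph.
CMD="nextflow run mhoban/rainbow_bridge -with-dag $DAG_FILE"

# Print the command for review rather than executing it.
echo "$CMD"
```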