BigelowLab / edna-dada2

Maine eDNA dada2

multistep ASV workflow #16

Closed btupper closed 2 years ago

btupper commented 3 years ago

multistep ASV workflow

It seems that eDNA datasets are, at least for now, mostly edge cases: each new sample submitted to the workflow brings unlooked-for qualities. The pipeline, as originally conceived, was designed to be a simple drop-and-run process. That design makes it difficult to ascertain the needs of a particular dataset before running the costly dada and taxonomy matching steps.

To accommodate the fluidity of eDNA datasets, we proposed splitting the ASV workflow into at least three steps: preprocessing, user supervision, and processing.

1 Preprocess

2 User supervision

3 Process

robinsleith commented 3 years ago

I like it! So would filter and trim run in preprocess and then we point to outputs from that step in process? Assuming all looks good in user supervision?

robinsleith commented 3 years ago

Cutadapt should be the first step as all downstream steps assume primers have been trimmed off.

btupper commented 3 years ago

Preprocess runs through learn_errors(). Process then starts with filter_and_trim(), with an option to skip that part and go straight to run_dada().
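
For illustration only, here is one way to read the split discussed in this thread, as a minimal Python sketch. The step names are labels borrowed from the comments above, not the package's actual API. The `approved` flag, the `skip_filter_and_trim` argument, and the `"taxonomy"` label are assumptions made for the sketch. Placing `filter_and_trim` inside preprocess is an inference from "preprocess through learn_errors()", not something the thread states outright.

```python
def preprocess():
    """Stage 1: the cheaper steps, run before anyone inspects the data.

    Cutadapt comes first because downstream steps assume primers are
    already trimmed. The stage ends at learn_errors.
    """
    return ["cutadapt", "filter_and_trim", "learn_errors"]


def process(approved, skip_filter_and_trim=False):
    """Stage 3: the costly steps, run only after user supervision.

    `approved` stands in for the user-supervision stage (hypothetical).
    `skip_filter_and_trim` models the option to reuse the preprocess
    outputs and go straight to run_dada (hypothetical argument name).
    """
    if not approved:
        raise RuntimeError("user supervision has not signed off")
    steps = [] if skip_filter_and_trim else ["filter_and_trim"]
    steps += ["run_dada", "taxonomy"]
    return steps


if __name__ == "__main__":
    print("preprocess:", preprocess())
    # Stage 2 (user supervision) happens here, outside the code.
    print("process:", process(approved=True, skip_filter_and_trim=True))
```

Keeping the supervision gate as an explicit argument means the costly stage cannot start by accident, which seems to be the point of moving away from drop-and-run.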