BigelowLab / edna-dada2

Maine eDNA dada2

Where/when to print maxEE information (before filter_and_trim) #6

Closed (robinsleith closed this issue 3 years ago)

btupper commented 3 years ago

Produce and document the maxEE report from the workflow.

paired_ee_threshold gains a form argument, which can be 'list' (two elements: one tibble for forward and one for reverse) or 'table', which merges the two into one tibble.

paired_ee_threshold gains a sample_names argument to augment the tables with sample names.

paired_ee_threshold gains a filename argument which, if not NULL, causes the output to be written to a CSV file.
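
For readers unfamiliar with the report, here is a minimal base R sketch of the idea. It is not the package's implementation. It assumes the t_1 ... t_5 columns hold the fraction of reads whose expected errors, EE = sum(10^(-Q/10)), are at or below thresholds 1 through 5. The helpers ee_per_read and ee_threshold and the toy quality scores are hypothetical and exist only for illustration.

```r
# Hypothetical helpers for illustration; not the edna-dada2 implementation.

# Expected errors for one read, given its Phred quality scores.
ee_per_read <- function(q) sum(10^(-q / 10))

# Fraction of reads whose expected errors are at or below each threshold.
# Assumption: thresholds 1..5 correspond to the t_1..t_5 columns.
ee_threshold <- function(ee, thresholds = 1:5) {
  out <- vapply(thresholds, function(t) mean(ee <= t), numeric(1))
  names(out) <- paste0("t_", thresholds)
  out
}

# Toy quality scores: one row per read, one column per cycle (15 cycles).
forward_q <- rbind(rep(40, 15), rep(20, 15), rep(10, 15))
reverse_q <- rbind(rep(30, 15), rep(10, 15), rep(3, 15))

# One-row summary per direction.
forward <- as.data.frame(t(ee_threshold(apply(forward_q, 1, ee_per_read))))
reverse <- as.data.frame(t(ee_threshold(apply(reverse_q, 1, ee_per_read))))

# Analogous to form = "list": keep the two directions separate.
as_list <- list(forward = forward, reverse = reverse)

# Analogous to form = "table": stack them and label each row by direction.
as_table <- cbind(direction = names(as_list), do.call(rbind, as_list))
rownames(as_table) <- NULL

print(as_table)
```

A low value in an early column means few reads would survive filter_and_trim at that maxEE, which is why it helps to see this table before choosing the setting.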

btupper commented 3 years ago

example_filepairs() %>%
    paired_quality_scores() %>%
    paired_ee_per_read() %>%
    paired_ee_threshold(sample_names = c("foo", "bar"), filename = "~/my_ee_stuff.csv")

  ... (verbose output from Rsubread::qualityScores())

# A tibble: 4 x 8
  direction sample file              t_1   t_2   t_3   t_4   t_5
  <chr>     <chr>  <chr>           <dbl> <dbl> <dbl> <dbl> <dbl>
1 forward   foo    sam1F.fastq.gz 0.527  0.683 0.775 0.829 0.865
2 forward   bar    sam2F.fastq.gz 0.511  0.673 0.754 0.812 0.863
3 reverse   foo    sam1R.fastq.gz 0.0653 0.234 0.401 0.536 0.662
4 reverse   bar    sam2R.fastq.gz 0.0513 0.209 0.387 0.519 0.644

robinsleith commented 3 years ago

Added the following to the workflow:

paired_quality_scores(fq_files) %>%
    paired_ee_per_read() %>%
    paired_ee_threshold(sample_names = sample.names, filename = file.path(CFG$output_path, "EE_thresholds.csv"))