Closed robinsleith closed 1 year ago
Oh, geez. They don't look very different from the other ones.
I'm not finding you input data. Might we make a subsample for just those two datasets?
btupper@cfe1 ~ $ cd $EDNA/dada/projects/robin_foo/mann
btupper@cfe1 mann $ ll
total 60
-rw-rw-r-- 1 rsleith edna 7979 Mar 27 14:16 asv_18S_postprocess.R
-rw-rw-r-- 1 rsleith edna 1275 Mar 27 14:16 asv_18S_postprocess.yaml
-rw-rw-r-- 1 rsleith edna 9491 Mar 27 14:16 asv_18S_preprocess.R
-rw-rw-r-- 1 rsleith edna 1752 Mar 27 14:53 asv_18S_preprocess.yaml
drwxrwsr-x 4 rsleith edna 4096 Aug 16 2022 aug16_process
drwxrwsr-x 4 rsleith edna 4096 Mar 27 15:21 auto_process
-rw-rw-r-- 1 rsleith edna 393 Aug 16 2022 job_submission.sh
drwxrwsr-x 4 rsleith edna 4096 Jul 5 2022 jul5_process
drwxrwsr-x 3 rsleith edna 16384 Mar 27 15:28 raw_reads
Never mind - you put them in raw_reads. Got it.
Any time this week to look this over?
Sure - how does tomorrow (Tues) morning look?
On Mon, Apr 3, 2023 at 10:03 AM Robin S. Sleith @.***> wrote:
Any time this week to look this over?
— Reply to this email directly, view it on GitHub https://github.com/BigelowLab/edna-dada2/issues/30#issuecomment-1494383869, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABG4SP4XFBMIRTMVGVYDZ7TW7LKE5ANCNFSM6AAAAAAWJSFAPE . You are receiving this because you commented.Message ID: @.***>
I have something at 10am, can we start at 9ish?
Works for me
On Mon, Apr 3, 2023 at 10:16 AM Robin S. Sleith @.***> wrote:
I have something at 10am, can we start at 9ish?
— Reply to this email directly, view it on GitHub https://github.com/BigelowLab/edna-dada2/issues/30#issuecomment-1494410473, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABG4SP5MPAVPUDBQRI7O44DW7LLTNANCNFSM6AAAAAAWJSFAPE . You are receiving this because you commented.Message ID: @.***>
So, assigning cutoff to 1 turns out to be intentional (not sure if it was intentionally intentional, but self-inflicted in any case.) I increased the verbosity of quality_profile_cutoff
to reflect both inputs and outputs.
fit_ruler: implementing ruler method
filename: aug-1_S114_L001_R1_001.fastq.gz
threshold: 32.0
quantile_min: 0.80
min_fraction_above_threshold: 0.70
status: a
nrows returned: 1
Cycle returned: 268
Score returned: 32
fit_ruler: implementing ruler method
filename: feb-4_S101_L001_R1_001.fastq.gz
threshold: 32.0
quantile_min: 0.80
min_fraction_above_threshold: 0.70
status: a
nrows returned: 1
Cycle returned: 276
Score returned: 32
fit_ruler: implementing ruler method
filename: aug-1_S114_L001_R2_001.fastq.gz
threshold: 32.0
quantile_min: 0.80
min_fraction_above_threshold: 0.70
too few below min_fraction_above_threshold - cutoff assigned 1
status: a_fail
nrows returned: 1
Cycle returned: 1
Score returned: 35
fit_ruler: implementing ruler method
filename: feb-4_S101_L001_R2_001.fastq.gz
threshold: 32.0
quantile_min: 0.80
min_fraction_above_threshold: 0.70
status: a
nrows returned: 1
Cycle returned: 199
Score returned: 32
In any case, here is where we test for a condition and PASS/FAIL determines if we assign 1 or let the autodetected threshold fly.
Package updated, so be sure to reinstall before using.
I think I understand why we implemented "too few below min_fraction_above_threshold - cutoff assigned 1" and it was to deal with low quality reads and weirdly shaped functions, but these reads are pretty normal so I am stuck trying to figure out why it is assigning a cutoff of 1. Is there a way to look at the min_fraction_above_threshold values for these 2 sets of reads? I am around until 11 this morning and happy to chat!
Sure - more verbosity! I'll add it
Added min_threshold and verbosity to cutoff_params in YAML
I am having an issue I think related to this verbosity change. I am guessing it is a mismatch between R script and yaml. Working here /mnt/storage/data/edna/dada/projects/robin_foo/julia
Error in (function (x1, key, threshold = 30, quantile_min = 0.9, model = stats::as.formula("Mean ~ poly(Cycle, 2)"), :
unused argument (verbose = FALSE)
That certainly looks like the culprit. I'll take a look.
I am going to try commenting out this section https://github.com/BigelowLab/dadautils/blob/0c20ba92bafb1d157074db3676439581bee3f7af/R/quality_profile.R#L87
Welp, that didnt work, I changed things back. Ill be out for the next 1.5 hour but will take another look later.
Let's walk through a reinstall for you before we go much further. Reinstall dadautils, charlier and auntie. Use /mnt/storage/data/edna/packages/edna_user_installer.R
perhaps?
I have been running this line every time I'm trouble shooting and changing something. I am running it correctly?
Rscript /mnt/storage/data/edna/packages/edna_user_installer.R charlier auntie dadautils
And you have /mod/Bigelow/dada2 exposed? Not some other instance of R?
Yeah by purge and then module load dada2. But its possible there something going on with updates on head node vs execute node. I can troubleshoot that using an interactive session.
Oh - found it! I added verbose = FALSE
to the argument list here... https://github.com/BigelowLab/dadautils/blob/main/R/quality_profile.R#L110
Reinstall and try ~banging your head against the wall~ again?
Awesome, I will let you know how it goes. Thanks!
Is there an errant comma?
Error in parse(outFile) :
/var/tmp/pbs.1547304.cfe1/RtmpJ1lB1F/R.INSTALL1029b15f5d13f2/dadautils/R/quality_profile.R:86:34: unexpected ','
85: params = list(score = 30, model = "Mean ~ poly(Cycle, 2)", quantile_min = 0.9, min_fraction_above_threshold = 0.7, verbose = TRUE),
86: form = c("full", "reduced")[2]),
^
ERROR: unable to collate and parse R files for package ‘dadautils’
Yes - but an errant )
from maybe when you tried commenting out the verbose? Fixed - please try ~sticking fingers in fan~ again.
It worked! Thanks, closing this issue!
Miss me yet?
Just started with a new dataset and with truncLen auto I am getting reverse reads truncLen set to 1 and losing all reads. Is this an issue with the line intersecting at two points? Troubleshooting read examples are here /mnt/storage/data/edna/dada/projects/robin_foo/mann/
quality profiles for all attached but see aug-1 and feb-4 for the reads I isolated
truncLen.csv output:
filterandtrim.csv output:
quality_profiles.pdf