Open dfornika opened 9 months ago
@dfornika Any chance you can attach some example files:
I can generate some on Monday, but I think any normal inputs, along with a metadata.csv file that has all NA
values for date
and ct
should trigger the error.
Thanks @dfornika. Unfortunately I no longer have access to prior data as I have moved on from my role when I originally helped on the project.
Ok no problem, I'll dig something up as soon as I can.
I had to give them .txt
extensions so they'd be accepted here for upload.
I grabbed a couple of SRA samples, downsampled them to ~25x coverage and ran them through our artic pipeline, then through ncov-tools v1.9.1.
I might need to follow up with the merged.qc.csv
file. I think that file may be generated downstream of the error we're seeing.
metadata.txt SRR27382851-1-25x.amplicon_depth.txt SRR27382851-1-25x.per_base_coverage.txt SRR27382852-1-25x.amplicon_depth.txt SRR27382852-1-25x.per_base_coverage.txt
Hmm I don't see a merged.qc.csv
in our outputs. It looks like that's being generated from the artic qc.csv
file(?) I don't think we provide that as an input.
We've encountered this bug before and the way we got past it was just to add in one integer value to the ct
column, usually for a control sample. Not the best method but it gets past the error
From our docs:
If there are any blank cells in the metadata table, replace them with an NA value. If the entire
ct
column consists of NA, this will cause problems for ncov-tools. In this case, either try to pull the correct Ct values from the database or use a placeholder Ct value such as 0 or 100
That sounds like a reasonable work-around. On our side, we discovered that there was an upstream issue in our process that pulls metadata from our database for use in this pipeline that was causing us to miss most of our Ct and collection date values for some recent runs. We've got that issue fixed now, so we're less likely to run into this in the near future.
But the fix seems fairly simple so it would be nice to have the issue resolved so it won't arise again, if possible.
@dfornika @DarianHole I created a branch which automatically addresses the metadata ct issue. When the metadata is imported, if all the CT values are NA, it automatically fills them with 0: https://github.com/jts/ncov-tools/blob/3028cc610830c13e0db1a1bce464e0feeb382fdf/workflow/scripts/plot/plot_qc_sequencing.R#L125
I also addressed issue #110 and added N
as a key in the base dictionary:
https://github.com/jts/ncov-tools/blob/3028cc610830c13e0db1a1bce464e0feeb382fdf/workflow/scripts/format_pileup.py#L15
I noticed you addressed it on your end, but hopefully this helps. Let me know if it works out and I can merge the code and do a release.
We have an automated system that pulls metadata (collection dates and Ct values) for all samples prior to running artic & ncov-tools. We've recently run into a situation where no samples had metadata available (all
NA
values fordate
andct
in the metadata file).The error we've run into is occuring in the
plot_qc_sequencing.R
script, where themerged
dataframe here::https://github.com/jts/ncov-tools/blob/7a19778c644594fe17ce4fc703560e97b149aa60/workflow/scripts/plot/plot_qc_sequencing.R#L5
...is empty, resulting in this error message:
We haven't confirmed, but a similar error may arise in the other plotting functions in the script that use the metadata.