SRJPE / grunID

https://srjpe.github.io/grunID/
Creative Commons Zero v1.0 Universal

add support for external samples #128

Open ergz opened 11 months ago

ergz commented 11 months ago

latest main branch on run-id-database has added tables to support this

ergz commented 10 months ago

From Sarah Brown

Hi All-

Below are examples of the salvage ID.

C230086SWP

C230015CVP

C23 = Chinook and year collected

0086 = Sample number (for state water project)

0015 = Sample number (for central valley project)

SWP or CVP = State water project or Central Valley project
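For reference, the ID layout described above can be checked mechanically. The following is a minimal sketch in base R, not code from grunID: the helper name `parse_salvage_id` is hypothetical, and it assumes the two-digit year belongs to the 2000s and that the sample number is always four digits, as in the two examples.

```r
# Hypothetical helper (not part of grunID): split salvage IDs such as
# "C230086SWP" into year, sample number, and project.
parse_salvage_id <- function(sample_id) {
  # "C" + 2-digit year + 4-digit sample number + project code
  pattern <- "^C([0-9]{2})([0-9]{4})(SWP|CVP)$"
  valid <- grepl(pattern, sample_id)

  year <- rep(NA_integer_, length(sample_id))
  sample_number <- rep(NA_integer_, length(sample_id))
  project <- rep(NA_character_, length(sample_id))

  # Assumption: two-digit years refer to 20xx.
  year[valid] <- 2000L + as.integer(sub(pattern, "\\1", sample_id[valid]))
  sample_number[valid] <- as.integer(sub(pattern, "\\2", sample_id[valid]))
  project[valid] <- sub(pattern, "\\3", sample_id[valid])

  data.frame(sample_id, valid, year, sample_number, project,
             stringsAsFactors = FALSE)
}

# IDs that do not match the layout are flagged rather than dropped.
print(parse_salvage_id(c("C230086SWP", "C230015CVP", "C241001SVP")))
```

Flagging non-matching IDs instead of discarding them keeps a mistyped project code visible before upload.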

ergz commented 8 months ago

completed, just needs to be merged and a few changes to the UI to confirm things look good; we are also waiting for some test data to run through the system

Talitrus commented 6 months ago

Maybe I'm jumping the gun here, but I saw that the selector for JPE vs salvage is now available in grunID. I tried uploading an Excel file to it, but received some errors from grunID trying to read in rows that were not RFU data. I have attached an example file of salvage data that I believe should be properly formatted for grunID for testing purposes: 20240411_SAL24_EL_BN.xlsx

Settings:

Sample ID type: salvage
Protocol: 384-1
Laboratory: DWR_GeM
Genetic method: SHLK
Sample type: fin clip
Layout type: Split Plate - Early + Late
Plate size: 384
Run genetic calculations for sample after upload: TRUE

Output:

ℹ Adding plate run to database
✔ Plate run added to database with id = 50
ℹ Processing sherlock data
Warning: Expecting numeric in C144 / R144C3: got a date
Warning: Expecting numeric in C145 / R145C3: got a date
Warning: Expecting numeric in C152 / R152C3: got a date
Warning: Expecting numeric in C153 / R153C3: got a date
Warning: Expecting numeric in C160 / R160C3: got a date
Warning: Expecting numeric in C161 / R161C3: got a date
Warning: Expecting numeric in C168 / R168C3: got a date
Warning: Expecting numeric in C169 / R169C3: got a date
Warning: Expecting numeric in C176 / R176C3: got a date
Warning: Expecting numeric in C177 / R177C3: got a date
✔ Sherlock results processing complete
ℹ adding results to database
✖ there was an error attempting to add new raw data, removing plate run associated with this from database, see the error below for more details:
[ ERROR] 2024-04-17 10:43:13.773607 Error attempting insert data into raw_assay_result table, the sample C240869SWP does not exists in the sample table

ergz commented 6 months ago

I see that the samples included in this analysis are of the form "C241001SVP". Is this a typo, or is SVP an expected sample type along with C23XXXXSWP and C23XXXXCVP?

Talitrus commented 6 months ago

This is a typo; it should be C241001SWP. Thanks!

ergz commented 6 months ago

sounds good, I am still seeing a parsing error after upload, so will work on that and let you know to try again

Talitrus commented 6 months ago

Thanks! Judging by the warnings/errors I was getting, I think some of that may be because JPE plates are typically completely full plates, while salvage plates can often be half-empty, leading to the program expecting more data where there is none or where the Excel file contains other metadata.
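If that guess is right, one option is to drop rows that are not RFU readings before any numeric processing. The sketch below is only an illustration in base R and does not reflect how grunID reads the file: the helper `keep_numeric_rows` and the column names `well` and `rfu` are hypothetical, and the input is assumed to have been read in as text.

```r
# Hypothetical helper (not part of grunID): keep only rows whose measurement
# column parses as a number, so metadata rows and empty wells are skipped.
keep_numeric_rows <- function(raw, value_col) {
  # Non-numeric text (dates, labels, blanks) becomes NA here.
  values <- suppressWarnings(as.numeric(raw[[value_col]]))
  is_reading <- !is.na(values)
  kept <- raw[is_reading, , drop = FALSE]
  kept[[value_col]] <- values[is_reading]
  kept
}

# Made-up sheet contents: two readings, one metadata row, two empty rows.
raw <- data.frame(
  well = c("A1", "A2", "Run date", "A3", NA),
  rfu  = c("1523.4", "88.1", "2024-04-11", "", NA),
  stringsAsFactors = FALSE
)
rfu_only <- keep_numeric_rows(raw, "rfu")
print(rfu_only)
```

Filtering on the value column, rather than on a fixed row count, avoids assuming that every plate is full.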

ergz commented 6 months ago

looks like the error is that by default I am looking for the extraction blank samples to generate the threshold:

generate_threshold <- function(con, plate_run, results_table, strategy = "twice average", .control_id="EBK") {

  if (!DBI::dbIsValid(con)) {
    stop("Connection argument does not have a valid connection to the run-id database.
         Please try reconnecting to the database using 'DBI::dbConnect'",
         call. = FALSE)
  }

  plate_run_identifier <- plate_run$plate_run_id
  ...
  ...
  ...

by default I use .control_id="EBK" but we can switch that to anything. What should this value be for the salvage samples? I am guessing the NTC values?

Talitrus commented 6 months ago

NTCs would probably be the analog here.

ergz commented 6 months ago

Ok, pushed updates to let you select between EBK and NTC for calculating the threshold. This does, however, trigger one of the QA/QC rules in place for JPE samples. Let me know once you've tried to upload whether this is correct; otherwise we should open an issue and update the rules to represent salvage samples.
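To illustrate the idea of a selectable control, here is a minimal sketch in base R. It is not grunID's `generate_threshold()`: the function `control_threshold` is hypothetical, the "twice average" strategy is assumed to mean two times the mean RFU of the chosen control wells, and control wells are assumed to carry "EBK" or "NTC" in their sample ID.

```r
# Hypothetical stand-in for the threshold step (not grunID's implementation).
# Assumption: "twice average" = 2 x mean RFU of the selected control wells.
control_threshold <- function(results, control_id = c("EBK", "NTC")) {
  control_id <- match.arg(control_id)
  # Assumption: control wells are identified by the control code in sample_id.
  is_control <- grepl(control_id, results$sample_id, fixed = TRUE)
  if (!any(is_control)) {
    stop("No '", control_id, "' control wells found on this plate",
         call. = FALSE)
  }
  2 * mean(results$rfu[is_control])
}

# Made-up salvage plate with NTC wells only.
plate <- data.frame(
  sample_id = c("NTC-1", "NTC-2", "C241046SWP"),
  rfu = c(100, 140, 5000),
  stringsAsFactors = FALSE
)
print(control_threshold(plate, control_id = "NTC"))
```

Failing loudly when the requested control is absent makes a JPE default applied to a salvage plate easy to spot.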

Talitrus commented 6 months ago

Thanks! I can give it another test run on Friday or so. Salvage may have EBK samples at some point in the future, too, in addition to NTCs, so it's great to be able to select which to use.

Talitrus commented 6 months ago

I tried using the file provided above: https://github.com/SRJPE/grunID/files/15015178/20240411_SAL24_EL_BN.xlsx

Received this error message:

Error in value[3L]: Failed to initialise COPY : ERROR: permission denied for table external_raw_assay_result

ergz commented 6 months ago

looks like I have to update the permissions for this table on the database, just ran

GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "Bryan.Nguyen@water.ca.gov";

can you check if it's working now?

Talitrus commented 6 months ago

I now get a different error message:

✔ Plate run added to database with id = 53
ℹ Processing sherlock data
Warning: Expecting numeric in C144 / R144C3: got a date
Warning: Expecting numeric in C145 / R145C3: got a date
Warning: Expecting numeric in C152 / R152C3: got a date
Warning: Expecting numeric in C153 / R153C3: got a date
Warning: Expecting numeric in C160 / R160C3: got a date
Warning: Expecting numeric in C161 / R161C3: got a date
Warning: Expecting numeric in C168 / R168C3: got a date
Warning: Expecting numeric in C169 / R169C3: got a date
Warning: Expecting numeric in C176 / R176C3: got a date
Warning: Expecting numeric in C177 / R177C3: got a date
✔ Sherlock results processing complete
ℹ adding results to database
✖ there was an error attempting to add new raw data, removing plate run associated with this from database, see the error below for more details:

Error in value[3L]: COPY returned error : ERROR: insert or update on table "external_raw_assay_result" violates foreign key constraint "external_raw_assay_result_sample_id_fkey" DETAIL: Key (sample_id)=(C241046SWP) is not present in table "external_sample".

ergz commented 6 months ago

ok, this makes sense: the database was seeded with only samples 0 to 1000, and this is sample 1046. I will update it to include up to 2000. What do you think is a reasonable number of samples?

Talitrus commented 6 months ago

We're already up to about 2000, last year went up to about 5000. Perhaps we could put in 9000?

ergz commented 6 months ago

staging database updated to include samples 0000-9000 for each of SWP and CVP
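The seeded ID range can be reproduced locally to check an upload before it reaches the foreign key constraint. This is a sketch in base R under stated assumptions, not the script used to seed the database: the helper `salvage_seed_ids` is hypothetical, and the "C" prefix plus two-digit year follow the ID examples earlier in the thread.

```r
# Hypothetical helper (not part of grunID): build every salvage ID for one
# year, numbered 0000 through max_number, for each project.
salvage_seed_ids <- function(year_2digit, max_number = 9000L,
                             projects = c("SWP", "CVP")) {
  numbers <- sprintf("%04d", 0:max_number)
  ids <- outer(numbers, projects,
               function(n, p) paste0("C", year_2digit, n, p))
  as.vector(ids)
}

seeded <- salvage_seed_ids("24")

# IDs from an upload that fall outside the seeded range would fail the
# foreign key check on insert; the last ID here is made up for illustration.
uploaded <- c("C240869SWP", "C241046SWP", "C249500CVP")
missing_ids <- setdiff(uploaded, seeded)
print(missing_ids)
```

Running a check like this before upload turns a database constraint error into a readable list of offending IDs.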

Talitrus commented 5 months ago

I got past the previous error, but now receive this:

Test Not Passed: 2 of 3 Positive DNA were not above threshold for plate with id: '57' on assay: '1'

Salvage assays may only have one positive control replicate.
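One way to accommodate a single replicate is to evaluate however many positive controls are present instead of assuming three. The sketch below is an illustration in base R only: `positive_controls_pass` and its `min_fraction` parameter are hypothetical, and the pass rule is an assumption rather than grunID's actual QA/QC rule.

```r
# Hypothetical QA/QC check (not grunID's rule): pass when at least
# min_fraction of the positive controls present exceed the threshold.
positive_controls_pass <- function(pos_rfu, threshold, min_fraction = 1) {
  if (length(pos_rfu) == 0) {
    stop("No positive control wells supplied", call. = FALSE)
  }
  n_above <- sum(pos_rfu > threshold)
  (n_above / length(pos_rfu)) >= min_fraction
}

# A salvage plate with one positive control replicate (made-up values).
print(positive_controls_pass(pos_rfu = 5000, threshold = 240))

# A plate with three replicates, two of which are below the threshold.
print(positive_controls_pass(pos_rfu = c(5000, 100, 90), threshold = 240))
```

Expressing the rule as a fraction of the wells present lets the same check cover JPE plates with three replicates and salvage plates with one.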

Talitrus commented 3 months ago

Hey @ergz, just wanted to check in on whether you were able to make modifications to address the control sample counts being under 3.

Thanks!