Analyticsphere / analyticsPiplelines

Automation Update 8/8 #12

Closed KELSEYDOWLING7 closed 4 weeks ago

KELSEYDOWLING7 commented 2 months ago

Hi Jake, these weekly reports are currently failing in GCP in order of priority

All of these run locally for me, so anything you can see on GCP would be helpful to know. I'll have any PRs done before our meeting.

KELSEYDOWLING7 commented 2 months ago

PRs completed for both repos!

jacobmpeters commented 2 months ago

All of these are built, so I'll begin testing now..

KELSEYDOWLING7 commented 2 months ago

Great!

jacobmpeters commented 2 months ago

The Weekly Biospecimen Report failed with this R error:

Screenshot 2024-08-08 at 4 06 24 PM
jacobmpeters commented 2 months ago

The Module Metrics failed with the same "col" vs. "column" issue that we saw last time.

Screenshot 2024-08-08 at 4 09 23 PM
jacobmpeters commented 2 months ago

The Module Metrics failed with the same "col" vs. "column" issue that we saw last time.

Screenshot 2024-08-08 at 4 09 23 PM

I fixed this myself and I'll test it again...

KELSEYDOWLING7 commented 2 months ago

Oh, thank you! I'll have to update all other reports to have column instead of col. That must be a GCP update.

KELSEYDOWLING7 commented 2 months ago

Table 7.3 in the Biospecimen report keeps throwing me off because it's not an issue locally.

jacobmpeters commented 2 months ago

Oh, thank you! I'll have to update all other reports to have column instead of col. That must be a GCP update.

Yeah I don't understand why this is suddenly causing problems. I did not change the version of tbl_cross in the container
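
One way to see which spelling the function in the container accepts is to inspect its formal arguments. This is a minimal base R sketch: `accepted_args` and `demo_cross` are hypothetical names made up for illustration, and the stand-in function only mimics the `row`/`col` arguments quoted in this thread.

```r
# Return the candidate argument names that a function actually accepts.
accepted_args <- function(fn, candidates) {
  candidates[candidates %in% names(formals(fn))]
}

# Stand-in for illustration only (hypothetical, not the real tbl_cross).
demo_cross <- function(data, row = NULL, col = NULL) NULL

# Which of the two spellings does the stand-in accept?
accepted_args(demo_cross, c("col", "column"))

# Inside the GCP container, the same check could be pointed at the real function:
# accepted_args(gtsummary::tbl_cross, c("col", "column"))
# packageVersion("gtsummary")
```

Running the commented lines both locally and in the container would show whether the two environments really have the same package version.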

KELSEYDOWLING7 commented 2 months ago

It's a quick fix regardless. I'll be pushing a new Weekly Biospecimen Report PR in just a minute.

jacobmpeters commented 2 months ago

I made the fix to the Weekly Module Metrics and tested it, but I received this error. It seems like the opposite error of what we received before.

Screenshot 2024-08-08 at 4 42 14 PM
jacobmpeters commented 2 months ago

We're still getting this error with the Biospecimen report..

Screenshot 2024-08-08 at 4 59 49 PM
KELSEYDOWLING7 commented 2 months ago

Jake, it looks like one of the Biospecimen QC CSVs didn't run. The Duplicates report CSV should have been populated. It populated when I ran it locally.

KELSEYDOWLING7 commented 1 month ago

For the follow-up this week:

jacobmpeters commented 1 month ago

We got a tbl_cross error in the Biospecimen Report.

Screenshot 2024-08-15 at 3 16 28 PM
KELSEYDOWLING7 commented 1 month ago

Ok, I'll fix that now and make sure there aren't any more of those instances. I thought I had caught them all... I hope it's not a merge conflict.

KELSEYDOWLING7 commented 1 month ago

It's definitely a merge conflict. They're all col = in the tbl_cross functions on my local drive, but somehow pushing that to GitHub didn't change them... I'll work on a new PR and try to change them all manually again if need be.
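
Rather than checking each call by hand, the report files could be scanned for the unwanted spelling before pushing. This is a rough base R sketch: `find_lines` is a hypothetical helper, the demo file is a temporary stand-in, and the pattern is an assumption that may also catch `column =` outside of `tbl_cross` calls.

```r
# Scan files for lines matching a regular expression and report where they occur.
find_lines <- function(paths, pattern) {
  hits <- lapply(paths, function(p) {
    lines <- readLines(p, warn = FALSE)
    idx <- grep(pattern, lines, perl = TRUE)
    if (length(idx) == 0) return(NULL)
    data.frame(file = p, line = idx, text = trimws(lines[idx]),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)  # NULL when nothing matched in any file
}

# Demo on a temporary file standing in for a report (hypothetical content).
demo_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  'tbl_cross(row = Site, col = biocol.appoint, percent = "row")',
  'tbl_cross(row = Site, column = biocol.appoint, percent = "column")'
), demo_file)

# Matches `column =` used as an argument name, but not the value "column".
stale <- find_lines(demo_file, "\\bcolumn\\s*=")
print(stale[, c("line", "text")])
```

For a whole checkout, the paths could come from `list.files(".", pattern = "\\.Rmd$", recursive = TRUE, full.names = TRUE)`.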

jacobmpeters commented 1 month ago

Ok! That's definitely annoying. I'll wait for your PR and work on the CSVs..

jacobmpeters commented 1 month ago

Seems like we're still getting the same error:

Screenshot 2024-08-15 at 4 21 10 PM
KELSEYDOWLING7 commented 1 month ago

Huh, I really don't know how that's possible... the main branch has the correct code that doesn't use 'column':

tab___1 <- table1 %>%
  mutate(biocol.appoint = factor(biocol.appoint, levels = c("Baseline Collection", "No Baseline Collection"))) %>%
  dplyr::select(Site, biocol.appoint) %>%
  tbl_cross(
    row = Site,
    col = biocol.appoint,
    digits = c(0, 1),
    percent = "row",
    label = list(Site = "Site", biocol.appoint = " "),
    missing = "ifany",
    margin_text = "Total"
  )

jacobmpeters commented 1 month ago

Hmm.. let me take a look. It might not have built completely before I kicked off the test run.. I'll get back to you on that..

jacobmpeters commented 1 month ago

The Biospecimen QC ran and the .xlsx file should have been delivered to Box.. Can you check?

KELSEYDOWLING7 commented 1 month ago

Yes, those look perfect!! That one is good to set to complete.

jacobmpeters commented 1 month ago

The Biospecimen CSVs Report script is trying to use a local_drive variable that isn't defined..

Screenshot 2024-08-15 at 4 40 00 PM
KELSEYDOWLING7 commented 1 month ago

Ah ok that's my fault, I didn't include the intro in this version but "{local_drive}" can just be removed. I can do a PR for that real quick but I understand you may not have time to build and run it
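
A script can avoid depending on a `local_drive` variable being defined elsewhere by building the output path in one small function. This is only a sketch in base R: `output_path` and the file name `duplicates.csv` are hypothetical, not names taken from the report code.

```r
# Build an output path: prefix a local directory only when explicitly requested.
output_path <- function(filename, write_local = FALSE, local_dir = "") {
  if (write_local && nzchar(local_dir)) {
    file.path(local_dir, filename)
  } else {
    filename
  }
}

# Default behaviour (e.g. in the container): no local prefix.
output_path("duplicates.csv")

# Local run with an explicit directory.
output_path("duplicates.csv", write_local = TRUE, local_dir = "data")
```

With defaults like these, the script still runs when the intro section that sets the local path is left out.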

jacobmpeters commented 1 month ago

I'm getting a useless error for the Biospecimen Report. Do you have any thoughts on where this might be breaking? It doesn't even get to the first chunk so I'm confused..

Screenshot 2024-08-15 at 4 49 04 PM
jacobmpeters commented 1 month ago

Did this section of the header change recently?

Screenshot 2024-08-15 at 4 52 30 PM
KELSEYDOWLING7 commented 1 month ago

Huh, that I don't know. Maybe it's this part in the beginning? I think we can delete it from the Report code since we no longer have the CSVs:

if (!is.null(Sys.getenv("USE_TEST_BOX_FOLDER")) && Sys.getenv("USE_TEST_BOX_FOLDER") != "") {
  use_test_box_folder <- as.logical(Sys.getenv("USE_TEST_BOX_FOLDER"))
}

if (use_test_box_folder) {
  boxfolder <- 222593912729 # test box folder
} else {
  boxfolder <- 221280601453 # destination of CSV files (not pdf)
}

# Making sure personal C drives aren't referenced if this code is being used by others

# Change to FALSE if referencing this code

write_to_local_drive = F #F

local_drive="C:/Users/dowlingk2/Documents/Module-Missingness-and-Metrics/data/"

# This function below is put before any write.csv functions, and "filename" is updated. It determines whether the file will be created locally or not.

filename=

local_drive= ifelse(write_to_local_drive, "C:/Users/dowlingk2/Documents/Module-Missingness-and-Metrics/data/", "")
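
If this block is kept rather than deleted, note that `Sys.getenv()` returns an empty string for an unset variable, not `NULL`, so the `is.null()` check never fails. Unless `use_test_box_folder` is given a default somewhere earlier (not shown here), it would be undefined whenever the variable is unset. A minimal sketch of a flag reader with an explicit default follows; `read_flag` is a hypothetical helper name.

```r
# Read a TRUE/FALSE flag from an environment variable, with a fallback default.
read_flag <- function(name, default = FALSE) {
  raw <- Sys.getenv(name, unset = "")
  if (!nzchar(raw)) return(default)   # unset or empty: use the default
  flag <- as.logical(raw)             # "TRUE", "true", "T", "FALSE", ... are parsed
  if (is.na(flag)) default else flag  # unparseable text: use the default
}

# Variable unset: the default applies.
Sys.unsetenv("USE_TEST_BOX_FOLDER")
use_test_box_folder <- read_flag("USE_TEST_BOX_FOLDER")

# Variable set: its value is used.
Sys.setenv(USE_TEST_BOX_FOLDER = "TRUE")
use_test_box_folder_set <- read_flag("USE_TEST_BOX_FOLDER")
```

Falling back to `FALSE` is a design choice made for this sketch; stopping with an error on unparseable values would be equally reasonable.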

KELSEYDOWLING7 commented 1 month ago

The yaml part was part of the weird merge conflicts. It should look like this:


title: "Biospecimen Weekly Metrics RMD"
author: "Kelsey Sanchez"
date: "Data Extracted and Report Ran: `r Sys.Date()`"
header-includes:
  - \usepackage[labelformat=empty]{caption}
  - \usepackage{placeins}
  - \usepackage{booktabs}
  - \usepackage{pdflscape}
output:
  pdf_document:
    extra_dependencies: ["float"]
    toc: true
    keep_tex: yes
    fig_width: 7
    fig_height: 5
    fig_caption: true
    df_print: paged

jacobmpeters commented 1 month ago

The yaml part was part of the weird merge conflicts. It should look like this:

title: "Biospecimen Weekly Metrics RMD"
author: "Kelsey Sanchez"
date: "Data Extracted and Report Ran: `r Sys.Date()`"
header-includes:
  - \usepackage[labelformat=empty]{caption}
  - \usepackage{placeins}
  - \usepackage{booktabs}
  - \usepackage{pdflscape}
output:
  pdf_document:
    extra_dependencies: ["float"]
    toc: true
    keep_tex: yes
    fig_width: 7
    fig_height: 5
    fig_caption: true
    df_print: paged

Ok. I think I just fixed it.. there was a missing indent

output:
  pdf_document:
    extra_dependencies: ["float"]
    toc: true
    keep_tex: yes
    fig_width: 7
    fig_height: 5
    fig_caption: true
    df_print: paged 
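
An indentation slip like this can be caught before a build by parsing the header on its own. The sketch below assumes the `yaml` package is installed (it is a dependency of `rmarkdown`). Both header strings are trimmed, hypothetical examples, and the mis-indented one is only a guess at the kind of mistake involved, not the actual file contents.

```r
library(yaml)

# Correctly nested: pdf_document sits under output.
good_header <- "
output:
  pdf_document:
    toc: true
    fig_width: 7
"

# Hypothetical mistake: pdf_document is not indented under output.
bad_header <- "
output:
pdf_document:
  toc: true
  fig_width: 7
"

good <- yaml.load(good_header)
bad  <- yaml.load(bad_header)

# TRUE only when the pdf_document options are nested under output.
has_pdf_options <- function(header) is.list(header$output$pdf_document)

has_pdf_options(good)
has_pdf_options(bad)
```

For a real report, `rmarkdown::yaml_front_matter()` returns the parsed header of an .Rmd file, so the same check could be run on its result.
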
KELSEYDOWLING7 commented 1 month ago

Oh ok perfect!