Ok! Thanks @KELSEYDOWLING7!
I'll keep track of my to-dos here and update as I go.
The weekly operations report finished successfully. I may have run it twice by mistake.
@KELSEYDOWLING7 I ran the CSVs and received this R error:
No worries! When it runs twice the output has the same name, so it just replaces the previous one anyway. But the tables look good.
That's so weird about the csv files. I don't know if we should just put the csv file code back at the end of the report code for now, since it worked before?
Did it work locally? You might have a package loaded in your local environment that you aren't explicitly loading in the code.. I could try it on my machine..
If you'd like to put it back at the end of the report for now, I'm willing to test it again.
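One quick way to check what's attached in a local session, for reference (base R only; the specific package name below is just an example):

```r
search()                            # lists attached packages/environments
sessionInfo()                       # fuller picture: versions, loaded namespaces
"dbplyr" %in% loadedNamespaces()    # spot-check a specific package
```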
It looks like you are installing dbplyr at line 42 but not loading it.. but it seems like it wasn't explicitly loaded in the Consolidated_Weekly_Report_4Pipeline_BQupdated_06082023.Rmd either..
Huh ok well that's good to know at least. I'll remove that.
It looks like I don't have dbplyr installed or loaded anywhere in the csv code on my end... I don't know if that install came from an older push that didn't get removed somehow?
Ok. When I ran it locally it broke at L455:

```r
recr_resp <- tbl(con, "participants_JP") %>%
```

`con` is never defined, so maybe there is a line that you forgot to copy/paste from the other report where that DBI connection is defined..
At L459 in Consolidated_Weekly..

```r
con <- dbConnect(
  bigrquery::bigquery(),
  project = project,
  dataset = "FlatConnect",
  billing = billing
)
```
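Once that block is pasted into the CSV code, a couple of quick sanity checks (assuming the connection and credentials succeed; the query is purely illustrative):

```r
# The dataset is set on the connection, so unqualified table names
# resolve against FlatConnect
DBI::dbListTables(con)
DBI::dbGetQuery(con, "SELECT COUNT(*) AS n FROM participants_JP")
```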
Yup. You're absolutely right. That is supposed to be in the CSV code
It's a bit odd, and maybe unintentional, that the operations report both downloads the data and uses dbConnect.. Maybe we can clean that up at some point in the future. I know this report has switched hands, so it makes sense that this has come up.
Right after

```r
recr_count1 <- NULL
for (stage in unique(recr_count$stage)) {
  recr_count1 <- dplyr::filter(recr_count, stage == stage) %>%
    arrange(stage, factor(d_827220437)) %>%
    mutate(cum.count = cumsum(count), label.count = cumsum(count) - 0.1 * count)
}
```
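(As an aside: inside dplyr::filter(), data masking means both sides of stage == stage refer to the column, so the loop variable is shadowed and each pass recomputes over the whole table, overwriting recr_count1. If per-stage cumulative counts were the intent, a grouped version along these lines might be closer — an assumption about intent, not the report's actual code:)

```r
library(dplyr)

recr_count1 <- recr_count %>%
  arrange(stage, factor(d_827220437)) %>%
  group_by(stage) %>%                          # cumulate within each stage
  mutate(cum.count   = cumsum(count),
         label.count = cum.count - 0.1 * count) %>%
  ungroup()
```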
I ended up changing several of the tables that used `con` to instead use the tibble `data` where I could. Made more sense to me and seemed to work the same. I can try to fix the ones on the CSV-only side to see if it helps.
Ok. The `con` is beneficial because it doesn't load the data into R. Instead it uses a connection to BigQuery and performs the computation there. It can be less intuitive to use but it saves sooo much memory. But either way, we should use one method or the other.
That can be a long-term, low priority thing though.. if you want you could just paste those missing lines in and test it again..
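To illustrate the memory point (a sketch assuming the `con` from above; the grouping field is borrowed from the snippet earlier, and the aggregation itself is just an example):

```r
library(dplyr)

# Lazy tbl: the verbs are translated to SQL and run inside BigQuery;
# only the small summary crosses the wire at collect()
recr_summary <- tbl(con, "participants_JP") %>%
  count(d_827220437) %>%
  collect()

# By contrast, collecting the whole table first pulls every row into
# local RAM before any computation happens:
# data <- tbl(con, "participants_JP") %>% collect()
```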
I think when you test reports locally it's important to clear your environment first to make sure that you don't have any libraries or variables or connections loaded from previous runs of other reports.
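A couple of ways to do that (restarting the R session is the thorough option, since rm() doesn't detach packages or close connections):

```r
rm(list = ls())   # drops variables from the global environment
gc()              # releases the memory they held

# Packages attached by library() and open DBI connections survive rm(),
# so in RStudio use Session > Restart R (Ctrl+Shift+F10) for a clean slate.
```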
Let me know if you can push a fix for the CSV report.
Will open up a new issue for this week's Automation Update
@jacobmpeters I updated the weekly operations report for the Data Destruction changes and then fixed a bug in the csv file code. I was able to merge both to main myself; I just need them built and tested.
The report technically has priority over the CSVs.
Thank you!!