HARPgroup / HARParchive

This repo houses HARP code development items, resources, and intermediate work products.

Week of 7/15/2024 #1309

Open rburghol opened 1 month ago

rburghol commented 1 month ago
mwdunlap2004 commented 1 month ago

I added a couple of files to the master branch. mon_lm_analysis.r uses the new mon_lm functions I wrote to write out a CSV of the stats. That CSV can then be used by the plot_save file I added, which takes the stats, dataset name, label name, and write location and makes a PNG. The updated mon_lm_stats and mon_lm_plot functions are both included in the lm_analysis_plots_copy.R file I added as well, and they work with daily data too. Right now the scripts still source the original file, so they won't work as-is, but I figured it would be better to edit the original if we like the changes than to try to point everything at this new location.

mwdunlap2004 commented 1 month ago

I ran all of the methods on all three datasets for gage 01665500 (the Rapidan River). There were a few issues with my methods (we didn't have the week column in nldas2), and I had to adjust my function calls because they weren't picking up the most up-to-date versions. But our methods work, and I was able to make plots and stat CSVs for all three datasets relatively easily.

mwdunlap2004 commented 1 month ago
[Screenshots: 2024-07-17 at 11:29:39 AM, 11:30:01 AM, and 11:31:51 AM]

This is what the error looks like on my end when I try to convert data_lm into a JSON

COBrogan commented 1 month ago

Okay. I'm guessing that error results from trying to export the R6 class that we created as plotBin. Per the documentation for toJSON:

Description: Convert an R object into a corresponding JSON object. Lists with unnamed components are not currently supported.
Usage: toJSON( x, indent=0, method="C" )
Arguments: x, a vector or list to convert into a JSON object

So, we can probably only export the lists within the object. We could restructure plotBin at this point and just make it a list. It no longer needs the full R6 functionality because it is only storing data and not the plot. This would likely get around this issue for us.

COBrogan commented 1 month ago

@mwdunlap2004 per our discussion, check out this example for lists. I was indexing my list incorrectly during the meeting. Note the use of [[i]] instead of [i] when trying to get a list element! Maybe this will help you, maybe not. As long as we get the residual plot at the end of the day, feel free to write out the data using any format you want. The below loop will generate unique data in each loop using rnorm. It will then store the full lm model, the full data, the rsq, and the stats all in testList!

testList <- list(lms = list(),stats=list(),rsq = numeric(),data=list())
for(i in 1:12){
  testDF <- data.frame(1:50,rnorm(50))
  testLM <- lm(testDF$rnorm.50.~testDF$X1.50)
  testStats <- summary(testLM)
  rsq <- testStats$adj.r.squared

  testList$lms[[i]] <- testLM
  testList$stats[[i]] <- testStats
  testList$rsq[i] <- rsq
  testList$data[[i]] <- testDF
}

testList$data[[1]]
testList$rsq
testList$stats
class(testList$lms[[1]])
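
The `[[i]]` versus `[i]` point above is easy to check in isolation. The following is a minimal sketch using only base R and the built-in `cars` dataset; the names `fits`, `one_bracket`, `two_bracket`, and `store` are made up for illustration and are not part of the HARP code.

```r
# Two small regressions on the built-in cars data (stand-in for real fits)
fits <- list(lm(dist ~ speed, data = cars), lm(speed ~ dist, data = cars))

one_bracket <- fits[1]    # a list of length 1 that still wraps the model
two_bracket <- fits[[1]]  # the model itself

class(one_bracket)  # "list"
class(two_bracket)  # "lm"

# Assigning with [[i]] stores the whole object in slot i of a list
store <- list()
store[[1]] <- fits[[1]]
length(store)  # 1
```

In short, `[` always returns a sub-list, while `[[` pulls out the element, which is what functions like `summary()` or `coef()` expect.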

HOWEVER, this STILL can't be written to JSON. Apparently I was at least partially incorrect before. The lm objects themselves are not plain data: they are S3 objects (classed lists) that carry components such as the model call and terms, and the same goes for summary(lm). So these can't be written to JSON directly. Instead, you may need to store just the data we want and write that out. See below for how JSON may save some time compared to write.csv():

testList <- list(resid = list(),
                 fitted = list(),
                 coeff=list(),rsq = numeric(),data=list())
for(i in 1:12){
  testDF <- data.frame(1:50,rnorm(50))
  testLM <- lm(testDF$rnorm.50.~testDF$X1.50)
  testStats <- summary(testLM)
  rsq <- testStats$adj.r.squared

  testList$resid[[i]] <- testLM$residuals
  testList$fitted[[i]] <- testLM$fitted.values
  testList$coeff[[i]] <- testStats$coefficients
  testList$rsq[i] <- rsq
  testList$data[[i]] <- testDF
}

#testList has no "stats" element in this version, so convert the whole list
json <- toJSON(testList)
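
To see why a fitted model resists direct JSON export, it can help to look at what `lm()` actually returns. The sketch below uses only base R and the built-in `cars` dataset (a stand-in, not the gage data). It lists the class of each component so that the ones that are not plain vectors stand out; `component_classes`, `parts`, and `plain` are illustrative names.

```r
fit <- lm(dist ~ speed, data = cars)

class(fit)    # "lm"
is.list(fit)  # TRUE: the model is a list with a class attribute

# First class of every component; the call, terms, and qr pieces are not plain data
component_classes <- vapply(fit, function(x) class(x)[1], character(1))
print(component_classes)

# Drop the class, then keep only components that are plain atomic vectors
parts <- unclass(fit)
plain <- parts[vapply(parts, is.atomic, logical(1))]
names(plain)
```

The atomic pieces (coefficients, residuals, fitted values, and so on) are the ones that are straightforward to copy into a plain list before converting to JSON.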
mwdunlap2004 commented 1 month ago

With Connor's help I figured out a way to adjust mon_lm to create a JSON. Right now the only issue on my end is getting the month and rsq lists out of it to make the rsq plots we use.
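
One possible way to get a month and rsq table for plotting is to collect the per-month values into a data frame. This is only a sketch on simulated data, assuming one regression per month; `sim`, `rsq_by_month`, and `rsq_table` are hypothetical names, and the real mon_lm output may be structured differently.

```r
set.seed(1)
# Simulated stand-in: 30 paired values for each of 12 months (not real gage data)
sim <- data.frame(month = rep(1:12, each = 30), x = rnorm(360))
sim$y <- 2 * sim$x + rnorm(360)

# Fit one regression per month and keep only the adjusted R-squared
rsq_by_month <- vapply(
  split(sim, sim$month),
  function(d) summary(lm(y ~ x, data = d))$adj.r.squared,
  numeric(1)
)

# One row per month, ready for write.csv() or plotting
rsq_table <- data.frame(month = as.integer(names(rsq_by_month)),
                        rsq = unname(rsq_by_month))
head(rsq_table)
```

A table like this can be passed straight to `plot(rsq_table$month, rsq_table$rsq)` or written out alongside the other stats.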

[Screenshots: 2024-07-18 at 3:40:11 PM and 3:39:48 PM]
rburghol commented 1 month ago

@COBrogan @mwdunlap2004 It looks like the jsonlite package will serialize R6 objects. https://rdrr.io/cran/jsonlite/man/serializeJSON.html
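
For reference, a round trip through `serializeJSON()` and `unserializeJSON()` might look like the sketch below. It assumes the jsonlite package is installed and uses a small plain list (`stats_out`, a made-up name) rather than the plotBin object, so it shows only the call pattern and does not confirm behavior for R6 classes.

```r
# Runs only if jsonlite is available; otherwise the block is skipped
if (requireNamespace("jsonlite", quietly = TRUE)) {
  # Small plain list standing in for the stats we want to keep
  stats_out <- list(month = 1:3, rsq = c(0.5, 0.25, 0.75))

  json_txt <- jsonlite::serializeJSON(stats_out)   # verbose, type-preserving JSON
  restored <- jsonlite::unserializeJSON(json_txt)  # back to an R object

  print(all.equal(stats_out, restored))
}
```

The JSON this produces records storage types and attributes, so it is much more verbose than regular `toJSON()` output.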

mwdunlap2004 commented 1 month ago

That worked! I pushed the changes to HARParchive. Our mon_lm_analysis now outputs both the full JSON and the CSV of our stats. I'm not sure whether we'll later want to pull the stats from the JSON, but for now it outputs both since that seemed easier.

COBrogan commented 1 month ago

@mwdunlap2004 @rburghol @ilonah22 I know we said we should just move on from writing out the R6 plotBin object, but it was really irking me. So I found an approach via serialization. I will warn everyone: the file itself is pretty ugly. It's far from "pretty" JSON. It's just row after row of bytes ("raw" class in R). But it works! It lets us write out the entire R6 object and read it back in, and it doesn't need any packages for the serialization itself. Food for thought! Stolen mostly from here

#Dummy data
test <- data.frame(a=1:3,b=4:6)
#LM for the dummy data
testLM <- lm(b ~ a,data = test)
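#Our R6 Class plotBin (R6Class comes from the R6 package)
library(R6)
plotBin <- R6Class(
     "plotBin", 
     public = list(
         plot = NULL, data=list(), atts=list(), r_col='',
         initialize = function(plot = NULL, data = list()){ 
             #Note: `self.plot = plot` only creates local variables named
             #self.plot and self.data, so the arguments are discarded and the
             #fields keep their defaults. Use self$plot <- plot and
             #self$data <- data to store them on the object.
             self.plot = plot; self.data=data; 
           }
       )
   )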
#Pulled from lm_analysis_plots: populate a new plotBin R6 with data. Add a list
#for lms and put a lm in there
sample_data <- test
plot_out <- plotBin$new(data = sample_data)
plot_out$atts$lms <- list()
#Store a few regressions using dummy data
plot_out$atts$lms[[1]] <- lm(b ~ a, data = test)
plot_out$atts$lms[[2]] <- lm(a ~ b, data = test)

#A file path to write out the file
fname <- "testser.txt"
#Opens a connection to the file path and writes the data directly. Simplifies
#the formatting for these "raw" bytes that we will create
outCon <- file(fname, "w")
#Serialize plot_out as ascii bytes and convert to Char (character).
mychars <- rawToChar(serialize(plot_out, NULL, ascii = TRUE))
# Write directly to the file using the connection outCon from above
cat(mychars, file=outCon)
#Close outCon, essentially saving the file in fname
close(outCon)

#Now, read in fname via readChar. Then convert to raw, which is expected by
#unserialize
testUnser <- charToRaw(readChar(fname, file.info(fname)$size))
#Voila, our R6 object is here!
unserializedData <- unserialize(testUnser)
unserializedData$atts$lms

Outputs:

> unserializedData
<plotBin>
  Public:
    atts: list
    clone: function (deep = FALSE) 
    data: list
    initialize: function (plot = NULL, data = list()) 
    plot: NULL
    r_col: 
> unserializedData$atts$lms
[[1]]

Call:
lm(formula = b ~ a, data = test)

Coefficients:
(Intercept)            a  
          3            1  

[[2]]

Call:
lm(formula = a ~ b, data = test)

Coefficients:
(Intercept)            b  
         -3            1  
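
The write and read steps above could be wrapped in two small helpers so the same pattern is reusable. This is a sketch under assumptions: `write_serialized()` and `read_serialized()` are hypothetical names, and the example round-trips a plain list holding an lm fit (on the built-in `cars` data) rather than the plotBin object, so it needs no packages.

```r
# Hypothetical helper: serialize any R object as ASCII text and write it to path
write_serialized <- function(obj, path) {
  con <- file(path, "w")
  on.exit(close(con))  # close the connection even if writing fails
  cat(rawToChar(serialize(obj, NULL, ascii = TRUE)), file = con)
  invisible(path)
}

# Hypothetical helper: read the text back and rebuild the object
read_serialized <- function(path) {
  unserialize(charToRaw(readChar(path, file.info(path)$size)))
}

fits <- list(lm(dist ~ speed, data = cars))
tmp <- tempfile(fileext = ".txt")
write_serialized(fits, tmp)
restored <- read_serialized(tmp)

# Compare coefficients before and after the round trip
all.equal(coef(fits[[1]]), coef(restored[[1]]))
```

Base R's `saveRDS()` and `readRDS()` do the same kind of serialization to a file in one call each, which may be worth considering if a human-readable file is not needed.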
mwdunlap2004 commented 4 weeks ago

I changed our mon_lm_analysis function to use Connor's method for creating the JSON. The output is super ugly, like he said it would be, but reading it into the residual plot function I made seems to work. This is what the code looks like. I think the variable names could be improved, but the code works, and we can still output both the stats and the JSON right now.

[Screenshot: 2024-07-29 at 9:09:06 AM]