USEPA / CompTox-ToxCast-tcpl

US EPA's Toxicity Forecaster (ToxCast) Pipeline. More information on the ToxCast program available here: https://www.epa.gov/comptox-tools/toxicity-forecasting-toxcast
https://cran.r-project.org/package=tcpl

How to produce dose-response curves using db 4.1? #215

Open ldecicco-USGS opened 5 months ago

ldecicco-USGS commented 5 months ago

I updated my ToxCast database to prod_internal_invitrodb_v4_1. I'm using version 3.1.0 of tcpl.

When I was using ToxCast 3.5, the following script would produce a plot:

ep <- "NVS_ENZ_hPDE4A1"
cas <- "1912-24-9"

chem_info <- tcplLoadChem(field = 'casn', val = cas)

assay_info <- tcplLoadAcid(fld = "acnm", val = ep)

mc3 <- tcplLoadData(lvl = 3, type = "mc", 
                    fld = c("acid","spid"), 
                    val = list(assay_info$acid,
                               chem_info$spid))
mc4 <- tcplLoadData(lvl = 4, type = "mc", 
                    fld = c("spid", "aeid"), 
                    val = list(chem_info$spid,
                               unique(mc3$aeid)))

tcplPlotM4ID(mc4, lvl = 5)
Error in (function (fmt, ...)  : only 100 arguments are allowed

# Also tried:
tcplPlotM4ID(m4id = mc4$m4id, lvl = 4)
Error in if (!is.na(pars$cnst) & pars$cnst) { : 
  argument is of length zero

Is there a new way to do this with the new database structure?

madison-feshuk commented 5 months ago

Hi Laura, the several tcpl functions used in the past to produce the different plotting outputs have now been consolidated into tcplPlot(). Check out some plotting examples and documentation in the data retrieval vignette: https://cran.r-project.org/web/packages/tcpl/vignettes/Data_retrieval.html#plotting. Once you have the desired mc5, you can use tcplPlot to plot by spid/aeid or m4id. Hope this helps!

ep <- "NVS_ENZ_hPDE4A1"
cas <- "1912-24-9"

chem_info <- tcplLoadChem(field = 'casn', val = cas)
assay_info <- tcplLoadAeid(fld = "acnm", val = ep)

mc5 <- tcplLoadData(lvl = 5, type = "mc",
                    fld = c("aeid","spid"),
                    val = list(assay_info$aeid,
                               chem_info$spid))

tcplPlot(lvl = 5,
         fld = c("spid","aeid"), # fields to query on
         val = list( # value for each field, must be same order as 'fld'
           mc5$spid, # sample id's
           mc5$aeid # assay endpoint id's
         ),
         by = "aeid", # parameter to divide files
         multi = TRUE, # multiple plots per page - output 6 per page if TRUE
         verbose = TRUE, # output all details if TRUE
         output = "pdf", # output as pdf
         fileprefix = "output/upitt") # prefix of the filename

brown-jason commented 5 months ago

This makes me think that we need to deprecate these old plotting functions with an explanation to switch to the new tcplPlot functionality. They still work for the old schemas, but it seems like most people are switching to invitrodb 4.1 or greater.

ldecicco-USGS commented 5 months ago

Perfect, thanks for the tcplPlot code!

ldecicco-USGS commented 4 months ago

Revisiting this issue. I thought I had this working, but now I can't remember what I did. Is there a way to get just the ggplot2 object out of the tcplPlot function? When I run with output = "console", the "Viewer" pane opens in RStudio (instead of "Plots"), but nothing appears. Ideally, I'd like to just get the ggplot object so I can play with it from there.

ep <- "NVS_ENZ_hPDE4A1"
cas <- "1912-24-9"

chem_info <- tcplLoadChem(field = 'casn', val = cas)
assay_info <- tcplLoadAeid(fld = "acnm", val = ep)

mc5 <- tcplLoadData(lvl = 5, type = "mc",
                    fld = c("aeid","spid"),
                    val = list(assay_info$aeid,
                               chem_info$spid))

plot_out <- tcplPlot(lvl = 5,
                     fld = c("spid","aeid"), # fields to query on
                     val = list(mc5$spid, # sample id's
                                mc5$aeid # assay endpoint id's
                                ),
                     by = "aeid", # parameter to divide files
                     multi = FALSE,
                     verbose = TRUE, # output all details if TRUE
                     output = "console")

plot_out

So, is there a way to set it up so that plot_out is a ggplot2 object? Right now it looks like a plotly object, but it's not rendering in my viewer.

packageVersion("tcpl")
[1] ‘3.1.0’

brown-jason commented 4 months ago

@ldecicco-USGS we're investigating this issue. I think there are two parts:

  1. I think with everyone updating to R version 4.4, plotly is broken in RStudio, and
  2. we don't have an option for users to output just the ggplot2 object

I think your use case warrants creating an option in the output so you can get the ggplot2 object

ldecicco-USGS commented 4 months ago

I did just update to 4.4, so that tracks.

I would definitely vote for having a ggplot output option. Plotly's nice for some things, but being able to take the ggplot output and customize it is something I would do (maybe just change a theme, add an annotation, or who knows what).

brown-jason commented 4 months ago

Agreed, we will use this ticket to add ggplot output as an option.

FYI, for the R 4.4 issues, see https://github.com/rstudio/rstudio/issues/14603

ldecicco-USGS commented 1 month ago

Any updates on getting ggplot2 outputs? I'd really like to include the ACC value on the plots or make other customizations.

cthunes commented 1 month ago

Hi @ldecicco-USGS, thank you for the suggestion! The requested changes are now in a pull request (#275) and available for testing on the branch "215-add-ggplot-output-option-to-tcplPlot".

To work with the ggplot output, simply set output = "ggplot" and save the result to your environment.

brown-jason commented 2 weeks ago

@ldecicco-USGS have you been able to successfully use this branch?

ldecicco-USGS commented 2 weeks ago

Yup, looks good!

plot_out <- tcplPlot(
         fld = c("spid", "aeid"), # fields to query on
         val = list( # value for each field, must be same order as 'fld'
           mc5$spid, # sample id's
           mc5$aeid # assay endpoint id's
         ),
         by = "aeid", # parameter to divide files
         multi = TRUE, # multiple plots per page - output 6 per page if TRUE
         verbose = FALSE, # output all details if TRUE
         output = "ggplot")

library(ggplot2) # ggtitle() comes from ggplot2
plot_out + ggtitle("My new title")
