Open hillarymarler opened 1 week ago
Should we limit the columns that can be used for id_cols?
A limitation on column arguments for id_cols could be good to consider.
Just a thought process for this function: If a different column, like the monitoring location name, is being compared, and not two characteristics (like for this example when just TOTAL PHOSPHORUS, MIXED FORMS_UNFILTERED_AS P_UG/L is being compared for two monitoring location, would the function TADA_TwoCharacteristicScatterplot be a bit 'misleading' as it's only a single characteristic? Since both y axis are based on the same characteristic, wouldn't a single y column be sufficient if there's only one characteristic, or should there be considerations on keeping the same scale when it is the same characteristic (ex. 0 to 100 MG/L for both y-axis 1 and y-axis 2?).
Above is the view on the legend labels when id_cols are for monitoring location names to try to address this issue.
Below is an example of a scatterplot I modified for a demo. See (https://usepa.github.io/EPATADA/articles/TADAWaterSciConWorkshopDemo.html) if you want to see how to create the data set used in the example below.
Maybe it would be possible to conditionally remove the 2nd y-axis if the same characteristic is plotted in both traces? But base the scale on both traces?
What do you think of changing the name to TADA_TwoGroupScatterplot? That might be more descriptive if we are making it flexible enough to accommodate different id_cols inputs.
Or is this starting to get so convoluted that it may make sense to create a separate function for comparing different locations rather than different characteristics? We could discuss tomorrow.
`# create two characteristic scatterplot using TADA_TWoCharacteristicScatterplot twochar_scatter <- TADA_TwoCharacteristicScatterplot(data %>% dplyr::filter(ActivityStartDate > "2014-12-31", TADA.ComparableDataIdentifier == "SPECIFIC CONDUCTANCE_TOTAL_NA_US/CM @25C"), id_cols = "ATTAINS.assessmentunitname", groups = c("San Juan River (Navajo bnd at Hogback to Animas River)", "Animas River (San Juan River to Estes Arroyo)")) %>%
plotly::layout(yaxis2 = list(overlaying = "y", side = "right", title = "", visible = FALSE), title = TADA_InsertBreaks("SPECIFIC CONDUCTANCE for the San Juan and Animas Rivers Over Time"))
twochar_scatter <- TADA_TwoCharacteristicScatterplot(data %>% dplyr::filter(ActivityStartDate > "2014-12-31", TADA.ComparableDataIdentifier == "SPECIFIC CONDUCTANCE_TOTAL_NA_US/CM @25C"), id_cols = "ATTAINS.assessmentunitname", groups = c("San Juan River (Navajo bnd at Hogback to Animas River)", "Animas River (San Juan River to Estes Arroyo)")) %>%
plotly::layout(yaxis2 = list(overlaying = "y", side = "right", title = "", visible = FALSE), title = TADA_InsertBreaks("SPECIFIC CONDUCTANCE for the San Juan and Animas Rivers Over Time"))`
Describe the bug
TADA.ComparableDataIdentifier is used for the legend labels in TADA_TwoCharacteristicScatterplot even when a different column has been selected for id_cols. This is confusing when comparing the same characteristic between two monitoring locations, for example:
To Reproduce
Expected behavior
Labels for the legend should reflect the column selected in id_cols.
Bug fixes should include all the following work:
[ ] Create or edit the function/code.
[ ] Document all code using line/inline and/or multi-line/block comments to describe what is does.
[ ] Create or edit tests in tests/testthat folder to help prevent and/or troubleshoot potential future issues.
[ ] Create or edit the function documentation. Include working examples.
[ ] Update or add the new functionality to the appropriate vignette (or create new one).
[ ] If function/code edits made as part of this issue impact other functions in the package or functionality in the shiny app, ensure those are updated as well.
[ ] Run TADA_UpdateAllRefs(), TADA_UpdateExampleData(), styler::style_pkg(), devtools::document(), and devtools::check() and address any new notes or issues before creating a pull request.
[ ] Run more robust check for releases: devtools::check(manual = TRUE, remote = TRUE, incoming = TRUE)