Open BryceMcManus-NOAA opened 1 year ago
@BryceMcManus-NOAA : Looks good to me.
The mutate
line here:
count(Zone_ID) %>%
mutate(obs = n/sum(n) * 100) %>%
filter(n >= 10) %>%
This so the percentages are done on all the data in scallop0322MainDataTable
, not the filtered data, right? If so, then I like it.
Heads up 1: We may need to adjust the plots further based on reviewer/editor comments. For example, we may be asked to write out "observations" or change "CPUE (mean)" to "Average CPUE." Or not. The other thing I could see is being being asked to move the legend inside the figure -- we have alot of white space in the bottom right of the map.
Heads up 2: I think this is on your list, but fiddling with the colorization might be needed. I think we could do something like:
custom_breaks<-classInt::classify_intervals(scallop0322MainDataTable $CPUE, 5, style="jenks") #or classInt:classIntervals()
And then pass this to the zone_summary(breaks=custom_breaks). This would not work for the maps of obs or percentages, I don't think.
The
mutate
line here:count(Zone_ID) %>% mutate(obs = n/sum(n) * 100) %>% filter(n >= 10) %>%
This so the percentages are done on all the data in
scallop0322MainDataTable
, not the filtered data, right? If so, then I like it.
Right, the percentages are calculated before the filter. Then zone_summary()
simply uses those pre-calculated values in the plot.
Heads up 1: We may need to adjust the plots further based on reviewer/editor comments. For example, we may be asked to write out "observations" or change "CPUE (mean)" to "Average CPUE." Or not. The other thing I could see is being being asked to move the legend inside the figure -- we have alot of white space in the bottom right of the map.
Those changes will require me to update zone_summary()
. I'll have to add a legend title argument, legend position etc. which is pretty straightforward.
Heads up 2: I think this is on your list, but fiddling with the colorization might be needed. I think we could do something like:
custom_breaks<-classInt::classify_intervals(scallop0322MainDataTable $CPUE, 5, style="jenks") #or classInt:classIntervals()
And then pass this to the zone_summary(breaks=custom_breaks). This would not work for the maps of obs or percentages, I don't think.
This is bit tricky since the breaks are being determined before the average CPUE is calculated in your example, so the breaks will be way off. And classify_interval()
returns a character interval which will return an error since the breaks
argument in zone_summary()
requires a numeric vector. To make this work you'd have to do this:
interval_to_numeric <- function(int) {
int <- gsub("\\[|\\(|\\)|\\]", "", int) # remove brackets and parentheses
int <- strsplit(int, ",") # split by comma
int <- as.numeric(unlist(int)) # convert to numeric vector
int[!duplicated(int)] # remove duplicated breaks
}
custom_breaks <- function(var, brks = 5, style = "fisher", round = 0) {
int <- classInt::classify_intervals(var, n = brks, style = style)
int <- levels(int)
round(interval_to_numeric(int))
}
scallop0322MainDataTable %>%
add_count(Zone_ID) %>%
filter(n >= 10) %>%
group_by(ZoneID) %>%
mutate(mean_cpue = mean(CPUE)) %>%
ungroup() %>%
zone_summary(project = "scallop0322",
spat = "scallop0322TenMNSQRSpatTable",
zone.dat = "Zone_ID",
zone.spat = "MN10SQID",
output = "tab_plot",
var = "CPUE",
fun = "mean",
breaks = custom_breaks(.$mean_cpue, 5, "fisher"),
count = FALSE,
na.rm = TRUE)
## $table
## # A tibble: 263 × 2
## Zone_ID CPUE
## <chr> <dbl>
## 1 377311 1728.
## 2 377422 2060.
## 3 377423 2143.
## 4 377424 1954.
## 5 377425 1932.
## 6 377431 1842.
## 7 377432 1890.
## 8 377433 2195.
## 9 377434 1799.
## 10 377441 1871.
## # … with 253 more rows
## # ℹ Use `print(n = ...)` to see more rows
##
## $plot
This is bit tricky since the breaks are being determined before the average CPUE is calculated in your example, so the breaks will be way off. And classify_interval() returns a character interval which will return an error since the breaks argument in zone_summary() requires a numeric vector. To make this work you'd have to do this:
oh ...right.... Thanks for digging into this.
@BryceMcManus-NOAA let's shelve the breaks for now. The filter
(or mutate %>% filter
) solution should work.
This will eventually got into Dev -- but don't code it up yet, Dev is changing rapidly and I'm not confident it my ability to resolve merge conflicts.
@BryceMcManus-NOAA -- dev now works and is fairly stable. That happened faster than I expected. If you have improvements ready, please go for it. If not, no big deal.
@mle2718 Great, I can start adjusting the plots tomorrow morning.
@BryceMcManus-NOAA : one thing that I was thinking about..just for the "closure areas" is it possible to divide "total catch" by the area (in km^2) of the closure area? If it's easy, that would be a great improvement to our "this is how the call area--> wind area--> lease areas evolve" story.
Including zones that have less than 10 or so observations makes the scallop zone plots smaller, cluttered, and more difficult to understand. I think the easiest way to improve them, without changing
zone_summary()
, is to filter out zones with fewer than 10 (or possibly 5) observations before runningzone_summary()
:For the percentage plots, we'll need to add a zone percentage column as well as pass a placeholder function that returns the value it's given. It's a bit hacky but it gets the job done. Also, and this kind of a pain, for some reason the spatial plots can't be edited after they're created, unlike other ggplots. So, for example, you can't add a title or change the legend text, which is why the placeholder function is call
%
, sincezone_summary()
uses the name of the function. I haven't found a solution that doesn't involve changingzone_summary()
.Here's how the CPUE plot may look after filtering:
@mle2718 does this sound good to you?