Closed by EmilyMarkowitz-NOAA 10 months ago
I'm having trouble connecting to the VPN and will revisit this. At first glance, I'd recommend setting `make_idw_stack(extrapolation.grid.type = "sf")` or `make_idw_stack(extrapolation.grid.type = "sf.simple")`, then switching from `geom_stars()` to `geom_sf()` in your `plot0` function.
The breaks in the second figure look way better than the breaks from the community report and I think do a much better job of representing spatial variability. I'm not sure why it's so slow using the small breaks, but the breaks from the paper were for kg/ha, not kg/km2 so your values are off by a factor of 100.
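The unit mismatch comes down to simple arithmetic: 1 km² is 100 ha, so a density in kg/km² is 100 times the same density in kg/ha. Here is a minimal sketch in base R. The helper names `kgkm2_to_kgha()` and `kgha_to_kgkm2()` are hypothetical (they are not part of akgfmaps), and the break values are the approximate ones recalled in the issue, not confirmed values from the paper.

```r
# Hypothetical helpers (not part of akgfmaps): 1 km^2 = 100 ha
kgkm2_to_kgha <- function(x) x / 100
kgha_to_kgkm2 <- function(x) x * 100

# Approximate paper breaks as recalled in the issue (kg/ha), without the max value
paper_breaks_kgha <- c(0, 0.01, 0.1, 0.5, 1, 2, 3, 7)

# The same breaks expressed in kg/km^2, the units of cpue_kgkm2
paper_breaks_kgkm2 <- kgha_to_kgkm2(paper_breaks_kgha)
print(paper_breaks_kgkm2)
```

Either the data or the breaks can be converted; what matters is that both end up in the same units before they are passed to `set.breaks`.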
Got my VPN working and this takes ~45 seconds:
```r
library(akgfmaps)
library(ggplot2)

start_time <- Sys.time()

# make_idw_stack() is given LONGITUDE, LATITUDE, and CPUE_KGHA columns;
# note that CPUE_KGHA holds kg/km2 values here, so the breaks are in kg/km2 too
jellyfish <- table_raw |>
  dplyr::rename(LONGITUDE = longitude_dd_start,
                LATITUDE = latitude_dd_start,
                CPUE_KGHA = cpue_kgkm2) |>
  make_idw_stack(region = "ebs",
                 grouping.vars = "year",
                 set.breaks = c(0, 1, 10, 100, 1000, max(table_raw$cpue_kgkm2)),
                 extrapolation.grid.type = "sf")

end_time <- Sys.time()
end_time - start_time

# Plot the sf extrapolation grid, one panel per year
ggplot() +
  geom_sf(data = jellyfish$extrapolation.stack,
          mapping = aes(fill = var1.pred),
          color = NA) +
  facet_wrap(~year) +
  scale_fill_viridis_d(direction = -1)
```
Oh! Good catch on converting to kg/ha - can't believe I missed that. Just to be sure, do you still suggest using the `c(0, 1, 10, 100, 1000, max(table_raw$cpue_kgkm2))` breaks over the default breaks presented in the issue, `c(0, 249, 683, 1365, 2794, 7200)`? Or is your vote for the latter/default breaks?
I'd vote for `c(0, 1, 10, 100, 1000, max(table_raw$cpue_kgkm2))`.
In general, I would advise against using 'default' breaks because they may not provide a good representation of distribution for a species. The default setting for set.breaks is Jenks simply because that's the default setting for breaks in ArcMap. I think it can be quite hit-or-miss whether Jenks or any other spatially-agnostic method of break selection provides a good representation of density distributions.
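To illustrate why data-driven breaks can be hit-or-miss, here is a minimal sketch in base R. It uses quantile breaks as a stand-in for Jenks (this is not the Jenks algorithm, and not what akgfmaps does internally), and the CPUE values are made up for illustration. The helper `data_driven_breaks()` is hypothetical.

```r
# Toy CPUE values (kg/km^2) for two hypothetical years; not real survey data
cpue_2021 <- c(0, 2, 5, 40, 300, 900)
cpue_2022 <- c(0, 10, 60, 500, 2000, 7000)

# Quantile breaks stand in here for any data-driven method such as Jenks
data_driven_breaks <- function(x, n = 4) {
  unname(quantile(x, probs = seq(0, 1, length.out = n + 1)))
}

breaks_2021 <- data_driven_breaks(cpue_2021)
breaks_2022 <- data_driven_breaks(cpue_2022)

# The break vectors differ, so the same fill class means a different density each year
print(breaks_2021)
print(breaks_2022)
```

Any method that derives breaks from the data alone will shift with each year's catch distribution in this way.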
For all species, my general recommendation has been to manually specify breaks for each species and use the same breaks every year to facilitate comparison among years, although I understand that can be rather time consuming to do for every species. However, I suspect the breaks from the old tech memos may work reasonably well for quite a few species.
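As a rough illustration of using the same manual breaks every year, here is a sketch in base R with `cut()`. The data frame is made up, and this only shows the binning idea; it does not reproduce what `make_idw_stack()` does internally.

```r
# Toy CPUE values (kg/km^2) for two hypothetical survey years; not real data
cpue <- data.frame(
  year = c(2021, 2021, 2021, 2022, 2022, 2022),
  cpue_kgkm2 = c(0.5, 25, 400, 3, 80, 2500)
)

# One set of manual breaks shared by every year; the top break covers the maximum
manual_breaks <- c(0, 1, 10, 100, 1000, max(cpue$cpue_kgkm2))

# include.lowest = TRUE keeps zero-catch hauls in the first bin
cpue$bin <- cut(cpue$cpue_kgkm2, breaks = manual_breaks, include.lowest = TRUE)

# Because the bins are identical across years, the counts are directly comparable
print(table(cpue$year, cpue$bin))
```

Note that this pattern assumes the maximum CPUE exceeds the second-largest break; otherwise the break vector would not be increasing.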
Unfortunately, I don't have the bandwidth right now to find and implement new manual bins for each species for this year's report. Worst case, we can add this to the to-do list for next year's reports. I appreciate that we have the old reports to look back on for these manual bins, but I suspect that those breaks also need review. If you currently have specific recommendations of what bins we should use for a taxon, can you add them here in the `dist_bin` column? I'll incorporate manual bins if they're there and use jenks if they are not. I already added the new jellyfish bins to the spreadsheet.
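The "manual bins if present, otherwise jenks" rule could be sketched as below. This is only an assumption about how it might be wired up: the lookup table, the comma-separated format of `dist_bin`, the taxon names, and the helper `get_breaks()` are all hypothetical and do not reflect the actual spreadsheet or report code.

```r
# Hypothetical lookup table standing in for the spreadsheet; values are illustrative
bins <- data.frame(
  taxon = c("jellyfish", "other taxon"),
  dist_bin = c("0, 1, 10, 100, 1000", NA),
  stringsAsFactors = FALSE
)

# Return numeric manual breaks when dist_bin is filled in, otherwise "jenks"
get_breaks <- function(taxon, lookup) {
  entry <- lookup$dist_bin[lookup$taxon == taxon]
  if (length(entry) != 1 || is.na(entry) || !nzchar(trimws(entry))) {
    return("jenks")
  }
  as.numeric(strsplit(entry, ",")[[1]])
}

print(get_breaks("jellyfish", bins))
print(get_breaks("other taxon", bins))
```

The result could then be passed as the `set.breaks` argument, with the data maximum appended to the manual breaks where needed.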
Thanks for your help and insight on this!
Issue
tldr: The IDW code works when specifying jellyfish CPUE bins at `set.breaks <- c(0, 249, 683, 1365, 2794, 7200)`, but not at `set.breaks <- c(0, 0.1, 2, 7, [max value])`.
In the 2023 NBS Community Highlights document, we publish distribution plots of jellyfish using the `akgfmaps` R package:
One of the issues regarding this plot I wasn't able to address before publication this year (but hope to in future years) was:
You also sent a helpful paper to reference useful breaks. I can't find the paper you referenced, but the breaks were something like `set.breaks <- c(0, 0.01, 0.1, 0.5, 1, 2, 3, 7, [max value])`.
The Ask: It's a good idea. However, the script took too long (aka failed) to complete when I tried to apply these new breaks, and I had to put this improvement on the back burner. Why is implementing this `set.breaks` 'breaking' the script? I also tried using fewer breaks (e.g., `set.breaks <- c(0, 0.1, 2, 7, [max value])`), but that didn't seem to help. Here's an example of what I tried:
Reproducible example
Set knowns and pull data
Create IDW stack and plot
Test 1: (auto-prepared jenks)
This code runs quickly and without issue.
EDIT: By working through this issue's reproducible example, I found a different error in my code (um, yay?), so now the plot, using the auto-prepared `set.breaks`, makes much more sense and provides much better contrast (see figure). Needless to say, I'll be replacing the jellyfish figure that is in the community report. Regardless, I still have the same issue discussed above with implementing the paper-inspired breaks.
Test 2: (bins specified from paper)
I ran this script overnight (several times, after restarting) and I can't get it to run past the point below. I don't know why it is getting hung up. Ideas?