dankelley / oce

R package for oceanographic processing
http://dankelley.github.io/oce/
GNU General Public License v3.0
144 stars 42 forks source link

Section Plot Questions #1331

Closed atcogswell closed 7 years ago

atcogswell commented 7 years ago

Short summary of problem

Would like to remove color ramps, change x axis to Month (J, F, M, etc...) and include all months regardless of time of year.

Details (optional)

The time series x axis is a little hard to manipulate when working with section plot and I can't seem to get the color ramps to disappear. this is one of the plots from the output below but I have 2 more for salinity and sigmaT. I'd like this plot to look more similar to the second set of plots that we put in our annual reports and are generated using Matlab code:

image

station2_2017

What you did

This is not really reproducible, but I don't think it is really necessary for the problems that I am requesting solutions for.

for (j in 1:length(anomnames)){

  par(mfrow=c(1,2))

  plot(mdata2sec, xtype='time', ztype="image", which=j,mar=c(3.5, 3.5, 4, 2.2), 
       ylim=c(148,0),showBottom=F,legend.loc="", xlab="", ylab="Depth (m)", legend="",xaxt="n")
  par(new=TRUE)
  plot(mdata2sec, xtype='time', ztype="contour", which=j, mar=c(3.5, 3.5, 4, 5), 
       showBottom=F, axes=F,legend.loc="", ylab="", xlab="", ylim=c(148,0), legend="", xaxt="n")

  plot(fsec, xtype='time', ztype="image", which=anomnames[j],
       mar=c(3.5, 3.5, 4, 2.2), ylim=c(148,0),showBottom=F,
       legend.loc="", zbreaks=brkseq[[j]], zcol=anomramp, xlab="", ylab="")
  par(new=TRUE)
  plot(fsec, xtype='time', ztype="contour", which=anomnames[j], 
       mar=c(3.5, 3.5, 4, 5), ylim=c(148,0),showBottom=F, 
       axes=F, contourLevels=brkseq[[j]], ylab="", xlab="", legend="")

  text(line2user(line=mean(par('mar')[c(2, 4)]), side=2), 
       line2user(line=2, side=3), paste(para[j], " Time Series and Anomaly Plots for ", sl,sep=""), xpd=NA, cex=2, font=2)

}'

What you expected to happen

What happened

How urgent is this?

Not super urgent. I don't think these are difficult problems, but I can't find a solution on line.

Output from sessionInfo()

R version 3.3.3 (2017-03-06) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1

locale: [1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252
[3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252

attached base packages: [1] tools stats graphics grDevices utils datasets [7] methods base

other attached packages: [1] bindrcpp_0.2 xlsx_0.5.7 xlsxjars_0.6.1
[4] rJava_0.9-9 lubridate_1.6.0 htmltools_0.3.6
[7] ocedata_0.1.3 htmlwidgets_0.9 mapview_2.1.4
[10] leaflet_1.1.0.9000 rgeos_0.3-25 maptools_0.9-2
[13] dismo_1.1-4 raster_2.5-8 rgdal_1.2-13
[16] dplyr_0.7.4 XLConnect_0.2-13 XLConnectJars_0.2-13 [19] sp_1.2-5 oce_0.9-22 gsw_1.0-5
[22] testthat_1.0.2

loaded via a namespace (and not attached): [1] sf_0.5-4 lattice_0.20-34 colorspace_1.3-2 [4] viridisLite_0.2.0 stats4_3.3.3 yaml_2.1.14
[7] base64enc_0.1-3 rlang_0.1.2 R.oo_1.21.0
[10] DBI_0.7 foreign_0.8-67 glue_1.1.1
[13] R.utils_2.5.0 foreach_1.4.3 bindr_0.1
[16] plyr_1.8.4 stringr_1.2.0 munsell_0.4.3
[19] gtable_0.2.0 R.methodsS3_1.7.1 codetools_0.2-15 [22] httpuv_1.3.5 crosstalk_1.0.0 gdalUtils_2.0.1.7 [25] markdown_0.8 Rcpp_0.12.13 udunits2_0.13
[28] xtable_1.8-2 scales_0.5.0 satellite_1.0.1
[31] jsonlite_1.5 webshot_0.4.2 mime_0.5
[34] ggplot2_2.2.1 png_0.1-7 digest_0.6.12
[37] stringi_1.1.5 shiny_1.0.5 grid_3.3.3
[40] magrittr_1.5 lazyeval_0.2.0 tibble_1.3.4
[43] crayon_1.3.4 pkgconfig_2.0.1 assertthat_0.2.0 [46] rstudioapi_0.7 iterators_1.0.8 R6_2.2.2
[49] units_0.4-6

richardsc commented 7 years ago

Hey @atcogswell !

I think this is likely one of those "issues" for which the solution is in your deepening your understanding of how to work with objects beyond the general methods provided by oce. In other words -- a learning experience! :)

Seriously, though, while the general plotting methods for objects are quite powerful, they are not meant to produce publication quality, fully customizable plots that suit everyone. In some ways, I think that the level of control that @dankelley and I have included in many of the methods has given the impression that they should be able to do everything -- if only we were more lazy from the beginning.

I wrote a blog post some time ago that will likely help to get you started:

http://clarkrichards.org/r/oce/section/ctd/2016/04/25/making-section-plots/

The idea is to extract the data from the section object (into matrices in this case), and use the imagep() function directly to get what you want. This is likely much more like how the specific MatlabTM plots were made. Then you can use things like the drawPalette=FALSE argument to suppress the colorbar, and possibly the tformat= arg to control the axis labelling.

A note -- if you want to plot all the months as in the ML plot, the default time axis methods probably won't do it naturally, and you'll have to do that yourself too. I recommend using axes=FALSE in the imagep() call, and then doing the axes yourself, using oce.axis.POSIXct().

richardsc commented 7 years ago

Hmmm ... looking at the various tformat options (see ?strptime), I don't see a format that gives only the first letter of the month. The closest is %b which is the 3 letter abbreviated month name.

If the single letter is important let me have a think about how to do that. @dankelley might have a simple idea.

Anyway, here's a crude example of an imagep() call something like what you might want:

library(oce)
data(adp)
d <- adp[['distance']]
# Fake a year long time vector
t <- seq(adp[['time']][1], adp[['time']][1]+86400*365, length.out=length(adp[['time']]))
mat <- adp[['v']][,,1]
imagep(t, d, mat, axes=FALSE)
axis(2)
oce.axis.POSIXct(1, tformat='%b') # note the `at=` arg if you need to use that for further customization
box()

image

richardsc commented 7 years ago

Ok, here's an idea. Basically I just create a new time vector from the original one that has values at the beginning of each month. Then I make a custom "label" vector by substr()ing the "%b" time format. Then I use the at= and labels= arguments in oce.axis.POSIXct() to set the labels and the positions manually:

library(oce)
data(adp)
d <- adp[['distance']]
# Fake a year long time vector
t <- seq(adp[['time']][1], adp[['time']][1]+86400*365, length.out=length(adp[['time']]))
mat <- adp[['v']][,,1]

tat <- unique(as.POSIXct(paste0(format(t, '%Y-%m-'), '01 00:00:00'), tz='UTC'))
tlabel <- substr(format(tat, '%b'), 1, 1)
imagep(t, d, mat, axes=FALSE)
axis(2)
oce.axis.POSIXct(1, at=tat, labels = tlabel)
box()

image

dankelley commented 7 years ago

Aesthetics

It looks as though @richardsc has beat me to the punch. I do this kind of thing all the time, using at and labels in an axis call. I wouldn't bother with oce.axis.POSIXct, though ... plain old axis will work perfectly well for this. I'd likely offer a simpler solution than @richardsc ... first, I'd construct a matrix, and second I'd make up an axis in months, and third I'd use image or contour or some other choice (maybe even imagep) to plot whatever I wanted, with axes=FALSE, and then I'd draw axes with at set to 1, 2, ... 12 or perhaps that minus 0.5 for centring, and so forth.

There is a real question (in my mind anyway) about how to label the time axis. Does the tick indicate the start of the month or the middle? Does the label go at the tick, or centred between ticks? Do we have uneven ticks (because months are not the same length)? These are all choices that make subtle differences, and are all left in the hands of the user, not through the addition of a lot more options to standardized (e.g. oce) functions. You can also use axis to just draw the ticks, and use text to draw labels (you need to set an option to let text draw outside the box). Or you could use mtext for the labels. You can basically place anything, anywhere. These things are all easy enough (and the coding makes it clear enough) that there's no point adding them to standardized functions. So, neither the oce functions nor the base R functions have enough arguments to handle such things.

As for the plots themselves, choices will need to be made about contour levels and even whether to contour. In noisy signals I prefer to use images, with no "levels" to the colour. In semi-noisy, I would do a smooth image with line contours on top. In a smooth case, I might even use contours alone. I would also hard-code almost everything, e.g. I would decide on the colours and contour levels for the whole set of plots, for uniformity. (Imagine how confusing bathymetric charts would be if every one used a different choice for depth contours to show.) I would also think carefully about visual impairment, on issues relating to colour choice, line types, fonts, etc.

This seems like an application where the extra time will pay off. I do this kind of fine-tuning on any important graph. It's worth spending an hour on something if 60 people can understand the result at a glance, instead of studying for 2 minutes. This argues particularly for uniform scales. If, for example, two temperature charts use different colourscales, the reader is going to have to do a lot of extra work to make sense of things.

Science.

The data are not sampled very frequently in time, and so a lot of the qualitative features are controlled by just single samples. For example, in an y=y(x) plot, a single high y value leads to a peak ... in an xy plot we are used to seeing such things but on an image, many people try to attribute meaning through the whole peak, because things are "filled in" with contour lines or image colour. If I were making a diagram like this, I would likely use image on the data, instead of creating a section. Of course, this would make the result look blocky, which some viewers would dislike. It depends on who this is for.

If it were me, trying to understand the data, I would also make timeseries plots e.g. T=T(t) for each of a series of depths. That can help with some thinking about mixing and convection, and it doesn't trick the eye as much. I'd also do T=T(t) for a depth integral, because heat content is important in understanding seasonal variations (I might even put a panel showing air-sea heat flux above the plot.) I know from research and from teaching that it helps a lot to try different plots. But if this is for a standardized plot for a report, I think you just need another half hour and you'll be all set. Stop thinking about oce and sections and such ... just use the basic R tools. That will force you to get your hands into the data, e.g. then you can do some hard thinking about how to define an anomaly. (Do we compare against means or medians? Do we compare just for this time series, or all time? How big are the averaging bins? Do we employ robust statistical methods or old-fashioned Fisher methods?)

atcogswell commented 7 years ago

Thank you both for your extensive, thoughtful and instructive comments. I think I will close this as it will take me some time (as I am planning a mission right now) to truly digest and implement this. Some really interesting philosophical points to ponder as I prepare these automated plots for the website.

Thank you again for your time,

Andrew