sslarch / caa2020_hackathon

Repository for a CAA2020 session

Tool to display chronological intervals using R #2

Open dirkseidensticker opened 4 years ago

dirkseidensticker commented 4 years ago

None of the geoms that ggplot2 currently offers allows a proper illustration of (dating) intervals. Usually geom_rect() or geom_segment() is used to display data, e.g. chronological phases, with a given start and end date.

library(ggplot2)

phase <- c("A", "B", "X")
from <- c(1200, -400, 250)
to <- c(1800, 200, 600)

chrono <- data.frame(phase, from, to)

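# One horizontal segment per phase, running from `to` back to `from`
ggplot(data = chrono, 
       aes(x = to, 
           xend = from, 
           y = phase, 
           yend = phase, 
           label = phase)) + 
  # Thick translucent bar; ggplot2 >= 3.4.0 prefers `linewidth` over `size` here
  geom_segment(size = 6, alpha = .4) + 
  # Label anchored at x = to and pushed to the right of the bar
  geom_text(hjust = -2) + 
  theme_minimal()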

[Image: Rplot]

During last year's presentation at the EAA (Barcelona; see "RDF based modeling of relative and absolute chronological data") we were unable to properly display intervals using R/ggplot.

Within that same framework I would suggest trying to implement Allen's Interval Algebra (doi:10.1016/B978-1-4832-1447-4.50033-X) within an R package. While it seems interesting, especially for archaeological research, it is hardly known, as there are no tools available to model and display these relationships effectively. And as chronology is at the basis of most archaeological reasoning, I would propose it as a project for the unconference.
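To make the Interval Algebra idea concrete, here is a minimal base R sketch that names the Allen relation between two intervals given as c(from, to). The function name allen_relation() and the relation labels are assumptions for illustration only; they are not taken from any existing package.

```r
# Classify the Allen relation of interval a relative to interval b.
# Each interval is a numeric vector c(from, to) with from < to.
# Hypothetical helper, not part of any existing package.
allen_relation <- function(a, b) {
  stopifnot(a[1] < a[2], b[1] < b[2])
  # No shared time span at all
  if (a[2] < b[1]) return("before")
  if (b[2] < a[1]) return("after")
  # Touching at exactly one end point
  if (a[2] == b[1]) return("meets")
  if (b[2] == a[1]) return("met-by")
  # Shared start and/or end point
  if (a[1] == b[1] && a[2] == b[2]) return("equals")
  if (a[1] == b[1]) return(if (a[2] < b[2]) "starts" else "started-by")
  if (a[2] == b[2]) return(if (a[1] > b[1]) "finishes" else "finished-by")
  # One interval strictly inside the other
  if (a[1] > b[1] && a[2] < b[2]) return("during")
  if (a[1] < b[1] && a[2] > b[2]) return("contains")
  # Remaining case: partial overlap
  if (a[1] < b[1]) "overlaps" else "overlapped-by"
}

# Phases B and X from the example data above
allen_relation(c(-400, 200), c(250, 600))
```

Such a function only covers the classification step; reasoning over networks of relations (the composition table) would need considerably more work.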

nevrome commented 4 years ago

Great, @dirkseidensticker! I think this is a nice project idea for the unconference setting. A geom would be a perfect solution for this purpose in my opinion, and I could very well imagine an R package just to solve this plotting challenge in a good way.

Beyond the plotting I would be very interested to understand the relation of Interval Algebra and the Aoristic method as it is implemented in our aoristAAR package. I guess I have to take a look at the paper.
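For comparison, a rough base R sketch of the aoristic idea as it is commonly described: each dated item spreads a total weight of 1 evenly over its date range, and the weights are summed per time bin. The function aoristic_sum() and its arguments are hypothetical and do not reflect the aoristAAR API.

```r
# Sum aoristic weights per time bin.
# from, to: numeric vectors of interval start and end dates (from < to)
# breaks:   increasing numeric vector of bin boundaries
# Hypothetical helper for illustration; not the aoristAAR interface.
aoristic_sum <- function(from, to, breaks) {
  stopifnot(length(from) == length(to), all(from < to))
  bin_lo <- head(breaks, -1)
  bin_hi <- tail(breaks, -1)
  weights <- sapply(seq_along(from), function(i) {
    # Length of the part of interval i that falls into each bin
    overlap <- pmax(0, pmin(to[i], bin_hi) - pmax(from[i], bin_lo))
    # Share of the interval's total weight (1) assigned to each bin
    overlap / (to[i] - from[i])
  })
  # Rows are bins, columns are intervals
  rowSums(matrix(weights, nrow = length(bin_lo)))
}

# Example data from the first comment, in 200-year bins
from <- c(1200, -400, 250)
to <- c(1800, 200, 600)
aoristic_sum(from, to, breaks = seq(-400, 1800, by = 200))
```

Where Interval Algebra describes qualitative relations between intervals, this kind of sum treats each interval as a uniform probability of "when".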

MartinHinz commented 4 years ago

Very cool, super idea! A graphical representation can certainly be implemented in a feasible time, it offers possibilities for flexible extension, and it addresses a problem that exists almost universally across archaeological applications.

Currently my absolute favorite for a hackathon task!

SCSchmidt commented 4 years ago

It is a great idea! In Wilhelmshaven Chiara talked about ChronochRt, an R package developed to build chronology tables: http://ag-caa.de/workshop2019/programm/ It's still under development, but maybe you can ask her or Thomas Rose whether they want to join you?

nevrome commented 4 years ago

Wow - that sounds super interesting, @SCSchmidt. I just pinged Chiara (does she have an account on github?). Let's see if she wants to join the discussion.

archaeothommy commented 4 years ago

Chiara forwarded me @nevrome's ping. We are developing "just" a plotting tool for chronological charts with some enhancements. From the description and the linked talk of @dirkseidensticker I had the impression this goes in a somewhat different direction than the suggested project for the hackathon?

Please find below the output of our test data set and a screenshot of an input file (manual input directly in R will also be available). Each column is separately calculated and displayed (geom_facet). Within each column two chronological systems (e.g. early and late chronologies) are supported. More would easily be possible technically, but for the sake of comprehensibility we think that in this case one should use separate columns. Of course it can handle subunits, and I hope our algorithm is already capable of calculating them regardless of the number of subunits and the complexity of the chronology. At the moment we are finalising the labeling. In its basic version it just needs the tidyverse. For image labels and data import additional packages are needed (ggimage, readxl). The full presentation of the Wilhelmshaven-CAA should be available soon.

We would be very interested in collaboration. As said it seems that the aim of the suggested project goes in a different direction. I can well imagine that both will complement each other nicely.

[Images: Output_example, Input_example]

nevrome commented 4 years ago

Thank you for the fast reply, @archaeothommy! That's excellent work!

As I already discussed this project idea with @dirkseidensticker in the past, I'm fairly confident that plotting is a major part of it. Your work partly anticipates what we had in mind, which is a wonderful thing!

IMHO the best way to implement a custom plotting geometry in R is by extending ggplot2. Is this code already structured as its own geom_, @archaeothommy? If not: do you think this could be a good approach for version 2.0 of your package?

I'm dreaming of an R package like ggalt, ggExtra or ggthemes specifically for archaeology that adds valuable geoms, scales and themes relevant for our subject. A geom_profile() comes to mind to plot profile drawings. Maybe you have other ideas?
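As a starting point for discussion, here is a minimal sketch of what such an extension could look like with ggplot2's ggproto system. GeomInterval, geom_interval() and the height parameter are hypothetical names made up for this example; the sketch simply reuses GeomRect for drawing and derives the bar height from a single y value.

```r
library(ggplot2)

# Hypothetical geom: one horizontal bar per interval, from xmin to xmax,
# centred on y. Drawing is inherited from GeomRect.
GeomInterval <- ggproto("GeomInterval", GeomRect,
  required_aes = c("xmin", "xmax", "y"),
  # Register `height` so it reaches setup_data() but not draw_panel()
  extra_params = c("na.rm", "height"),
  setup_data = function(data, params) {
    # y is already numeric here, also for a discrete axis
    data$ymin <- data$y - params$height / 2
    data$ymax <- data$y + params$height / 2
    data
  }
)

# Hypothetical layer constructor for the geom above
geom_interval <- function(mapping = NULL, data = NULL, ..., height = 0.6,
                          na.rm = FALSE, show.legend = NA,
                          inherit.aes = TRUE) {
  layer(
    geom = GeomInterval, mapping = mapping, data = data,
    stat = "identity", position = "identity",
    show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(height = height, na.rm = na.rm, ...)
  )
}

# Usage with the example data from the first comment
chrono <- data.frame(phase = c("A", "B", "X"),
                     from = c(1200, -400, 250),
                     to = c(1800, 200, 600))

p <- ggplot(chrono, aes(xmin = from, xmax = to, y = phase)) +
  geom_interval(alpha = .4) +
  theme_minimal()
```

Inheriting from GeomRect keeps the sketch short; a real package would probably want its own draw_panel() to handle labels and uncertain boundaries.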

dirkseidensticker commented 4 years ago

@archaeothommy hey, indeed a really, really great package you have there. Indeed the latter part of my initial comment was going a bit further to my personal moonshot 🌕🚀

But the visualization part, without the need to fiddle around with geom_rect() or geom_segment() as I used to do, is a very vital step. Thus it is cool to see what you are up to. Are you considering ranges for the start or end point? Sometimes, if a clear start or end date cannot be given, a diagonal line is drawn instead of a horizontal one (see https://www.researchgate.net/figure/Relative-chronological-system-for-Hungarian-Early-and-Middle-Bronze-Age-after-David_fig3_307934616).

We thought that outlining/building a ggplot2-extension would be feasible for the hackathon.

And concerning Allen's Interval Algebra: I think it could solve some issues I have with the (largely non-existent) chronological schemas for Central Africa, where connections between pottery styles are sometimes articulated only in relative terms (e.g. "style A must have started while B was still around" and much like that). For the EAA presentation we relied on Correspondence Analysis and the Aligator method developed by @florianthiery. Nonetheless, I guess the way one produces the input (name, from, to), whether manually or with some still-to-be-developed code relying on Allen, depends on the use case as well. An R package that can help here would already be something.

c-girotto commented 4 years ago

We thought that, aside from the programming aspect, a diagonal line implies a lot of things which might not necessarily be intended. To me, a diagonal line is more like "in some parts it starts this early, whereas in others it does not" and not "it starts somewhere between then and then". Furthermore, if regions are plotted, that diagonal line might even be read as indicating North-South gradients etc. rather than dating uncertainty. We thought that a dashed line indicating the region of uncertainty might be a good way to display this "period overlap".

Originally we thought that a 2.0 version could be a shiny app, usable by those who do not have any R programming skills. We thought about how it could be visualised using geoms but did not really get anywhere, so it sure would be nice to have this as another option!

SCSchmidt commented 4 years ago

I agree with your feeling on a diagonal line, @cgirotto. Maybe two dotted lines ("in between here the change happens") might be a way to show the uncertainty? Or did you mean this already, not just one line as I understood?

I'm not sure whether I'll be in Oxford, but I'd love a package that'll create visualisations of chronologies! So thumbs up for this project at the hackathon! ;-)

archaeothommy commented 4 years ago

@SCSchmidt : that's exactly the plan (two dotted lines)
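One possible way to try this out with plain ggplot2, sketched for the horizontal layout used in the first comment: the securely dated core of each phase is drawn as a solid bar, and the uncertain start and end ranges as dashed segments. The column names (from_early, from_late, to_early, to_late) and all values are made up for illustration.

```r
library(ggplot2)

# Made-up example data: each boundary is given as a range, not a single date
chrono_unc <- data.frame(
  phase = c("A", "B"),
  from_early = c(-450, 150), from_late = c(-400, 250),
  to_early = c(150, 550), to_late = c(250, 600)
)

p <- ggplot(chrono_unc, aes(y = phase, yend = phase)) +
  # Uncertain start: the phase begins somewhere within this range
  geom_segment(aes(x = from_early, xend = from_late), linetype = "dashed") +
  # Uncertain end: the phase ends somewhere within this range
  geom_segment(aes(x = to_early, xend = to_late), linetype = "dashed") +
  # Securely dated core between latest start and earliest end
  # (`linewidth` needs ggplot2 >= 3.4.0; older versions use `size`)
  geom_segment(aes(x = from_late, xend = to_early),
               linewidth = 6, alpha = .4) +
  theme_minimal()
```

In a vertical chronology chart the same idea would translate into two horizontal dotted lines bounding each transition.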

"but did not really get anywhere": well, to be honest, at least I have no clue how to make a custom geom and have not yet had the time to get acquainted with it. But I am open to it and interested, also in the idea of an "archaeological graphics" package, @nevrome (and no, no spontaneous ideas about any geoms). As @cgirotto said, it will be a good addition/enhancement. I hope to join the CAA and I would be happy to contribute to a geom/ggplot extension in the hackathon. However, I have to wait for the schedule of the mandatory training events in our ITN before I can make a definite decision.