softwareunderground / 52things

52 Things You Should Know About Geocomputing

Some advice on reproducing figures - review wanted #67

Closed mycarta closed 4 years ago

ffigura commented 4 years ago

Some advice on reproducing figures

This work stems from my experience reproducing Figure 1 (shown below) from Froner et al. (2013) using Python, and from my desire to give some advice on the reproducibility process. Below, I list the steps I follow to reproduce figures from publications.

Figure 1 link: https://github.com/softwareunderground/52things/blob/master/figures/Niccoli_3_Figure1.png

Step 1. Get permission. Before embarking on a project, if you think you will need the original data, ask for permission to use it, to get the lay of the land. If that is not possible, and if you cannot make up similar data, this may not be the right project for you. Conversely, if you can work with made-up or modeled data, ask for permission to show the original figure alongside the reproduced one. I have requested permission to reuse figures many times, both from publishers and from professional societies, and have never received a negative response. For this chapter, I got a response from the European Association of Geoscientists and Engineers (EAGE) in less than 24 hours, granting permission to show the original figure both in the notebook and in the book. The Society of Exploration Geophysicists (SEG) follows the permission guidelines from the International Association of Scientific, Technical, and Medical Publishers (STM), which grant republication rights for up to three figures (and other content) without the need for explicit permission (see https://seg.org/Publications/Policies-and-Permissions/Permissions).

Step 2. Figure out the math. Define, study, and test the equations, functions, and other computations necessary to reproduce the figure. I often start with an online graphing tool like Desmos (https://www.desmos.com/calculator). I study each parameter, first in the specific ranges required to reproduce the figure and then over wider ranges of values, to understand its impact; when necessary, I try to come up with sensible default values. After that, I replicate the whole process in a Jupyter notebook with interactive widgets. For this case, see the notebook Interactive parameter exploration exponential function, at: https://github.com/mycarta/Reproducing-exponential-grayscale-cmap/blob/master/Interactive_parameter_exploration_exponential_function.ipynb
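As a minimal sketch of this kind of parameter study, the snippet below sweeps the curvature of an exponential ramp and prints a few samples for each value. The function exponential_ramp, its parameters k, y_min, and y_max, and the chosen values are all hypothetical; this is not the equation from Froner et al. (2013), only an illustration of how one might probe a parameter's effect before moving to widgets.

```python
import numpy as np


def exponential_ramp(x, k=3.0, y_min=0.0, y_max=1.0):
    """Hypothetical exponential ramp from y_min (at x=0) to y_max (at x=1).

    This is an assumed form for illustration, not the paper's equation.
    k controls the curvature; k = 0 is treated as the straight-line limit.
    """
    x = np.asarray(x, dtype=float)
    if np.isclose(k, 0.0):
        shape = x  # limit of expm1(k*x) / expm1(k) as k -> 0
    else:
        shape = np.expm1(k * x) / np.expm1(k)
    return y_min + (y_max - y_min) * shape


# Sweep the curvature over a wide range to see its impact on the curve.
x = np.linspace(0.0, 1.0, 5)
for k in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"k={k:+.1f}", np.round(exponential_ramp(x, k), 3))
```

Positive k bends the curve below the straight line and negative k bends it above, while the end points stay fixed. The same function could then be wired to sliders in a notebook.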

Step 3. Get the plots right. Work out the specific plotting techniques needed to replicate the figure, and the best library or libraries to get it done. In this work, I used the Python library matplotlib to plot the figure. To get it right, apart from the plotting itself, I recognized I needed, among other things, to:

The first two items I’d already figured out for my Geophysical Tutorial, How to evaluate and compare colormaps, available at: https://github.com/seg/tutorials-2014/blob/master/1408_Evaluate_and_compare_colormaps/How_to_evaluate_and_compare_colormaps.ipynb. For the last item, the function numpy.searchsorted provided the solution. The figure below shows the result.
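For readers unfamiliar with numpy.searchsorted, here is a generic illustration of what the function does. The text does not say exactly how it was used for this figure, so the lightness profile and target values below are hypothetical stand-ins: the sketch simply finds the first sample of a sorted profile that reaches each target.

```python
import numpy as np

# Hypothetical monotonically increasing lightness profile (0 to 100),
# sampled at 256 colormap positions. Not the profile from the paper.
lightness = np.linspace(0.0, 100.0, 256) ** 2 / 100.0

# Hypothetical target lightness values to locate along the profile.
targets = np.array([25.0, 50.0, 75.0])

# searchsorted requires a sorted array; with the default side="left" it
# returns, for each target, the index of the first sample >= target.
idx = np.searchsorted(lightness, targets)

for t, i in zip(targets, idx):
    print(f"L*={t:5.1f} first reached at sample {i} (value {lightness[i]:.2f})")
```

Because the search is a binary search on sorted data, it stays fast even on long profiles, which makes it convenient inside an interactive notebook.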

Figure 2 link: https://github.com/softwareunderground/52things/blob/master/figures/Niccoli_3_Figure2.png

You can work through the entire process in the notebook How to make exponential grayscale, https://github.com/mycarta/Reproducing-exponential-grayscale-cmap/blob/master/How_to_make_exponetial_grayscale.ipynb

Step 4. Generate the results. Apply your code to the data that come with the paper, if you got permission, or to other data, either made up or real. In this case, I tested the colormap on a time slice from the Netherlands F3 open seismic dataset (available at https://terranubis.com/datainfo/Netherlands-Offshore-F3-Block-Complete).
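If no real data are at hand, the workflow can still be tested on made-up data. The sketch below builds a grayscale colormap from an assumed exponential gray ramp and maps a synthetic 2D array through it. The ramp, the name exp_gray_sketch, and the synthetic "time slice" are all hypothetical; this is neither the F3 data nor necessarily the colormap built in the notebook.

```python
import numpy as np
from matplotlib.colors import ListedColormap, Normalize

# Hypothetical gray ramp: 256 levels following an assumed exponential curve.
t = np.linspace(0.0, 1.0, 256)
gray = np.expm1(3.0 * t) / np.expm1(3.0)
cmap = ListedColormap(np.column_stack([gray, gray, gray]), name="exp_gray_sketch")

# Synthetic stand-in for a seismic time slice (NOT the F3 data).
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:96]
time_slice = np.sin(xx / 7.0) * np.cos(yy / 5.0) + 0.1 * rng.standard_normal(xx.shape)

# Symmetric limits keep zero amplitude at the middle of the colormap.
vmax = np.abs(time_slice).max()
rgba = cmap(Normalize(vmin=-vmax, vmax=vmax)(time_slice))

print(rgba.shape)  # one RGBA value per sample of the slice
```

Passing cmap=cmap with the same vmin and vmax to matplotlib's imshow would display the slice directly; the explicit mapping here just makes the result easy to inspect.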

Step 5. Do some extra work. Go further: improve the plots or add interactivity so that others can experiment too. For example, to improve on the original figure I made the color of the Lightness plot change in parallel with the colormap, rather than being just one color. I also made a separate interactive notebook, where the color bar of the plot, the Lightness profile plot, and the seismic time slice plot are updated in real time. This lets users experiment with the parameters and make their own colormap.
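One way to make a Lightness profile change color along with the colormap is to draw it as a scatter plot whose points are colored by their colormap index. The sketch below is a minimal version under stated assumptions: the lightness values are a hypothetical exponential profile, and matplotlib's built-in "gray" colormap stands in for the custom one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical lightness profile over 256 colormap entries (assumed shape).
index = np.arange(256)
lightness = 100.0 * np.expm1(3.0 * index / 255.0) / np.expm1(3.0)

fig, ax = plt.subplots(figsize=(6, 3))
# Each point takes the color of the colormap entry it describes.
# "gray" is a stand-in; a custom colormap object could be passed instead.
points = ax.scatter(index, lightness, c=index, cmap="gray", s=12)
ax.set_xlabel("Colormap index")
ax.set_ylabel("Lightness L*")
fig.colorbar(points, ax=ax, label="Colormap index")

# Render the figure; call fig.savefig(...) to write it to disk.
fig.canvas.draw()
```

A scatter plot is used because a single line drawn with plot takes only one color; coloring point by point is a simple way around that.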

Step 6. Pay it forward. Share your results. More than that, do it with a permissive license, ideally a Creative Commons Attribution (CC-BY). In my case, I am clearly identifying the original figure as copyrighted material, reused with permission, and my figure and the notebooks as CC-BY. The CC-BY license allows one to share and adapt, so long as appropriate credit is given (see at: https://creativecommons.org/licenses/by/3.0/us/).

In the end, have fun with it. For me, it was a wonderful and empowering experience. To quote from Matt Hall’s slides (https://agilescientific.com/blog/2011/3/25/geo-floss.html) “...now I can read an article in Geophysics, or The Leading Edge, and - assuming the authors reveal their methods openly - I can try them out immediately on my own data. I can improvise, tweak and improve. This is powerful: Now I can test ideas on the fly… I am free!”

Reference Froner, B., Purves, S., Lowell, J., & Henderson, J. (2013). Perception of visual information: The role of colour in seismic interpretation. First Break, 31(4), 29-34. doi: 10.3997/1365-2397.2013010.

ffigura commented 4 years ago

Hi @mycarta , I reviewed the text and the notebook. They were good, I just tried to clarify some points, I hope it helps. I tried a commit in the notebook but I'm not sure if I did it right. You can catch me on slack if something is not correct.

mycarta commented 4 years ago

@ffigura thanks a million. I will review both this evening!

mycarta commented 4 years ago

@ffigura after reading your reviews above, I implemented several of your recommendations, further refined a couple of points (combining your suggestions with my further thoughts), and only dropped a couple of things. Thanks a lot, your effort is really appreciated. I will now close this issue and tackle your suggestions for the notebook as that's separate from the book.

ffigura commented 4 years ago

@mycarta this is great! I'm glad that I could help. I read the final manuscript and it is good. I learned a couple of things about licensing and sliding plot, thank you.