pangeo-data / pangeo

Pangeo website + discussion of general issues related to the project.
http://pangeo.io

Earth System Model of Intermediate Complexity for Interactive Display #696

Closed · pbranson closed this issue 5 years ago

pbranson commented 5 years ago

Hello Pangeo,

First up, sorry for the somewhat off-topic post, but I thought this community would have some valuable suggestions.

To assist with educational outreach on the implications of climate change for the Earth, do you think it is practical to have a lower-complexity earth system model that can run in 'interactive' time, in the sense that users can alter the model schematisation and watch the ocean/atmosphere circulation change in response?

I have seen such interactive displays used for coastal engineering education, where you can 'draw' coastal structures and watch how the circulation and wave propagation changes.

Technically this seems feasible to run in the cloud with data streamed to some carefully designed holoviews/geoviews displays.
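A minimal sketch of how the streaming side could work with holoviews' Pipe stream; the 'model step' here is just a stand-in that perturbs a 2D field so something visibly evolves:

```python
import numpy as np
import holoviews as hv
from holoviews.streams import Pipe

hv.extension('bokeh')

def fake_model_step(field):
    # Stand-in for one step of a real circulation model.
    return 0.99 * field + 0.01 * np.random.randn(*field.shape)

field = np.random.randn(90, 180)  # coarse lat/lon grid

# A Pipe pushes new arrays into a DynamicMap as they arrive, so the
# display updates without rebuilding the plot.
pipe = Pipe(data=field)
dmap = hv.DynamicMap(hv.Image, streams=[pipe])

# In a live notebook or Bokeh server session, each send() redraws dmap.
for _ in range(100):
    field = fake_model_step(field)
    pipe.send(field)
```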

I'm not a climate scientist but some of the processes that might be interesting and informative might be:

  1. The effect of loss of polar albedo on oceanic and atmospheric circulation. People could 'paint' and erase glaciations.
  2. The effect of land use changes - i.e painting on rainforest and coastal marshes
  3. The effect of emissions scenarios on global agricultural production
  4. The effects of glacial ice loss on sea levels

Other suggestions?

From what I remember about human-computer interaction, a 'response' time of 200 ms is generally acceptable, which implies a display framerate of around 5 Hz. I don't have a good feel for how long average users would be willing to wait to see an outcome, but maybe 2 minutes is reasonable to simulate ~200 years. So the model would need to be able to simulate one year per second.
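A quick back-of-envelope check of those numbers:

```python
# Back-of-envelope check of the interactivity budget described above.
response_time_s = 0.2                  # ~200 ms acceptable UI latency
display_rate_hz = 1 / response_time_s  # -> 5 Hz display refresh

session_s = 2 * 60   # ~2 minutes of user attention
model_years = 200    # span of climate to show in that time
print(display_rate_hz)          # 5.0
print(model_years / session_s)  # ~1.7, i.e. of order 1 model year per wall second
```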

Do you think there is a model complexity that can resolve the global dynamics on a small cluster/GPUs at that speed?

Are there any pre-existing open-source codes that might be worth looking into or groups attempting to do this?

Is this a stupid idea?

Sincerely and thanks, Paul

RPrudden commented 5 years ago

The closest project I know of is climt, although I don't think it's as fast as you are describing.

As far as I know, climt isn't currently able to use GPU acceleration or scale across a distributed cluster. But it does scale across multiple cores, so that would be one way to increase the speed.

@JoyMonteiro or @mcgibbon will probably have some more insight!

JoyMonteiro commented 5 years ago

Hello Paul,

I know that there are models (Max Planck's coarse-resolution GCM, for example) capable of the kind of simulation times you are describing, but in the end the issue is one of resolution. We have always used moderate/high vertical resolution in climt simulations, which makes them slow.

Thanks to its Python interface, climt can make life easy in some aspects for you (interactivity, installation on the cloud, tight coupling with scientific python libraries).

However, as @RPrudden says, we have never attempted to get it to run at the speeds you require. I must note that this is mainly due to a lack of trying: all our use cases have prioritized fidelity (which requires moderate/high resolution).

The second issue is that climt is aimed at atmosphere-only simulations (again because of our own interests). There is an ocean model in pure Python (Veros), which is also GPU-accelerated. We have plans to write an interface between climt and Veros, but there is nothing in place yet.

climt should be able to handle scenarios 2 and 3, and the atmospheric part of scenario 1, from your list above.

To answer your questions, this is very doable but will require some playing around and possibly some development work. I'm personally quite interested to see climt being used in educational projects like the one you are describing.

If you are interested in fiddling around with the model resolution to see whether you can achieve anything like the speeds you need, I would be happy to provide input and help with optimising the model.

Feel free to raise an issue at the climt repository if you are interested!

dionhaefner commented 5 years ago

I am the author of Veros, the ocean model @JoyMonteiro mentioned.

I think this is a beautiful idea that many people would love to see realized. However, this

So the model would need to be able to simulate one year per second.

is, to my limited knowledge, impossible in this day and age.

The closest we can get to this with Veros (or its faster Fortran parent project, PyOM) is a global 4x4 degree setup distributed via MPI. Even in this low resolution setup, we only get about 1 model year per wall minute. Say you somehow get this down to 10s through some aggressive optimizations - that would still be an order of magnitude off, and that's just the ocean part!

The inherent problem is that such coarse problems do not profit too much from parallelization either. GPUs or a cluster won't help you if your array dimensions are just 100x50x20 or so. And you can't simplify the model too much anyway, because then the response will be nothing like the one we would see in the real world.

The only way I see this happening is by building a specialized model for each of the subtasks. E.g., use the knowledge we have about how the climate system reacts to a local change in albedo to build a model that simulates just that (and ensure the simulation stays within reasonable bounds).

Something like this could consist of a neural network that emulates the response of a full climate model, using lots of model output as training data to create something that feels like it has the same characteristics. (This is similar to what game engines do when simulating fluid dynamics in real time: build an approximation that looks close to the real thing.) This is already being done for sub-grid parameterizations.
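To make the emulator idea concrete, here is a minimal sketch with scikit-learn and purely synthetic stand-in data; in practice the training pairs would come from archived runs of a full climate model, and the 'truth' function below is arbitrary:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_lat = 32
lats = np.linspace(-90, 90, n_lat)

def fake_gcm_response(albedo_anomaly):
    # Arbitrary smooth stand-in: cooling under the anomaly, spread poleward.
    kernel = np.exp(-((lats[:, None] - lats[None, :]) / 30.0) ** 2)
    return -5.0 * kernel @ albedo_anomaly

X = rng.uniform(0, 0.2, size=(500, n_lat))       # albedo perturbations
Y = np.stack([fake_gcm_response(x) for x in X])  # "climate model output"

emulator = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
emulator.fit(X, Y)

# Once trained, evaluating the emulator is near-instant, fast enough
# for an interactive display, unlike the model it mimics.
test = np.zeros(n_lat)
test[-6:] = 0.15  # 'erase' some polar ice
print(emulator.predict(test[None, :])[0])
```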

rsignell-usgs commented 5 years ago

There is also this new climate modeling effort in Julia:

http://paocweb.mit.edu/about/paoc-spotlights/new-climate-model-to-be-built-from-the-ground-up
https://github.com/climate-machine/Oceananigans.jl
https://www.youtube.com/watch?v=kpUrxnKKMjI

JoyMonteiro commented 5 years ago

I asked around, and the Max Planck coarse-resolution model does 200 years per day. So I agree with @dionhaefner that you will need to "parameterize away" some part of the climate system to get the kind of speeds you are looking for.

@rsignell-usgs is this project the same as the Vulcan Inc. modelling project?

rsignell-usgs commented 5 years ago

@JoyMonteiro I don't know. Vulcan is not mentioned in the news release:

Each of the partner institutions brings a different strength and research expertise to the project. At Caltech, Schneider and Stuart will focus on creating the data-assimilation and machine-learning algorithms, as well as models for clouds, turbulence, and other atmospheric features. At MIT, Ferrari and John Marshall, also a Cecil and Ida Green Professor of Oceanography, will lead a team that will model the ocean, including its large-scale circulation and turbulent mixing. At NPS, Giraldo, will lead the development of the computational core of the new atmosphere model in collaboration with Jeremy Kozdon and Lucas Wilcox. At JPL, a group of scientists will collaborate with the team at Caltech's campus to develop process models for the atmosphere, biosphere, and cryosphere.

Funding for this project is provided by the generosity of Eric and Wendy Schmidt (by recommendation of the Schmidt Futures program), Mission Control for Earth, Paul G. Allen Philanthropies, Caltech trustee Charles Trimble, and the National Science Foundation.

@rabernat likely knows more, but I imagine he's pretty busy with the Pangeo Community Meeting for the next few days.

darothen commented 5 years ago

@JoyMonteiro and @rsignell-usgs the MIT work you mention is part of CliMA (the Climate Modeling Alliance), spearheaded by Tapio Schneider at Caltech; I'm sure @nbren12 could share more information on Vulcan!

nbren12 commented 5 years ago

Couple of things. PlaSim is one nice model that we used in grad-school classes on climate change. It's in Fortran, but easy to compile and runs very fast. Generating 200 years in 2 minutes is a tough requirement though. Maybe a single-column model would be better for this.

The Vulcan project is separate from CliMA: CliMA is a partnership between several universities funded by several donors, whereas we are an in-house effort.

pbranson commented 5 years ago

Thanks, everyone, for your responses. It was 'National Science Week' here in Australia last week, so I was thinking about something that could be used for general public outreach and education at a stall at a fair.

The inherent problem is that such coarse problems do not profit too much from parallelization either. GPUs or a cluster won't help you if your array dimensions are just 100x50x20 or so.

It seems that a low-resolution 3D coupled model is out of the question from an interactivity standpoint, unless emulator models can be established for some carefully selected partitioning of the system, which would require effort and skill beyond what I have.

Maybe a single-column model would be better for this.

Perhaps this is a better route in any case, as it demonstrates the hierarchical way in which model complexity is built up.

So perhaps I should rephrase the question as: "What set of simplified models would be most effective, in an interactive display, for communicating how climate models are constructed and predictions established?"

Possible ideas:

  1. An interactive component might be a physical model of the 1D heat equation (literally a metal rod heated at one end by a computer-controlled thermostat). People could take observations along its length with some thermocouples and compare them to numerically simulated results. An ensemble of numerical simulations could be run with differing initial conditions and parameterisations for radiative and convective heat loss, to demonstrate how uncertainty is accounted for, by analogy with climate model ensembles (a minimal numerical sketch of this follows at the end of this comment).
  2. A 0D 'box' model of the planet to demonstrate the effects of insolation, the greenhouse effect, and changes in albedo on average temperature.
  3. A 1D atmospheric column model to demonstrate the influence of cloud processes as shown in this paper: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2018MS001578
  4. Some precomputed 3D models with interactive displays for analysis of results

Ideally with a setup as simple as clicking a 'Launch Binder' button, as this group has achieved (aside from (1)!).
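Here is a minimal numerical sketch of the simulated half of idea (1): an explicit finite-difference rod heated at one end, with a crude Newtonian heat-loss term standing in for the radiative/convective losses. All parameter values are illustrative:

```python
import numpy as np

L, n = 0.5, 50             # rod length (m), grid points
alpha = 1e-4               # thermal diffusivity (m^2/s), roughly steel
h = 0.01                   # linear heat-loss coefficient (1/s)
T_env, T_hot = 20.0, 80.0  # ambient and thermostat temperatures (deg C)

dx = L / (n - 1)
dt = 0.4 * dx**2 / alpha   # within the explicit stability limit
T = np.full(n, T_env)

for step in range(20000):
    T[0] = T_hot                                  # heated end (thermostat)
    lap = (T[2:] - 2 * T[1:-1] + T[:-2]) / dx**2  # second spatial derivative
    T[1:-1] += dt * (alpha * lap - h * (T[1:-1] - T_env))
    T[-1] = T[-2]                                 # insulated far end

print(T[::10])  # profile to compare against thermocouple readings
```

Varying alpha, h, or the initial condition across an ensemble of such runs would give the uncertainty demonstration.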

darothen commented 5 years ago

@pbranson the emulation idea is very, very interesting. In fact there may be a "middle ground" that hasn't been mentioned here yet: running a coarse simulation and then downscaling the resulting output. But as others have mentioned, even a coarse simulation could be too costly to run online as you describe. Other creative emulation techniques, such as using GANs to emulate synoptic-scale variability in the context of a climate simulation, are interesting but likely very complex to get off the ground.

Among the other ideas you mention, the simpler models would make a great exploratory tool. The companion website for David Archer's "Global Warming: Understanding the Forecast" has many examples of such interactive tools. Starting with the simplest: a 0D model with a simplified n-level atmosphere, which can take into account things like the distance to the sun (possibly time-varying, to account for orbital variation) and atmospheric opacity (to simulate the addition of GHGs to the atmosphere), would be easy to whip together into an interactive tool using Bokeh, and it makes a nice hands-on model for understanding what sets the Earth's thermostat.
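As a sketch of what such a tool would compute under the hood, here is the textbook n-layer greenhouse version of that 0D model; the parameters (n_layers, albedo, distance_au) are exactly the knobs you would wire up to Bokeh sliders:

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant (W m^-2 K^-4)
S0 = 1361.0      # solar constant at 1 AU (W m^-2)

def surface_temperature(n_layers=1, albedo=0.3, distance_au=1.0):
    """0D energy balance with n perfectly absorbing greenhouse layers.
    The classic equilibrium result is T_surface = T_emission * (n+1)**0.25.
    """
    S = S0 / distance_au**2  # insolation falls off as 1/d^2
    T_e = (S * (1 - albedo) / (4 * SIGMA)) ** 0.25
    return T_e * (n_layers + 1) ** 0.25

print(surface_temperature(n_layers=0))  # ~255 K: airless Earth
print(surface_temperature(n_layers=1))  # ~303 K: one absorbing layer
```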

When we ran a MOOC on climate change at MIT, we included an interactive 1D column model similar to SCAM6 in your linked paper, but it was a much simpler tool developed and used by Kerry Emanuel's group for research. It's in Fortran, but for the MOOC we built an online interface that allowed students to run it and get results. It's quite fast: very long integrations take seconds. If I recall correctly, some of his recent students (maybe @tbeucler, http://tbeucler.scripts.mit.edu/tbeucler/) used this tool in their research, so it's possible an updated version exists. It would make a fantastic piece of code to port to Numba-optimized Python or Julia, and it would then fit very nicely into your idea of a Binder-like tool for people to play with.

JoyMonteiro commented 5 years ago

@pbranson Those ideas are quite interesting! I find the first one especially fun :)

I developed a series of models for a class I taught in the spring, beginning with a 0D model, moving to radiative equilibrium, then radiative-convective equilibrium, and finally a simple 3D GCM. You can take a look at the material and see if you can use some of it for your own purposes: https://github.com/JoyMonteiro/model_tour_climate/tree/master/notebooks

To demonstrate the greenhouse effect, I found that it was really useful to show students

  1. what the radiating level is, using an actual radiative model: add CO2, water vapour, or clouds and see how the radiating level changes;
  2. that the vertical temperature profile can be set in a variety of ways (radiation, convection, dynamics);
  3. that the greenhouse effect at the surface is then determined by the radiating level and the lapse rate (a back-of-envelope version of this is sketched below).
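That last point can be sketched in a few lines, with the radiating level prescribed rather than computed (a real radiative model, as in the notebooks above, computes it):

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant (W m^-2 K^-4)
S0, albedo = 1361.0, 0.3

# The emission temperature is fixed by planetary energy balance...
T_e = (S0 * (1 - albedo) / (4 * SIGMA)) ** 0.25  # ~255 K

# ...so the surface temperature follows from the radiating level's
# height and the lapse rate (point 3 above):
lapse_rate = 6.5             # K per km, typical tropospheric value
for z_e in [5.0, 5.5, 6.0]:  # adding CO2 raises the radiating level (km)
    print(z_e, T_e + lapse_rate * z_e)  # surface warms as z_e rises
```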

I must also mention https://github.com/atmtools/konrad developed by @lkluft and @SallyDa and its companion website https://konrad-climate-model.herokuapp.com/experiments which I find really fun!

mcgibbon commented 5 years ago

@JoyMonteiro FYI I will be working at Vulcan, and am very keen to bring ideas from Sympl into that model.

@pbranson you may want to look at what the game Universe Sandbox 2 did to include planetary climate. To get things running at interactive speeds, they needed massive simplifications, like using a zonally symmetric model and then adding artificial random zonal asymmetry after the fact. For what you're suggesting, it would be much more effective to develop a simplified diagnostic equilibrium model or energy balance model rather than use a prognostic model; a sketch of what I mean is below.
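To illustrate 'diagnostic' here: a Budyko-style zonal energy balance model can be solved for its equilibrium directly, with no time stepping at all, so it responds instantly to a changed albedo map. Textbook parameter values, purely illustrative:

```python
import numpy as np

Q = 340.0           # global-mean insolation, S0/4 (W m^-2)
A, B = 203.3, 2.09  # outgoing longwave: OLR = A + B*T (T in deg C)
C = 3.81            # strength of meridional heat transport (W m^-2 K^-1)

x = np.linspace(-1, 1, 181)         # x = sin(latitude), equal-area grid
s = 1 - 0.482 * (3 * x**2 - 1) / 2  # annual-mean insolation distribution
albedo = np.where(np.abs(x) > np.sin(np.deg2rad(70)), 0.6, 0.3)  # polar ice

# Diagnostic solution: global mean first, then each latitude band.
T_mean = (Q * np.mean(s * (1 - albedo)) - A) / B
T = (Q * s * (1 - albedo) - A + C * T_mean) / (B + C)

print(T_mean)                 # ~15 C global mean
print(T[len(x) // 2], T[-1])  # equator vs. pole
```

Repainting the ice (changing the albedo array) and re-solving is instantaneous, which is the property you need for interactivity.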

mcgibbon commented 5 years ago

You also have to think carefully about what it is you want to visualize and how to visualize it. For Universe Sandbox, it was enough to focus mainly on planets freezing over or burning all their water off. If you want to show something like small changes in rainfall or a couple of degrees of warming, that can be harder to present in a visually appealing way. Everyone understands ice expanding or melting and oceans expanding or contracting, but colormaps are harder for a non-scientific audience to appreciate.

RPrudden commented 5 years ago

I'm enjoying the discussion here. As @dionhaefner, @pbranson, and @darothen have already mentioned, in the longer term it would be very interesting to experiment with combinations of component emulation, downscaling, and perhaps also coarse-graining, to reach high speeds while retaining some degree of realism.

pbranson commented 5 years ago

I have also enjoyed the discussion. Thanks everyone for all the excellent links and suggestions.

I will think a little more about it and start a markdown document to synthesise the information provided in this thread. The goal, I think, is to set up a Binder-ready repo, which would be great for making this accessible.

I will give some more thought to the key messages and example models. Certainly, as @mcgibbon and @JoyMonteiro suggest, from an interactive perspective diagnostic models are currently the only option.

JoyMonteiro commented 5 years ago

I would be interested to read any such document!

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 5 years ago

This issue has been automatically closed because it had not seen recent activity. The issue can always be reopened at a later date.