CLIMADA-project / climada_python

Python (3.8+) version of CLIMADA
GNU General Public License v3.0

Model Doc/ Validation of results #526

Open alexsynadino opened 2 years ago

alexsynadino commented 2 years ago

Hello,

I have been experimenting with CLIMADA for some time in order to estimate expected losses to properties in Belgium due to floods. The idea is to compare the values output by CLIMADA with the results obtained using the methodology currently proposed by the European Central Bank in its climate stress testing framework for banks (my code is included below).

This has left me with some questions that I have not been able to answer from the CLIMADA documentation or other research. They are listed below:

  1. Where can we find the most complete documentation on each model? I noticed that interesting results came from the Matsiro model, but I don't know which features make it more or less suitable for my analysis.

  2. Do I understand the modelling process correctly when I say that atmospheric data (climate drivers) contain certain variables that are the input to a given ISIMIP model, which then outputs (for a given RCP scenario and time period) a hazard forecast?

  3. I am working with hazard data on floods in Belgium queried with the client API (see code snippet below). From my understanding, a hazard class combines ISIMIP models with climate drivers (atmospheric data), which allows me to simulate a flood depth in meters for each year under analysis and each model. Where could I access the actual simulated flood data? For example, how many meters occurred at a given centroid when using the Matsiro model run with the hadgem2-es climate driver for the year 2057?

  4. I believe that the resulting yearly damage is the sum of the damages across the different models multiplied by the frequency, correct?

  5. The frequency appears to be the result of the following calculation: 1 / (Nb years * Nb ISIMIP models * Nb climate drivers), correct? In my case: 1 / (20 years * 6 impact simulators * 4 climate drivers) = 1/480 = 0.0020833. My hazard query below outputs that exact value for the frequency. The value is fixed, which means that each year, each model and each climate driver gets equal weighting.

  6. What prompted that equal weighting approach with the frequency?

  7. With the client API I am limited to 'res_arcsec': ['150'] (which, if I am not wrong, amounts to 20 km²). Does CLIMADA offer any way to go to a more granular level? Bear in mind that at this stage I have no data other than what is found in your database, which I query via the API.
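The cell size behind question 7 can be checked with a short calculation. The sketch below is only a rough estimate: it assumes a spherical Earth with about 111.32 km per degree of latitude, and the latitude of 50.5°N used for Belgium is an assumption chosen for illustration, not a value from this thread.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude (spherical Earth)


def cell_size_km(res_arcsec, lat_deg):
    """Return the (north-south, east-west) extent in km of one grid cell."""
    res_deg = res_arcsec / 3600.0  # arc seconds to degrees
    north_south = res_deg * KM_PER_DEGREE
    # Meridians converge towards the poles, so the east-west extent shrinks with latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


# Assumed representative latitude for Belgium (illustrative only).
ns_km, ew_km = cell_size_km(150, lat_deg=50.5)
area_belgium_km2 = ns_km * ew_km

# Same resolution at the equator, for comparison.
ns_eq, ew_eq = cell_size_km(150, lat_deg=0.0)
area_equator_km2 = ns_eq * ew_eq

print(f"150 arcsec cell at 50.5N: {ns_km:.2f} km x {ew_km:.2f} km = {area_belgium_km2:.1f} km2")
print(f"150 arcsec cell at the equator: {area_equator_km2:.1f} km2")
```

Under these assumptions a 150 arcsec cell is roughly 4.6 km by 3 km at Belgian latitudes, so the 20 km² figure is closer to the size of a cell at the equator.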

I know this is a long list of questions. If it is more suitable for you I would be more than happy to discuss this over a call. I understand that you do not wish to act as a "consulting" entity in any way, but that is not what I am seeking. The design of my analysis is my own, yet I feel I am missing a little context on what is happening within the climada engine to truly be able to validate my approach.

Kind Regards, Alexandre

CODE:

from climada.util.api_client import Client
import pandas as pd

client = Client()

# INFO
# assumed definition of data_types, which the snippet uses without defining it
data_types = client.list_data_type_infos()
dtf = pd.DataFrame(data_types)
dtf.sort_values(['data_type_group', 'data_type'])
tc_dataset_infos = client.list_dataset_infos(data_type='river_flood')
client.get_property_values(tc_dataset_infos, known_property_values={'country_name': 'Belgium'})

# Import Hazard
from climada.hazard import Hazard
FL_Belgium = client.get_hazard('river_flood', properties={'res_arcsec': ['150'], 'country_name': 'Belgium', 'climate_scenario': 'rcp85', 'year_range': ['2050_2070']})
FL_Belgium.plot_intensity(0)
FL_Belgium.centroids.plot()
print(FL_Belgium.size)
print(FL_Belgium.event_name)
print(FL_Belgium.frequency)

# house values obtained from notary database
from climada.entity import Exposures
notaris = pd.read_excel("climada_flood_article.xlsx", usecols='A:H')
notaris_full = notaris
notaris_full['latitude'] = notaris_full['lat']
notaris_full['longitude'] = notaris_full['long']
notaris_full['value'] = notaris_full['price']
flood_exp = Exposures(notaris_full)
flood_exp.set_geometry_points()
flood_exp.gdf.head()

# import impact function set for RiverFlood using JRC damage functions () for 6 regions
import numpy as np
from climada.entity import ImpactFunc, ImpactFuncSet
impf = ImpactFunc()
impf.id = 1
impf.haz_type = 'RF'
impf.intensity_unit = 'm'
impf.name = "Flood Europe JRC Residential noPAA"
impf.continent = 'Europe'
impf.intensity = np.array([0.00, 0.5, 1., 1.5, 2., 3., 4., 5., 6., 12.])
impf.mdd = np.array([0.00, 0.25, 0.40, 0.50, 0.60, 0.75, 0.85, 0.95, 1.00, 1.00])
impf.mdr_concave = np.array([0.000, 0.010, 0.020, 0.025, 0.030, 0.100, 0.150, 0.200, 0.250, 0.400, 0.700, 1.000])
impf.mdr = np.array([0.000, 0.250, 0.400, 0.500, 0.600, 0.750, 0.850, 0.950, 1.000, 1.000])
impf.paa = np.ones(impf.intensity.size)
imp_fun_set = ImpactFuncSet()
imp_fun_set.append(impf)
hazard_type = 'RF'
haz_id = 1
impf.tag = "RF"

# Exposures: rename column and assign id - OTHER PTF
flood_exp.gdf.rename(columns={"FL_Belgium": "impf_" + hazard_type}, inplace=True)
flood_exp.gdf['impf_' + hazard_type] = haz_id
flood_exp.gdf['impf_'] = haz_id

# SET impf
impf_FL_1 = imp_fun_set.get_func('RF', 1)

# Impact on Belgian PTF
from climada.engine import Impact
imp = Impact()
imp.calc(flood_exp, imp_fun_set, FL_Belgium, save_mat=True)
imp.plot_scatter_eai_exposure(ignore_zero=False, buffer=0.8)

chahank commented 2 years ago

Dear Alexander, thank you for your questions. Here are a few quick answers:

First, it is important to understand that CLIMADA is a framework for climate risk assessment and adaptation option appraisal. The basic building blocks are the exposure, hazard, impact function (vulnerability) and measure. CLIMADA is not itself a model for these building blocks. However, a set of models that can be used as starting points is provided, and all are documented here. Each model generally also comes with a scientific publication.

To your specific questions:

  1. See the documentation here and here, and the publications related to CLIMADA here.
  2. As said above, the specific model used for the hazard is not prescribed by CLIMADA and can be defined by the user. For further information please see the documentation. But yes, for certain hazard models (e.g. the flood model from ISIMIP) it is as you described.
  3. Please see the ISIMIP documentation for the details about the flood model.
  4. For details on the CLIMADA computations please carefully read the paper here. I do not really understand your question, so I will leave it at that for the moment.
  5. The frequency is a property of each individual hazard model. In the case of this flood model, the annual frequency is defined as you said.
  6. Please see this article.
  7. You can resample your data to any resolution you like. You can also use any other flood model as input. The data on the API is currently provided only at 150 arcsec.
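As a minimal illustration of the resampling mentioned in point 7, the sketch below refines a gridded intensity field by nearest-neighbour repetition. The function name `upsample_nearest` and the toy grid are hypothetical and this is not a CLIMADA routine; it only shows that resampling to a finer grid repeats the coarse values and does not add information beyond the original 150 arcsec data.

```python
def upsample_nearest(grid, factor):
    """Refine a 2D grid (list of rows) by repeating each cell factor x factor times."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    fine = []
    for row in grid:
        # Repeat every value 'factor' times along the row ...
        fine_row = [value for value in row for _ in range(factor)]
        # ... and repeat the widened row 'factor' times along the column.
        for _ in range(factor):
            fine.append(list(fine_row))
    return fine


# Hypothetical flood depths in meters on a 2 x 2 coarse grid.
coarse_depth_m = [[0.0, 0.5],
                  [1.2, 0.0]]

# Refining by a factor of 2 gives a 4 x 4 grid at half the cell size.
fine_depth_m = upsample_nearest(coarse_depth_m, factor=2)

for row in fine_depth_m:
    print(row)
```

Interpolating schemes (bilinear, for example) would smooth the field instead of repeating it, but they equally rely only on the coarse values.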

I am sorry that most answers are only referrals to other sources. However, I feel these sources need to be consulted before further discussion, as they describe the flood model of interest in full detail.

alexsynadino commented 2 years ago

Dear Mr Kropf,

Apologies for the late reply; I was on vacation. Thank you very much for the clarifications and links, they are going to be useful for the next steps of my analysis.

Regarding my question 4, I was referring to the method used to calculate the EAI, but the article you provided, "CLIMADA v1: a global weather and climate risk assessment platform", explains the formulas used in the computation!

thank you again, Kind Regards, Alexandre

alexsynadino commented 1 year ago

Hello,

If I may, I would like to reopen this thread to obtain some final pieces of information. Firstly, I should state that my team and I have further developed our usage and understanding of CLIMADA, and are working on practical implementations in the credit risk departments of banks, so thank you for this wonderful tool.

Secondly, I return to a remark I made above about the frequency variable contained in the hazard dataset:

FL_Belgium = client.get_hazard('river_flood', properties={'res_arcsec': ['150'], 'country_name': 'Belgium', 'climate_scenario': 'rcp85', 'year_range': ['2050_2070']})

tag: fldfrc_150arcsec_watergap2_miroc5_flopros_rcp85

I have read the general CLIMADA methodology, where the engine functionalities are described, as well as other documents. I still have a question on how to interpret the frequency in my case.

Computing frequency: From the underlying data I have worked out that the frequency provided in the hazard data is a function of the number of impact models, the number of atmospheric datasets (climate drivers) and the number of years under consideration. So:

1 / (Nb years * Nb models * Nb climate drivers) --> 1 / (20 years * 6 impact simulators * 4 climate drivers) = 1/480 = 0.0020833 frequency.
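The arithmetic above can be reproduced in a few lines. This is only a sketch of the calculation as described in this post: the counts (20 years, 6 impact models, 4 climate drivers) are the ones quoted here, and the variable names are illustrative.

```python
# Counts as quoted in the post for the rcp85, 2050_2070 river flood query.
n_years = 20
n_impact_models = 6
n_climate_drivers = 4

# One event per year, impact model and climate driver.
n_events = n_years * n_impact_models * n_climate_drivers

# Equal weighting: every event gets the same annual frequency.
frequency = 1.0 / n_events

# Summed over all events, the frequencies add up to one event per year.
total_frequency = n_events * frequency

print(f"number of events: {n_events}")
print(f"frequency per event: {frequency:.7f}")
print(f"sum of frequencies: {total_frequency:.1f}")
```

With equal weights the frequencies sum to one, which is consistent with reading each event as one equally likely realisation of a single year.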

Interpreting the frequency: My questions are the following:

• Can I read this frequency as equivalent to a 480-year return period? In other words, a 1/480 probability every year?

• Is that frequency equivalent to an expected value? Most flood maps I find display data for set return periods (50, 100, 200, 500). The frequency observed here seems less dependent on the probability of a certain flood intensity occurring; it is more of an average over many simulations. If you can point me to any documentation on the logic behind the design of that frequency, it would be welcome!

• Lastly, one interpretation I have of the approach is that each model outputs 1 hazard (so 6 per year and per climate driver), at varying levels of intensity (e.g. Matsiro tends to output 1/500 flood intensities compared to CLM45, which computes 1/100), and each modelled future hazard has equal weight. So hazard intensity 1 * (1/480) + hazard intensity 2 * (1/480) and so on. At this point, I should understand my average future damage as my expected value for that RCP scenario and that time period. Could you confirm whether that is correct?

I am sorry to press on with this, but a clear understanding of how the probability of damages is defined is quite important if we are to measure potential impacts on specific portfolios of banking exposures.
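The difference between the per-event frequency and a return period can be made concrete with a toy example. The sketch below is not CLIMADA code and the impact values are made up; it assumes the expected annual impact is the frequency-weighted sum of event impacts, and that return periods are derived from cumulative exceedance frequencies of the sorted impacts, which is one common way to build an exceedance curve.

```python
def expected_annual_impact(impacts, frequencies):
    """Frequency-weighted sum of event impacts."""
    return sum(imp * freq for imp, freq in zip(impacts, frequencies))


def return_periods(impacts, frequencies):
    """Return (impact, return period) pairs, largest impact first.

    The exceedance frequency of an impact is the summed frequency of all
    events with an impact at least as large; the return period is its inverse.
    """
    ordered = sorted(zip(impacts, frequencies), key=lambda pair: pair[0], reverse=True)
    result = []
    exceedance_frequency = 0.0
    for impact, freq in ordered:
        exceedance_frequency += freq
        result.append((impact, 1.0 / exceedance_frequency))
    return result


# Hypothetical impacts for 4 equally weighted events (illustrative values only).
toy_impacts = [0.0, 10.0, 0.0, 30.0]
toy_frequencies = [1.0 / len(toy_impacts)] * len(toy_impacts)

toy_eai = expected_annual_impact(toy_impacts, toy_frequencies)
toy_curve = return_periods(toy_impacts, toy_frequencies)

print(f"expected annual impact: {toy_eai}")
for impact, period in toy_curve:
    print(f"impact >= {impact}: return period {period:.2f} years")
```

In this toy setting, with equal frequencies the expected annual impact is simply the mean impact over all events, and only the single largest event gets a return period equal to the number of events; every smaller impact is exceeded more often.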

thank you very much, Best Regards, Alexandre