abpoll / unsafe

A package for adding parametric uncertainty to the National Structure Inventory (NSI) and estimating flood losses with uncertain depth-damage relationships.
BSD 2-Clause "Simplified" License

Smaller chunks for notebooks #4

Closed by jdossgollin 4 months ago

jdossgollin commented 4 months ago

Some of the notebooks have very long chunks, which means the user has to work through a lot of code at once. Additionally, much of the information is given as code comments where Jupyter's Markdown cells might be more appropriate. For a cherry-picked example:

# Get the files we need downloaded
# These are specified in the "download" key 
# in the config file
# We transpose because one of the utils
# needs to return a list of the output files
DOWNLOAD = pd.json_normalize(CONFIG['download'], sep='_').T

# Wildcards for urls. For example, {FIPS} is an element in this
# list because in the urls supplied in the download dictionary, 
# this is what we are going to replace with the FIPS code
# for our analysis. We don't assume that the wildcard is the
# only text in between brackets in a url (it happens to be the
# case for this case study) so we think it's useful to have
# this list pre-configured. 
URL_WILDCARDS = CONFIG['url_wildcards']

# Get the file extensions for api endpoints
# In our case study, this is only for downloading from the NSI
API_EXT = CONFIG['api_ext']

# The data from the NSI is in .json format
# and we couldn't find the coordinate reference system
# through GET requests (we may have missed how). However,
# we were able to find metadata online that indicates 
# the CRS. 
NSI_CRS = CONFIG['nsi_crs']

# Dictionary of ref_names
# When we download tracts, block groups, etc. 
# from the TIGER endpoints, they can sometimes have a lot of
# characters. We find it helpful to standardize the
# names (i.e. block instead of tabblock20), but this
# can be customized to your preferences. 
REF_NAMES_DICT = CONFIG['ref_names']

# Dictionary of ref_id_names
# We will run the same processes on all the reference data
# like reprojecting and clipping to our study boundaries.
# We also will merge tabular data that is designed at
# certain administrative boundaries with attribute
# joins. This dictionary converts the names like GEOID
# to tract_id. 
REF_ID_NAMES_DICT = CONFIG['ref_id_names']

# Coefficient of variation
# for structure values
# This is what we scale the structure value by
# to get the standard deviation we draw from
COEF_VARIATION = CONFIG['coef_var']

# First floor elevation dictionary
# This maps foundation types to the triangular distributions
# for first-floor elevation
FFE_DICT = CONFIG['ffe_dict']

# Number of states of the world
# This is the number of ensemble members
N_SOW = CONFIG['sows']

# The hazard configuration will vary on a case study by
# case study basis. There may be different configuration
# parameters that you need to specify depending on how
# the hazard data you're using is structured. We hope
# to provide techniques to systematically accommodate 
# certain inundation model outputs as we gain more experience 
# coupling UNSAFE with different models
# The below structure can work on any depth grids from the FEMA
# Flood Risk Database. We specify this case study
# to focus on the riverine flooding products and for the
# 500, 100, 50, and 10 year return periods. 

# Get hazard model variables
# Get Return Period list
RET_PERS = CONFIG['RPs']
HAZ_FILEN = CONFIG['haz_filename']
# Get CRS for depth grids
HAZ_CRS = CONFIG['haz_crs']

This isn't wrong, per se, but breaking it into a few chunks of alternating Markdown and code cells would improve readability.
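For example, the first two chunks above could be split into alternating Markdown and code cells along these lines. The cell text is illustrative wording rather than copied from the notebook, and the sketch assumes pandas (as pd) and CONFIG were loaded in an earlier setup cell:

Markdown cell:

Download configuration. The files we need are specified under the "download" key in the config file. We transpose because one of the utils needs to return a list of the output files.

Code cell:

# pd and CONFIG are assumed to come from an earlier setup cell
DOWNLOAD = pd.json_normalize(CONFIG['download'], sep='_').T

Markdown cell:

URL wildcards. Placeholders such as {FIPS} in the download URLs are replaced with the values for our analysis. We don't assume a wildcard is the only bracketed text in a URL, so it is useful to keep this list pre-configured.

Code cell:

URL_WILDCARDS = CONFIG['url_wildcards']

Each constant then carries at most a one-line comment, and the longer rationale moves into prose where it is easier to read.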

abpoll commented 4 months ago

Thanks! I tried to improve readability with this commit.