Closed babakkhavari closed 1 year ago
This is also happening for Biogas: when we do not have available biogas (i.e. no livestock, no water or temperature too low) we are getting inf values in the time_of_collection. We will need to exclude biogas from the baseline for those cells and adjust the shares of the others.
The problem here is that we will be changing the shares of the different fuels so that they no longer match the original data. What has been done now won't work if you call the functions one after the other. Let's say you adjust electricity in certain cells using the first function, and then adjust biogas in certain cells using the second function. How do we make sure that the cells do not coincide (e.g. an unelectrified cell without biogas potential)? And if we cannot make sure that the cells do not coincide, won't the second function always render the first one void?
If I am right, we can solve this fairly easily by adding three small functions on top of what @aliciaoberholzer already has done:
Related to functions 2 and 3, you probably want to produce a warning to the user: "Total population that can cook with biogas based on GIS data is lower than what you have entered in the tech specs. The new share is X%. If not OK, please adjust the GIS data." or similar.
Functions 1-3 work on a national basis, not a cell basis. Then comes the stuff Alicia has already done (I do however think that they need to be done together at the same time).
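The warning suggested above could be raised with Python's standard `warnings` module. The following is only a minimal sketch: the helper name `cap_biogas_share` and its two arguments (the share entered in the tech specs and the maximum share supported by the GIS data) are hypothetical and do not come from the package.

```python
import warnings


def cap_biogas_share(specified_share, max_gis_share):
    """Return the biogas share to use, warning if the GIS data cannot support it.

    Both arguments are national rural shares between 0 and 1 (hypothetical inputs).
    """
    if specified_share > max_gis_share:
        warnings.warn(
            "Total population that can cook with biogas based on GIS data is "
            "lower than what you have entered in the tech specs. "
            f"The new share is {max_gis_share:.1%}. "
            "If not OK, please adjust the GIS data.",
            UserWarning,
        )
        return max_gis_share
    return specified_share
```

A warning rather than an exception lets the run continue with the capped share while still telling the user that the baseline differs from what they entered.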
Instead of adjusting everything to biomass, adjust to a weighted average of the dirty ones :)
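One possible reading of "a weighted average of the dirty ones" is that the share freed by lowering one technology is split among the dirty fuels in proportion to their current shares. The sketch below assumes that reading; the function `redistribute_share`, the fuel names and the numbers are all illustrative.

```python
def redistribute_share(shares, reduced_tech, new_share, dirty_techs):
    """Lower `reduced_tech` to `new_share` and spread the freed share over
    `dirty_techs` in proportion to their existing shares."""
    freed = shares[reduced_tech] - new_share
    if freed < 0:
        raise ValueError("new_share must not exceed the current share")
    dirty_total = sum(shares[t] for t in dirty_techs)
    if dirty_total == 0:
        raise ValueError("dirty technologies have no share to weight by")
    adjusted = dict(shares)
    adjusted[reduced_tech] = new_share
    for t in dirty_techs:
        # each dirty fuel receives a slice proportional to its own share
        adjusted[t] += freed * shares[t] / dirty_total
    return adjusted


# Illustrative national shares (they sum to 1).
shares = {"Biogas": 0.10, "Electricity": 0.05, "Biomass": 0.60,
          "Charcoal": 0.20, "LPG": 0.05}
adjusted = redistribute_share(shares, "Biogas", 0.04, ["Biomass", "Charcoal"])
```

Here the 6 percentage points taken from biogas go to biomass and charcoal in a 3:1 ratio, so the total still sums to 1 and the clean fuels are left untouched.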
This Excel file implements a basic example and solution to this problem; it is the basis of the methods that we are using in the package functions:
For rural areas (we assume that there is no biogas in urban areas):
We are likely doing something incorrectly in the step where we allocate remaining population to technologies other than electricity and biogas:
remaining_share = 0
for name, tech in tech_dict.items():
    if (name != "Biogas") & (name != "Electricity"):
        remaining_share += tech.current_share_rural
remaining_pop = self.gdf["Calibrated_pop"] - (tech_dict["Biogas"].pop_sqkm + tech_dict["Electricity"].pop_sqkm)
for name, tech in tech_dict.items():
    if (name != "Biogas") & (name != "Electricity"):
        tech.pop_sqkm = remaining_pop * tech.current_share_rural / remaining_share
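To see what this step does and does not guarantee, here is a small stand-alone sketch of the same proportional split on three made-up cells. The Series names and all numbers are assumptions for illustration, not package data.

```python
import pandas as pd

# Three toy rural cells (made-up numbers).
calibrated_pop = pd.Series([100.0, 200.0, 50.0])
electricity_pop = pd.Series([10.0, 0.0, 5.0])
biogas_pop = pd.Series([0.0, 20.0, 5.0])

# National rural shares of the other technologies (illustrative values).
other_shares = {"Biomass": 0.6, "LPG": 0.2}
remaining_share = sum(other_shares.values())

# Population left in each cell after electricity and biogas are assigned.
remaining_pop = calibrated_pop - (electricity_pop + biogas_pop)

# Split the remainder in proportion to the national shares.
other_pop = {
    name: remaining_pop * share / remaining_share
    for name, share in other_shares.items()
}

# Per-cell total across all technologies.
total = electricity_pop + biogas_pop + sum(other_pop.values())
```

The split conserves the population of every cell, but it does not by itself make the national total of each technology match the share entered in the specs, so a mismatch in the input population carries straight through.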
I do not think this alone will solve the issue, but I think one of the issues we have is the calibration of urban and rural population. I did some tests, and the rural population does not seem to add up.
Which in turn creates a mismatch of 3 million between:
nepal.techs["Collected_Traditional_Biomass"].pop_sqkm.loc[~isurban].sum() + nepal.techs["LPG"].pop_sqkm.loc[~isurban].sum() + nepal.techs["Biogas"].pop_sqkm.loc[~isurban].sum()
and the sum of population_cooking_rural
So, updating the urban and population calibration in onstove.py seems to fix the issues in rural cells. The new population calibration now looks like:
def calibrate_current_pop(self):
    isurban = self.gdf["IsUrban"] > 20
    total_rural_pop = self.gdf.loc[~isurban, "Pop"].sum()
    total_urban_pop = self.gdf["Pop"].sum() - total_rural_pop
    calibration_factor_u = (self.specs["Population_start_year"] * self.specs["Urban_start"]) / total_urban_pop
    calibration_factor_r = (self.specs["Population_start_year"] * (1 - self.specs["Urban_start"])) / total_rural_pop
    self.gdf["Calibrated_pop"] = 0.0
    # assign through a single .loc call on the DataFrame; chained indexing
    # (self.gdf["Calibrated_pop"].loc[...] = ...) may write to a temporary copy
    self.gdf.loc[~isurban, "Calibrated_pop"] = self.gdf["Pop"] * calibration_factor_r
    self.gdf.loc[isurban, "Calibrated_pop"] = self.gdf["Pop"] * calibration_factor_u
and the urban rural calibration looks like:
def calibrate_urban_current_and_future_GHS(self, GHS_path):
    self.raster_to_dataframe(GHS_path, name="IsUrban", method='sample')
    self.calibrate_current_pop()
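A quick way to check that a calibration like this adds up is to run it on a tiny table and compare the calibrated totals with the targets. The sketch below is stand-alone: the four cells and the `specs` values are made up, and only the column names and the `> 20` urban threshold follow the code above.

```python
import pandas as pd

# Four toy cells; values are made up.
gdf = pd.DataFrame({
    "Pop": [100.0, 300.0, 50.0, 550.0],
    "IsUrban": [11, 30, 13, 21],
})
specs = {"Population_start_year": 2000.0, "Urban_start": 0.4}

isurban = gdf["IsUrban"] > 20
total_rural_pop = gdf.loc[~isurban, "Pop"].sum()
total_urban_pop = gdf.loc[isurban, "Pop"].sum()

# Separate scaling factors so that each group hits its own target.
factor_u = specs["Population_start_year"] * specs["Urban_start"] / total_urban_pop
factor_r = specs["Population_start_year"] * (1 - specs["Urban_start"]) / total_rural_pop

gdf["Calibrated_pop"] = 0.0
gdf.loc[isurban, "Calibrated_pop"] = gdf.loc[isurban, "Pop"] * factor_u
gdf.loc[~isurban, "Calibrated_pop"] = gdf.loc[~isurban, "Pop"] * factor_r

rural_total = gdf.loc[~isurban, "Calibrated_pop"].sum()
urban_total = gdf.loc[isurban, "Calibrated_pop"].sum()
```

With these inputs the rural total should come out at 60% and the urban total at 40% of `Population_start_year`; the same comparison against the real specs is a cheap test for the rural mismatch described earlier.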
We still have problems with urban areas though. They all have values of NaN
I think the reason is the last for-loop:
for name, tech in tech_dict.items():
    if (name != "Biogas") & (name != "Electricity"):
        tech.pop_sqkm.loc[isurban] = remaining_urbpop * tech.current_share_urban / remaining_urbshare
    tech.pop_sqkm = tech.pop_sqkm / self.gdf["Calibrated_pop"]
.loc does not update series values in place; it only brings them out, I think. So the line after the if creates a series of ~3k rows (the same number of rows as urban settlements) but leaves the original series untouched. The last line then divides the rural areas correctly, and for urban areas it only divides NaN values by the calibrated pop.
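One mechanism that produces exactly this symptom in pandas is index alignment: if a Series only carries the rural rows, dividing it by a full-length Series yields NaN for every label it lacks. The sketch below uses made-up numbers and is offered as a possible explanation, not as a confirmed diagnosis of the package code.

```python
import numpy as np
import pandas as pd

calibrated_pop = pd.Series([100.0, 200.0, 300.0, 400.0])
isurban = pd.Series([False, True, False, True])

# A Series that only carries the rural rows (index labels 0 and 2).
rural_only = calibrated_pop.loc[~isurban] * 0.5

# Division aligns on index labels; the urban labels are missing on the
# left-hand side, so the result is NaN there.
share_partial = rural_only / calibrated_pop

# Starting from a full-length Series of zeros gives every cell a slot,
# so both the rural and the urban assignment have somewhere to land.
full = pd.Series(np.zeros(len(calibrated_pop)), index=calibrated_pop.index)
full.loc[~isurban] = rural_only
full.loc[isurban] = calibrated_pop.loc[isurban] * 0.25
share_full = full / calibrated_pop
```

In `share_partial` the two urban cells are NaN, while `share_full` has a value for every cell. Initialising each technology with a full-length Series before the `.loc` assignments follows the same idea.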
This last issue can be fixed with small updates in techshare_allocation:
def techshare_allocation(self, tech_dict):
    """
    Calculates the baseline population cooking with each technology in each urban and rural square kilometer.

    The function takes a stepwise approach to allocating population to each cooking technology:

    1. Allocates the population cooking with electricity in each cell based upon the population with access
       to electricity.
    2. Allocates the population cooking with biogas in each rural cell based upon whether or not there is
       biogas potential.
    3. Allocates the remaining population proportionally to other cooking technologies in rural & urban cells.

    The number of people cooking with each technology in each urban and rural square km is added as an attribute to
    each technology class.

    Parameters
    ---------
    tech_dict: Dictionary
        The dictionary of technology classes

    The function uses the dictionary of technology classes, including biogas collection time, and main GeoDataFrame to do this.
    """
    #allocate population in each urban cell to electricity
    isurban = self.gdf["IsUrban"] > 20
    urban_factor = tech_dict["Electricity"].population_cooking_urban / sum(isurban * self.gdf["Elec_pop_calib"])
    tech_dict["Electricity"].pop_sqkm = (isurban) * (self.gdf["Elec_pop_calib"] * urban_factor)

    #allocate population in each rural cell to electricity
    rural_factor = tech_dict["Electricity"].population_cooking_rural / sum(~isurban * self.gdf["Elec_pop_calib"])
    tech_dict["Electricity"].pop_sqkm.loc[~isurban] = (self.gdf["Elec_pop_calib"] * rural_factor)

    #create series for biogas same size as dataframe with zeros
    tech_dict["Biogas"].pop_sqkm = pd.Series(np.zeros(self.gdf.shape[0]))

    #allocate remaining population to biogas in rural areas where there's potential
    biogas_factor = tech_dict["Biogas"].population_cooking_rural / (self.gdf["Calibrated_pop"].loc[(tech_dict["Biogas"].time_of_collection!=float('inf')) & ~isurban].sum())
    tech_dict["Biogas"].pop_sqkm.loc[(~isurban) & (tech_dict["Biogas"].time_of_collection!=float('inf'))] = self.gdf["Calibrated_pop"] * biogas_factor
    pop_diff = (tech_dict["Biogas"].pop_sqkm + tech_dict["Electricity"].pop_sqkm) > self.gdf["Calibrated_pop"]
    tech_dict["Biogas"].pop_sqkm.loc[pop_diff] = self.gdf["Calibrated_pop"] - tech_dict["Electricity"].pop_sqkm

    #allocate remaining population proportionally to techs other than biogas and electricity
    remaining_share = 0
    for name, tech in tech_dict.items():
        if (name != "Biogas") & (name != "Electricity"):
            remaining_share += tech.current_share_rural
    remaining_pop = self.gdf.loc[~isurban, "Calibrated_pop"] - (tech_dict["Biogas"].pop_sqkm.loc[~isurban] + tech_dict["Electricity"].pop_sqkm.loc[~isurban])
    for name, tech in tech_dict.items():
        if (name != "Biogas") & (name != "Electricity"):
            tech.pop_sqkm = pd.Series(np.zeros(self.gdf.shape[0])) #ADDED THIS
            tech.pop_sqkm.loc[~isurban] = remaining_pop * tech.current_share_rural / remaining_share

    #move excess population cooking with technologies other than electricity and biogas to biogas
    adjust_cells = np.ones(self.gdf.shape[0], dtype=int)
    for name, tech in tech_dict.items():
        if name != "Electricity":
            adjust_cells &= (tech.pop_sqkm > 0)
    for name, tech in tech_dict.items():
        if (name != "Electricity") & (name != "Biogas"):
            tech_remainingpop = sum(tech.pop_sqkm.loc[~isurban]) - tech.population_cooking_rural
            tech.tech_remainingpop = tech_remainingpop
            remove_pop = sum(tech.pop_sqkm.loc[(~isurban) & (adjust_cells)])
            share_allocate = tech_remainingpop/ remove_pop
            self.share_allocate = share_allocate
            tech_dict["Biogas"].pop_sqkm.loc[(~isurban) & (adjust_cells)] += tech.pop_sqkm * share_allocate
            tech.pop_sqkm.loc[(~isurban) & (adjust_cells)] *= (1 - share_allocate) #what does this line do, this confuses me.

    #allocate urban population to technologies
    for name, tech in tech_dict.items():
        if (name != "Biogas") & (name != "Electricity"):
            tech.pop_sqkm.loc[isurban] = 0.0
    remaining_urbshare = 0.0
    for name, tech in tech_dict.items():
        if (name != "Biogas") & (name != "Electricity"):
            remaining_urbshare += tech.current_share_urban
    remaining_urbpop = self.gdf.loc[isurban, "Calibrated_pop"] - tech_dict["Electricity"].pop_sqkm.loc[isurban]
    for name, tech in tech_dict.items():
        if (name != "Biogas") & (name != "Electricity"):
            tech.pop_sqkm.loc[isurban] = remaining_urbpop * tech.current_share_urban / remaining_urbshare
        tech.pop_sqkm = tech.pop_sqkm / self.gdf["Calibrated_pop"]
And one line in set_base_fuel:
base_fuel.total_time_yr += (tech.total_time_yr * tech.pop_sqkm).fillna(0)
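The effect of the `.fillna(0)` can be seen on a three-cell toy example: without it, a single NaN share makes the accumulated base-fuel value NaN for that cell. The variable names and numbers below are illustrative only.

```python
import numpy as np
import pandas as pd

total_time_yr = pd.Series([10.0, 20.0, 30.0])
pop_share = pd.Series([0.5, np.nan, 0.25])  # one cell has an undefined share

base_time = pd.Series(np.zeros(3))

# Without fillna the NaN propagates into the running total.
base_time_nan = base_time + total_time_yr * pop_share

# With fillna(0) the undefined cell simply contributes nothing.
base_time_filled = base_time + (total_time_yr * pop_share).fillna(0)
```

Note that `fillna(0)` only masks the NaN values; it treats those cells as having no population on that technology, so it is worth checking that this is the intended meaning wherever NaN shares still appear.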
Let's assume we say that 2% of the population in rural areas cook with electricity, and at the same time that the electrification rate in rural areas is 10%.
Should we then make sure that the 2% that cook with electricity in our baseline are among the 10% that our tool assumes to be electrified?
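One way to guarantee this would be to spread the people cooking with electricity over the cells in proportion to each cell's electrified population, which appears to be what the `urban_factor` and `rural_factor` steps in `techshare_allocation` do with `Elec_pop_calib`. A minimal sketch with made-up numbers; the function `allocate_electric_cooking` is hypothetical.

```python
def allocate_electric_cooking(electrified_pop, cooking_total):
    """Spread `cooking_total` people over cells in proportion to the
    electrified population of each cell."""
    electrified_total = sum(electrified_pop)
    if cooking_total > electrified_total:
        raise ValueError("more people cook with electricity than have access")
    factor = cooking_total / electrified_total
    return [pop * factor for pop in electrified_pop]


# 1000 rural people in four cells: 10% are electrified (100 people),
# and 2% cook with electricity (20 people).
electrified_pop = [0.0, 60.0, 40.0, 0.0]
cooking_pop = allocate_electric_cooking(electrified_pop, cooking_total=20.0)
```

Because the scaling factor can never exceed 1, no cell ends up with more people cooking with electricity than it has electrified people, and unelectrified cells receive none.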