Closed nk027 closed 4 years ago
To add to this:
In this step we potentially overwrite available information and introduce errors. Consider, for example, Armenia in 2005. Our meat-based shares are:
| proc | share_m |
| --- | --- |
| Cattle slaughtering | 0.61 |
| Buffaloes slaughtering | 0.00 |
| Sheep slaughtering | 0.13 |
| Goat slaughtering | 0.00 |
| Pigs slaughtering | 0.16 |
| Poultry slaughtering | 0.08 |
| Horses slaughtering | 0.00 |
Once merged (a subset of) our data here looks as follows:
| item | production | share | share_m |
| --- | --- | --- | --- |
| Bovine Meat | 34400 | 1.00 | 0.61 |
| Offals, Edible | 10110 | | 0.61 |
| Fats, Animals, Raw | 1975 | | 0.61 |
| Hides and skins | 5675 | 0.87 | 0.61 |
| Meat Meal | 0 | | 0.61 |
| Mutton & Goat Meat | 91 | 0.01 | 0.00 |
| Offals, Edible | 10110 | | 0.00 |
| Fats, Animals, Raw | 1975 | | 0.00 |
| Hides and skins | 0 | | 0.00 |
| Meat Meal | 0 | | 0.00 |
| Pigmeat | 9400 | | 0.16 |
| Offals, Edible | 10110 | | 0.16 |
| Fats, Animals, Raw | 1975 | | 0.16 |
| Meat Meal | 0 | | 0.16 |
| Eggs | 29053 | | 0.08 |
| Poultry Meat | 4600 | | 0.08 |
| Offals, Edible | 10110 | | 0.08 |
| Fats, Animals, Raw | 1975 | | 0.08 |
| Meat Meal | 0 | | 0.08 |
As far as I can tell, we ignore the share variable and blindly apply share_m (as long as the processes match). This means we scale down Bovine Meat, Mutton & Goat Meat, Eggs, etc. according to the overall split of meat production, which does not make sense. We also ignore available information, such as the 0.87 share (i.e. 87%) of Hides and skins stemming from one process, or the Mutton & Goat Meat split.
So this may not be as problematic, since we subset to `c("Offals, Edible", "Fats, Animals, Raw", "Meat Meal", "Hides and skins")`. Still, we ignore available shares (note that this seems to be the case only for Hides and skins).
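To illustrate the point about ignored shares, here is a minimal sketch of one possible alternative: use the item-specific share where it exists and fall back to the meat-based share otherwise. The data frame and the `share_used` column are hypothetical and only mirror the cattle rows of the Armenia 2005 example; it also assumes that missing shares are coded as `NA`, which may not match the current code.

```r
# Hypothetical subset of the merged data (cattle rows from the example above).
# Assumption: missing item-specific shares are coded as NA.
dat <- data.frame(
  item = c("Offals, Edible", "Fats, Animals, Raw", "Hides and skins", "Meat Meal"),
  share = c(NA, NA, 0.87, NA),
  share_m = c(0.61, 0.61, 0.61, 0.61)
)

# Keep the available share; only fall back to the meat-based share if it is missing.
dat$share_used <- ifelse(is.na(dat$share), dat$share_m, dat$share)

print(dat)
```

With this, Hides and skins would keep its 0.87 share, while the other items would still be split according to share_m.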
The supply shares for `c("c120", "c121", "c122", "c123", "c124")` are calculated based on `c("c115", "c116", "c117", "c118", "c119")`. The shares are recalculated over all items, even though they are available per input. See #38; it might not make sense to have too much detail.
So we use meat supply shares to calculate some other animal product supply shares.
This is done with a subset per year and per region, then we set `shares = value / sum(value)`. Now there are likely cases where no value (i.e. production) is available, leading to `NaN`. Atm this is probably not a problem, since NAs and 0s are used interchangeably, but when switching to proper coding of `NA` values we have to watch out.
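A small sketch of how the `NaN` case arises, using hypothetical data and base R rather than the actual code: a region-year without any production gives `0 / 0`. Note that `is.na()` also returns `TRUE` for `NaN` in R, so the two would be conflated unless `is.nan()` is used explicitly. Setting undefined shares to zero is just one possible choice here.

```r
# Hypothetical data: region "A" has production, region "B" has none.
dat <- data.frame(
  area = c("A", "A", "B", "B"),
  year = 2005,
  value = c(30, 10, 0, 0)
)

# Shares within each area-year group, i.e. shares = value / sum(value).
dat$shares <- ave(dat$value, dat$area, dat$year, FUN = function(x) x / sum(x))

# Region "B" yields 0 / 0, which is NaN (and is.na() is TRUE for NaN as well).
# One explicit option (an assumption, not the current behaviour): set these to 0.
dat$shares_clean <- ifelse(is.nan(dat$shares), 0, dat$shares)

print(dat)
```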