Exploring fish-dominant higher impact in soybean dominated areas

gclawson1 commented 5 months ago

Why does it appear that the fish-dominant diet has higher impacts in places like Brazil, Argentina, and the USA, where there is a lot of soybean production? See the delta plot below; there are many dark blue areas in these regions:

Presumably the plant-dominant diet would have higher impacts given that 1) soy protein concentrate (SPC) and soy oil account for a higher percentage of the plant-dominant diet than soybean meal (SBM) does in the fish-dominant diet (thus there should be more demand and production of SPC), and 2) SPC has a higher allocation factor (so there should be higher pressures, and presumably higher impacts).

I've looked over the allocation factors, and everything looks OK there. SPC is ~1.1 (soy oil ~1.72) while SBM is ~0.8.
The demand looks ok, there is more demand for SPC than SBM across scenarios.
The production looks ok, there is more production for SPC than SBM across scenarios.
The actual km2 of pressures look ok (before and after reprojection), ~1000 km2 more soy pressures under plant-dominant (see below):

#     Plant-dominant                           sum km2
# Rapeseed:canola oil:economic:A             3350.17145
# Soybean:soy protein concentrate:economic:A 2482.72642
# Wheat:wheat gluten:economic:A              1954.18908
# Pulses:pea protein concentrate:economic:A   572.73846
# Pulses:faba beans:economic:A                570.68376
# Wheat:wheat:economic:A                      507.38673
# Pulses:pea flour:economic:A                 466.24491
# CropsNES:linseed oil:economic:A             439.87800
# Sunflower:sunflower meal:economic:A         315.45644
# Pulses:guar meal:economic:A                 305.56726
# CropsNES:coconut oil:economic:A              81.27246
# Soybean:soy oil:economic:A                   73.95469
# Maize:corn gluten meal:economic:A            14.18603

#    Fish-dominant                       sum km2
# Wheat:wheat gluten:economic:A     2592.29164
# Soybean:soybean meal:economic:A   1578.01008
# Wheat:wheat:economic:A             546.41648
# Pulses:faba beans:economic:A       317.04654
# Maize:corn gluten meal:economic:A   81.06304
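A quick arithmetic check of the "~1000 km2" figure, using only the soy rows from the two tables above. This is a Python sketch for illustration (the variable names are mine, not from the analysis code):

```python
# Soy pressure areas (km2), copied from the plant-dominant and fish-dominant tables above.
plant_dominant_soy = {
    "soy protein concentrate": 2482.72642,
    "soy oil": 73.95469,
}
fish_dominant_soy = {"soybean meal": 1578.01008}

plant_total = sum(plant_dominant_soy.values())
fish_total = sum(fish_dominant_soy.values())
difference = plant_total - fish_total

print(f"plant-dominant soy: {plant_total:.2f} km2")
print(f"fish-dominant soy:  {fish_total:.2f} km2")
print(f"difference:         {difference:.2f} km2")
```

The difference comes out at roughly 979 km2, consistent with the "~1000 km2" quoted above.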

Here are the pressure maps:

SBM (fish-dominant):

SPC (plant-dominant):

It doesn't have anything to do with vulnerability, as species vulnerability is treated the same regardless of whether the ingredient is SPC or SBM.
Looking at impacts (aggregated from species level impacts), there is still a higher km2 of impact (and mean extinction risk) for SPC than SBM:

# A tibble: 3 × 6
  diet           fcr_type allocation ingredient                      total_impact_km2 mean_ext_risk
  <chr>          <chr>    <chr>      <chr>                                  <dbl>         <dbl>
1 plant-dominant regular  economic   Soybean_soy protein concentrate      166387.  0.00000170  
2 fish-dominant  regular  economic   Soybean_soybean meal                 105755.  0.00000108  
3 plant-dominant regular  economic   Soybean_soy oil                        4956.  0.0000000507

I isolated the mean proportion of habitat impacted maps for the soy protein concentrate compared to the soybean meal, and it appears that the plant-dominant diet (SPC) always has larger impacts than the fish-dominant diet (SBM)... which would make sense!:

What this indicates to me is that this is an artifact of the species-weighted averaging across taxon and raw materials, particularly when I average the impacts between SPC and soybean oil for the plant-dominant scenario (since the plant-dominant scenario has both). If we look at the same map, but with SPC and soy oil averaged for the plant-dominant, then the fish-dominant ends up having higher average impacts for soybeans.

Currently, I calculate for each cell, for each species within each taxonomic grouping and raw material/ingredient, the proportion of habitat impacted. Then I take the average of that, and save that, ending up with a raster that describes for a particular taxon and ingredient/raw material, the average proportion of habitat impacted in each cell. For example, I have a raster that is the average proportion of habitat impacted for Birds impacted by soy protein concentrate in each cell (see below):

Then from there, I calculate a species-weighted average of proportion of habitat impacted in each cell across all of the taxonomic and raw material combinations, which is where we get the weird SBM vs SPC problem. I think that we see artifacts of this for soybeans, wheat, pulses, and "other crops", as they have multiple ingredients associated with their raw materials (and thus, lots of overlap in cells with "large" and "small" impacts). But obviously soybeans are the big problem considering there is such a large difference of values between SPC and soybean oil in the plant-dominant diet, without much overlap with other crops.
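To make the artifact concrete, here is a minimal Python sketch for a single cell. All numbers are hypothetical (chosen only so that SPC > SBM > soy oil, as in the impact table above), and `nspp_weighted_mean` is an illustrative helper, not a function from the analysis code:

```python
def nspp_weighted_mean(layers):
    """Species-count-weighted mean for one cell.

    layers: list of (mean_prop_habitat_impacted, n_species) tuples,
    one per taxon/ingredient layer that overlaps the cell.
    """
    total_species = sum(n for _, n in layers)
    return sum(prop * n for prop, n in layers) / total_species

# Hypothetical cell with the same 100 bird species exposed in both diets.
n_birds = 100
plant_dominant = [
    (0.010, n_birds),   # SPC layer: larger impact (hypothetical value)
    (0.0003, n_birds),  # soy oil layer: much smaller impact (hypothetical value)
]
fish_dominant = [
    (0.006, n_birds),   # SBM layer (hypothetical value)
]

plant_mean = nspp_weighted_mean(plant_dominant)
fish_mean = nspp_weighted_mean(fish_dominant)

print(f"plant-dominant mean: {plant_mean:.5f}")
print(f"fish-dominant mean:  {fish_mean:.5f}")
```

Even though SPC alone exceeds SBM in this cell, averaging SPC with the small soy oil layer pulls the plant-dominant mean below the fish-dominant one.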

So, the question arises, should I change my methodology?

The other way I could conceive of calculating impacts would be to:

- save a total km2 of impact raster for each taxonomic group and raw material combination.
  - So I would have a raster which describes the total amount of habitat impacted for birds by soy protein concentrate under a plant-dominant diet in each cell
- save a total km2 of habitat raster for each taxonomic group in each cell
- then, instead of calculating means, I calculate sums of impacted habitat / sum of habitat area across all taxon/raw material combinations.
  - Only concern here is that this is essentially a total impact measurement, rather than average.
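The sum-based alternative above could be sketched like this for one cell. This is Python for illustration only; the values are hypothetical, and counting each taxon's habitat once (rather than once per raw material) is my assumption about how the denominator would be built:

```python
# Hypothetical totals for a single cell.
# Total km2 of habitat impacted, per (taxon, raw material) combination.
impact_km2 = {
    ("Birds", "Soybean"): 4.12,
    ("Birds", "Wheat"): 1.0,
    ("Mammals", "Soybean"): 1.545,
    ("Mammals", "Wheat"): 0.5,
}
# Total km2 of habitat in the cell, per taxon (assumed to be counted once per taxon).
habitat_km2 = {"Birds": 400.0, "Mammals": 150.0}

total_impact = sum(impact_km2.values())
total_habitat = sum(habitat_km2.values())

# Sum of impacted habitat / sum of habitat area: a total, not an average.
total_prop_impacted = total_impact / total_habitat
print(f"total proportion of habitat impacted: {total_prop_impacted:.4f}")
```

Because nothing is averaged, adding a small-impact layer can only raise the numerator; it cannot dilute the result the way an extra layer does in a mean.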

gclawson1 commented 5 months ago

Also, some of the blue in Canada and the US could be due to wheat impacts. Some of the blue in Brazil and USA could be corn.

bshalpern commented 5 months ago

good sleuthing, and I was wondering about these results. I need to process more the question about if/whether to change the methods. Do you have an intuition of which path makes sense? I do worry that the results as they are will be a red flag for reviewer critique...

gclawson1 commented 5 months ago

I think we have some options:

1. Keep it the way I currently have it; the "average proportion of habitat impacted" method
   - Calculate proportion of habitat impacted for each species in each cell for each ingredient
   - Average that proportion of habitat across species within taxon (e.g., resultant raster is Birds, SPC average prop of habitat impacted in each cell)
   - nspp weighted average across taxon and ingredients (and other combinations of that, like across taxon by ingredient, by taxon across ingredient, etc.)
2. The "average total proportion of habitat impacted" method
   - Calculate km2 of impact for each species in each cell for each ingredient
   - Sum that km2 across species within taxon (e.g., resultant raster is Birds, SPC total km2 of impact in each cell)
   - Divide that sum by the sum of habitat area for the bird taxon in each cell
   - nspp weighted average across taxon and ingredients (and other combinations of that, like across taxon by ingredient, by taxon across ingredient, etc.)
3. No averaging at all, just use sums and division; the "total proportion of habitat impacted" method
   - Calculate km2 of impact for each species in each cell for each ingredient
   - Sum that km2 across species within taxon (e.g., resultant raster is Birds, SPC total km2 of impact in each cell)
   - Sum km2 across taxon and ingredients (and other combinations of that, like across taxon by ingredient, by taxon across ingredient, etc.)
   - Divide by total habitat area in each cell (depending on how the data is grouped could be global across all taxon or taxon specific)

I would be partial to options 1 or 2, as there is precedent for taking mean percentage change in habitat area within each cell. Williams et al., 2020 reports their results as a mean per taxon. They also present a "change in total habitat (mean habitat loss in a cell multiplied by the number of species present)."

The only concern I have about the averaging options is that I don't think changing to option 2 will solve the problem I've described in previous comments. In the plant-dominant scenario, in cells where the only impacts come from SPC and soybean oil, everything is equal (nspp exposed and impacted, where pressures are located) aside from the amount of pressure, so the nspp weighted average would reduce to a regular average. I think we'd still see that the soybean oil is bringing down the values in those cells, making impacts less than the fish-dominant scenario (which only has SBM impacts).

I think the only way to avoid that is to basically scrap being able to say anything at the ingredient level, and only group, sum, and report results at a raw material level. e.g., change options 1 or 2 to have a sum that would end up with a raster that is Birds, SOYBEANS total km2 of impact in each cell and then average across taxon, raw materials, etc.
Or I just remove soybean oil from the ingredients... although we would still see similar issues (albeit not as extreme) with other raw materials that have multiple ingredients; wheat and pulses. "Other crops" has two ingredients, but they have equal amount of contribution in the plant-dominant diet.
I don't believe we really see this problem with the FMFO, as there is not much overlap in cells where trimmings fish and forage fish are caught, and the forage fish catch dwarfs the trimmings fish catch, but that is just my assumption.

cottrellr commented 5 months ago

My feeling is that it's because of averaging across ingredients for the same raw material - as you point out in your bullet above, this is maybe what is causing the distortion. We are implicitly saying that SPC and soy oil can come from the same unit of production in a given map, so averaging impacts across these ingredients doesn't quite make sense, because it is the production of soybeans needed that creates the impact. And given we are not dealing with processing disturbance pressures at all (with good reason as our other paper shows), maybe raw material is the focus we need. Soybeans from Brazil or US or Australia is what we are saying creates the impacts, not SPC from Brazil, US or Australia. So yeah, I think summing all soybeans or wheat km2 for a given diet in a given cell is what matters and then the species weighting can happen. With the species held the same perhaps this would rectify the issue? Unless I'm missing something.

gclawson1 commented 5 months ago

Yeah that makes sense, and I think that would work to fix the issue. I think I just need to add/change a step in where we aggregate impacts to the raw material, rather than average. Something like this:

the "average proportion of habitat impacted" method
- Calculate proportion of habitat impacted for each species in each cell for each RAW MATERIAL
- Average that proportion of habitat across species within taxon (e.g., resultant raster is Birds, SOYBEANS average prop of habitat impacted in each cell)
- nspp weighted average across taxon and raw materials (and other combinations of that, like across taxon by raw material, by taxon across raw material, etc.)
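A rough Python sketch of the revised first two steps, assuming the aggregation to raw material is done by summing km2 of impact across ingredients before proportions are taken. The species names, areas, and record layout are all hypothetical:

```python
from collections import defaultdict

# Hypothetical records for one cell: (species, raw_material, ingredient, impact_km2).
records = [
    ("bird_a", "Soybean", "soy protein concentrate", 0.50),
    ("bird_a", "Soybean", "soy oil", 0.015),
    ("bird_b", "Soybean", "soy protein concentrate", 0.20),
    ("bird_b", "Soybean", "soy oil", 0.006),
]
# Hypothetical habitat area (km2) of each species within the cell.
habitat_km2 = {"bird_a": 50.0, "bird_b": 40.0}

# Sum km2 of impact across ingredients within each raw material, per species.
impact_by_raw_material = defaultdict(float)
for species, raw_material, _ingredient, km2 in records:
    impact_by_raw_material[(species, raw_material)] += km2

# Proportion of habitat impacted per species and raw material.
prop_impacted = {
    (species, raw_material): km2 / habitat_km2[species]
    for (species, raw_material), km2 in impact_by_raw_material.items()
}

# Average across species within the taxon (here: Birds, SOYBEANS).
soy_props = [p for (_sp, rm), p in prop_impacted.items() if rm == "Soybean"]
birds_soybean_mean = sum(soy_props) / len(soy_props)
print(f"Birds, SOYBEANS mean prop impacted: {birds_soybean_mean:.6f}")
```

With this ordering, soy oil adds to the soybean impact in a cell instead of being averaged against SPC.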

cottrellr commented 5 months ago

Cool. If you retained the proportion of raw material that each ingredient represents in terms of km2 (i.e., have a column such as raw_material_prop; e.g., based on your plant-dominant area estimates SPC might get a prop of ~0.97 and soy oil ~0.03), you could still take impacts back down to the ingredient level later on if you wished.
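For illustration, the proportions can be derived from the plant-dominant soy areas posted earlier in this thread. This is a Python sketch; `raw_material_prop` follows the column name suggested above, while the 10 km2 soybean impact used to show the apportioning is a made-up value:

```python
# Plant-dominant soy pressure areas (km2) from the table earlier in the thread.
soy_km2 = {
    "soy protein concentrate": 2482.72642,
    "soy oil": 73.95469,
}

soybean_total_km2 = sum(soy_km2.values())
raw_material_prop = {
    ingredient: km2 / soybean_total_km2 for ingredient, km2 in soy_km2.items()
}

# Apportion a raw-material-level impact back to ingredients.
soybean_impact_km2 = 10.0  # hypothetical soybean impact in some cell
ingredient_impact_km2 = {
    ingredient: soybean_impact_km2 * prop
    for ingredient, prop in raw_material_prop.items()
}

for ingredient, prop in raw_material_prop.items():
    print(f"{ingredient}: prop = {prop:.3f}")
```

This assumes the split between ingredients is the same in every cell, which holds only if each ingredient's pressure is a fixed share of the soybean pressure.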


gclawson1 commented 5 months ago

@cottrellr based on our conversation a few mins ago, is this what you are suggesting to show for that global map?

sum of km2 impacts across all species and materials
divide that by the total km2 of habitat of species which are impacted
result is a map that shows proportion of habitat impacted in each cell across all species and materials

If so, where does the species weighting come in?

cottrellr commented 5 months ago

So I think for each cell summing across all materials but per species. For each species divide by total km2 in cell. Then a mean for each cell. Sorry not species weighted then, that would just be a mean.
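The two candidates in this exchange (one ratio of sums for the whole cell, versus a mean of per-species ratios) can give quite different numbers. A small Python sketch with two hypothetical species in one cell:

```python
# Hypothetical single cell: species -> (impact km2 summed across all materials,
#                                       habitat km2 of that species in the cell).
cell = {
    "sp_a": (1.0, 10.0),   # small range in the cell, heavily impacted
    "sp_b": (1.0, 100.0),  # large range in the cell, lightly impacted
}

# Sum km2 of impact across species, divide by summed habitat.
ratio_of_sums = sum(i for i, _ in cell.values()) / sum(h for _, h in cell.values())

# Per-species proportion first, then a plain mean across species.
mean_of_ratios = sum(i / h for i, h in cell.values()) / len(cell)

print(f"ratio of sums:  {ratio_of_sums:.4f}")
print(f"mean of ratios: {mean_of_ratios:.4f}")
```

The mean of ratios gives each species equal say regardless of how much habitat it has in the cell, whereas the ratio of sums is dominated by species with large habitat areas.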

gclawson1 commented 5 months ago

Hmmm ok.. I need to think about how to accomplish this.

So I would end up with a raster describing this for example?:

Birds taxa: Average proportion of habitat impacted across all materials

Then from there I could take a nspp weighted mean for each cell to get across all taxa?

I'm not sure that I can do it (computationally) without chunking it into taxa first because there are so many species

cottrellr commented 5 months ago

Oh I was thinking not by taxa but all species. Yeah but I see that each cell and each species gets very big very quickly. Is there a way to chunk the analysis but save into lists or folders? That way all layers can be pulled back in and the mean taken by multicore terra function like lapp or app (never remember which one it is).
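One way to take a mean across chunks without holding every species in memory is to keep only a running sum and count per cell. The sketch below is plain Python to show the bookkeeping, not terra code; `combine_chunks` and the chunk layout are hypothetical:

```python
from collections import defaultdict

def combine_chunks(chunks):
    """Mean proportion of habitat impacted per cell, across all chunks.

    chunks: iterable of dicts mapping cell_id -> list of per-species
    proportions. Only a running sum and count per cell are retained.
    """
    running_sum = defaultdict(float)
    running_n = defaultdict(int)
    for chunk in chunks:
        for cell_id, props in chunk.items():
            running_sum[cell_id] += sum(props)
            running_n[cell_id] += len(props)
    # Skip cells that never received a species value.
    return {c: running_sum[c] / running_n[c] for c in running_sum if running_n[c] > 0}

# Hypothetical chunks (e.g. one per taxon or per block of cells).
chunks = [
    {1: [0.10, 0.20], 2: [0.05]},
    {1: [0.30], 2: [0.15, 0.25]},
]
cell_means = combine_chunks(chunks)
print(cell_means)
```

Because sums and counts are additive, the result is the same as a single mean over all species, rather than a mean of chunk means.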

gclawson1 commented 5 months ago

I'm honestly not sure how I could make that work... I already chunk it so much (by taxa and per every 500k cells).

I'm just looking at amphibians right now for example, and for soybeans alone if I tried to save it per species it would be a list of 14 lists of dataframes, and the max number of rows in each of those data frames would be ~1840 species * 27000 cells impacted = ~50 million rows

Then from there, extrapolating out to the 14 data frames: ~50 million rows * 14 data frames = ~700 million rows I would need to save

And extrapolate that out from soybeans and amphibians to all of the other raw materials (even if we just sum the raw materials to a total pressure map) and species, there would be a lot more cells to add to that... especially considering how dispersed the fishing pressures are, so the marine side of things would be an entirely different beast.

^^ That math isn't exactly correct/not sure if that even makes sense... but just trying to illustrate the mess
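Spelling out the upper-bound arithmetic with the counts quoted above (a Python sketch; it assumes the worst case where every amphibian species is impacted in every impacted cell):

```python
n_species = 1840          # amphibian species (approximate count quoted above)
n_cells_impacted = 27000  # cells impacted by soybeans (approximate count quoted above)
n_data_frames = 14        # data frames per list, as described above

# Worst case: every species is impacted in every impacted cell.
max_rows_per_frame = n_species * n_cells_impacted
max_rows_total = max_rows_per_frame * n_data_frames

print(f"max rows per data frame: {max_rows_per_frame:,}")
print(f"max rows overall:        {max_rows_total:,}")
```

That works out to about 50 million rows per data frame and about 700 million overall, for one taxon and one raw material.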

cottrellr commented 5 months ago

Does what you propose above address the issue? i.e.,:

sum of km2 impacts across all species and materials
divide that by the total km2 of habitat of species which are impacted
result is a map that shows proportion of habitat impacted in each cell across all species and materials

cottrellr commented 5 months ago

Maybe weighted by taxa could work nicely?

gclawson1 commented 5 months ago

Yeah I think it would; there's just not averaging involved there. That would be showing a map of total km2 impact / total AOH in each cell

gclawson1 commented 5 months ago

For those following along - Rich and I decided on a plan of attack and I've nearly finished updating the analysis with this new methodology.

Before, I was taking averages by taxa and then averaging again across taxa (and other combinations of things, depending on what I wanted to present). I was basing my methodology on a lot of stuff done by Casey, but in our case we don't think it is appropriate to take averages of averages since we aren't rescaling in the same way as Casey (i.e., mean cumulative impact vs mean proportion of habitat impacted).

We figured out how to save species level km2 of impact. From there I am able to calculate proportion of habitat impacted (and extinction risk, just for fun) on a species level, and then average across taxa and raw material (or other combinations like by taxa across material, by material across taxa, etc). This methodology makes more sense and removes some uncertainty associated with taking averages of averages. I've also coded up to save standard deviations and number of species impacted maps.
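A minimal sketch of the species-level summary for one cell, in Python for illustration. The species and areas are hypothetical, and two choices here are my assumptions rather than anything confirmed above: zero-impact species are kept in the mean, and the standard deviation is the sample standard deviation:

```python
from statistics import mean, stdev

# Hypothetical single cell: species -> (impact km2, habitat km2 in the cell).
species_in_cell = {
    "sp_a": (0.5, 50.0),
    "sp_b": (0.2, 40.0),
    "sp_c": (0.0, 30.0),  # present in the cell but not impacted
}

# Species-level proportion of habitat impacted.
props = [impact / habitat for impact, habitat in species_in_cell.values()]

mean_prop = mean(props)   # single mean across species, no average of averages
sd_prop = stdev(props)    # sample standard deviation (assumption)
n_impacted = sum(1 for impact, _ in species_in_cell.values() if impact > 0)

print(f"mean prop impacted: {mean_prop:.4f}")
print(f"sd prop impacted:   {sd_prop:.4f}")
print(f"n species impacted: {n_impacted}")
```

Averaging once from species-level values means every species carries equal weight, whichever taxon or raw material grouping is applied afterwards.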

I have all of these impacts calculated across taxa and ingredient, so hopefully I can update with a new global map soon and we'll see some of the problems discussed above fixed. Once those are done, I'll do other combinations of averaging for SI materials (across taxa by material, across material by taxa, etc).

gclawson1 commented 5 months ago

Updated plot with new methodology - and the problem described above is fixed!!

I'll post an update with the other part of this map and some explanation of how the methodology change affected the results in a separate issue (spoiler alert - maximum average impacts are much higher. A cell in plant-dominant has an average prop impacted of 0.22!)

Sustainable-Aquafeeds-Project / feed_biodiv_impact_mapping

Exploring fish-dominant higher impact in soybean dominated areas #18