osmose-model / osmose-web-api

Web service that generates Osmose configuration files from data sources like Fishbase and SeaLifeBase. Used by https://www.config.osmose-model.org .
MIT License

only first ranked species in functional group are considered for OSMOSE parameter estimation #173

Closed jhpoelen closed 6 years ago

jhpoelen commented 6 years ago

from @agruss2 -

The species composition of the functional groups defined by the rOpenSci API depends on a “data richness” metric, which is calculated from 18 FishBase/SeaLifeBase parameters. For each species that could potentially be included in a functional group, the rOpenSci API determines whether a value is available (1) or not (0) for each of the 18 parameters; data richness is the number of available values. For example, if 10 parameter values are available for a given species, the data richness of this species is 10. For each functional group, the candidate species are ranked by data richness, and those whose data richness is smaller than 2 are dropped. Then, the species with the lowest data richness values are dropped as needed to keep the number of species per functional group to a maximum of 30, for the sake of computational efficiency.
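The ranking described above can be sketched as follows. This is a minimal Python illustration, not the rOpenSci API's actual code: the function names, the dictionary layout, and the tie-breaking rule (equally rich species keep their input order) are assumptions. Only the two thresholds (minimum data richness of 2, at most 30 species per group) come from the description.

```python
from typing import Dict, List, Optional

MIN_RICHNESS = 2   # from the description: species below this are dropped
MAX_SPECIES = 30   # from the description: cap per functional group

# Hypothetical layout: species name -> {parameter name -> value or None}
SpeciesParams = Dict[str, Dict[str, Optional[float]]]


def data_richness(params: Dict[str, Optional[float]]) -> int:
    """Count how many parameters have a value available (1) vs. not (0)."""
    return sum(1 for value in params.values() if value is not None)


def rank_species(candidates: SpeciesParams) -> List[str]:
    """Rank candidates by descending data richness, then apply both cut-offs."""
    scored = [(name, data_richness(params)) for name, params in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= MIN_RICHNESS]
    # sorted() is stable, so equally rich species keep their input order (assumption).
    kept = sorted(kept, key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept[:MAX_SPECIES]]
```

With three hypothetical candidates whose data richness is 2, 1 and 3, the species with richness 1 is dropped and the remaining two are returned richest first.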

● Later, the rOpenSci API queries the data that are required to derive OSMOSE parameters in FishBase/SeaLifeBase for the species comprising the defined functional groups (except phytoplankton and zooplankton), but also for some additional, related species (i.e., species that belong to a related species, genus or family). This is implemented so as to maximize one’s chances to obtain non-default values for the largest possible number of OSMOSE input parameters. For each functional group, potential additional species are identified by the rOpenSci API and are ranked based on their data richness, similarly to what is described above. Potential additional species whose data richness is smaller than 2 are dropped. Then, some other potential additional species are eventually dropped so as to keep the total (i.e., original plus additional) number of species per functional group to a maximum of 30, for the sake of computational efficiency. In the final list of species making up a functional group, the ranked list of original species precedes the ranked list of additional species.

● Later, the OSMOSE API employs the data stored in the data archives to derive values for OSMOSE parameters. For each functional group, to generate a value for a given OSMOSE parameter, the OSMOSE API is supposed to process the species comprising the functional group in turn, based on their rank. The OSMOSE API first considers the first-ranked species and, if FishBase/SeaLifeBase data are available for this species, then a value is calculated for the OSMOSE parameter. The OSMOSE API then considers the second-ranked species and, if FishBase/SeaLifeBase data are available for this species, then a value is calculated for the OSMOSE parameter. This process continues until the OSMOSE API reaches the last-ranked species of the functional group. If no FishBase/SeaLifeBase data are available for any of the species making up the functional group (i.e., original plus additional), then the OSMOSE parameter under consideration is set to its default value. NA (not available) is the default value of 11 of the OSMOSE parameters for which our web service provides estimates.

● However, Skit conducted some tests and noticed that the OSMOSE API considers only the first-ranked species of the final list of species making up each functional group, leading to the generation of NA’s and other default values for a large number of parameters.
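As a rough illustration of the intended behaviour described in the points above, here is a minimal Python sketch. It is not taken from the OSMOSE API: all function and variable names are hypothetical, duplicates between original and additional species are assumed to be skipped, and the selection rule is read as "the first available value in rank order wins, otherwise the default (e.g. NA, represented here as None) is used".

```python
from typing import Dict, List, Optional


def final_species_list(original: List[str], additional: List[str],
                       max_species: int = 30) -> List[str]:
    """Ranked original species first, then ranked additional species, capped."""
    merged = list(original)
    for name in additional:
        if name not in merged:      # skip duplicates (assumption)
            merged.append(name)
    # Truncating the tail drops the lowest-ranked additional species first.
    return merged[:max_species]


def estimate_parameter(ranked: List[str],
                       values: Dict[str, Optional[float]],
                       default: Optional[float] = None) -> Optional[float]:
    """Walk the ranked list; return the first available value, else the default."""
    for name in ranked:
        value = values.get(name)
        if value is not None:
            return value
    return default
```

For example, with the hypothetical ranked list `["sp_a", "sp_b", "sp_c"]` and values `{"sp_a": None, "sp_b": 4.2, "sp_c": 5.0}`, `estimate_parameter` returns 4.2. An implementation that consulted only the first-ranked species would return the default instead, which is the behaviour suspected in the last point.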

jhpoelen commented 6 years ago

@agruss2 @FIN-JBarile please provide a specific example for me to reproduce this issue.

agruss2 commented 6 years ago

@jhpoelen Here is what I propose you do:

(1) Select the Gulf of Mexico on the first page of the UI.

(2) On page 2 of the UI, deselect all the functional groups except phytoplankton and zooplankton.

(3) Still on page 2 of the UI, create a new focal functional group called "AnchoviesAndSilversides" with the following species: Anchoa mitchilli, Anchoa hepsetus, Menidia beryllina, and Alosa alabamae.

(4) On page 5 of the UI, indicate that the number of time steps per year of the OSMOSE model is 12.

(5) Observe the list of original plus additional species defined for the "AnchoviesAndSilversides" focal functional group by the rOpenSci FishBase API.

(6) Observe how the OSMOSE API behaves when provided with this list of original plus additional species defined for the "AnchoviesAndSilversides" focal functional group.

(7) If it turns out that, indeed, the OSMOSE API derives OSMOSE parameter values while only considering the first-ranked species of the "AnchoviesAndSilversides" focal functional group, then make the necessary changes in the OSMOSE API so that it implements what is described above, i.e., "For each functional group, to generate a value for a given OSMOSE parameter, the OSMOSE API deals with the species comprising the functional group in turn, based on their rank. The OSMOSE API first considers the first-ranked species and, if FishBase/SeaLifeBase data are available for this species, then a value is calculated for the OSMOSE parameter. The OSMOSE API then considers the second-ranked species and, if FishBase/SeaLifeBase data are available for this species, then a value is calculated for the OSMOSE parameter. This process continues until the OSMOSE API reaches the last-ranked species of the functional group. If no FishBase/SeaLifeBase data are available for any of the species making up the functional group (i.e., original plus additional), then the OSMOSE parameter under consideration is set to its default value."

Please let me know if this is the information you were expecting from me, if this is clear enough and/or if you need anything else from @FIN-JBarile and me. Again, many thanks for all your help, this is very much appreciated!

jhpoelen commented 6 years ago

Thanks for providing the steps to reproduce the issue. In step (7), could you please provide one or more examples of currently observed OSMOSE parameter estimates? Also, please provide one or more examples of the corresponding expected OSMOSE parameter estimates.

agruss2 commented 6 years ago

In terms of observed OSMOSE parameter estimates, please have a look at the attached zip file, which I generated for the West Florida Shelf by querying OSMOSE parameter estimates for the Gulf of Mexico and then redefining functional groups so that they match those represented in the published OSMOSE-WFS models: osmose_config.zip. Then, in terms of expected OSMOSE parameter estimates, please let me dig for that; I'll try to come back to you on this asap. Please let me know if you need anything else from me. Many thanks!

jhpoelen commented 6 years ago

Thanks for looking into this. A single combination of expected / observed osmose parameter for a specific group with ranked species should be sufficient for me to reproduce and fix the issue.

agruss2 commented 6 years ago

@jhpoelen The problem is that I cannot provide a single combination of expected / observed OSMOSE parameter for a specific group with ranked species myself, because I have no means on my side of knowing which additional species the rOpenSci FishBase API defines, and I am also unable to tell you the final ranking of a group as decided by the rOpenSci FishBase API. Do you have a means to retrieve that information yourself? If so, please let me know and I'll then be able to define the single combination of expected / observed OSMOSE parameter for a specific group with ranked species that you need. Otherwise, let's ask Skit to retrieve that information for us. Please let me know what you think. Thanks a lot!

jhpoelen commented 6 years ago

I have reviewed the test cases and the code responsible for selecting/calculating OSMOSE parameters. That code selects the first available value from the list of ranked species provided for a group by the UI wizard.

I have found no evidence to support the claim that the OSMOSE API does not behave as expected. So, I cannot reproduce the issue with the information provided.

@FIN-JBarile please provide detailed steps on how to reproduce the issue you shared. Please include specific expected and actual parameter values. Also, please include the list of species in the group in which you found the issue.

FIN-JBarile commented 6 years ago

@jhpoelen @agruss2 I was trying to figure out how the OSMOSE API selected the Linf value = 63 for Lutjanus campechanus. To do that, I generated species for the Gulf of Mexico in the UI, added L. campechanus to a functional group with multiple species, and completed the steps of the UI. Then I checked the config file output "osm_param-species.csv" for the species.lInf values, hoping to find the value for red snapper, which I didn't because it was at the bottom of the list. So I repeated the run with red snapper as the only selected species in the group. That's when I got the value of Linf for that species. With that, and after checking the other values for Linf in the other functional groups, it gave the impression that only the first species in the group is being considered. And because we were also looking at the issue of NA's for a set of species and parameters that Arnaud specified (20180808_OSMOSE_Processed.zip), I thought it might be a possible cause of the NA's, if indeed only the parameters (or absence of parameters) of the first species in a group are considered. At that time, I wasn't able to test the case where the first species doesn't have a Linf value while the succeeding species have Linf entries.

Having discussed with Arnaud the other day though, I had a better understanding of the selection process. I tested again earlier today and looked at the case where the first species has no value for a given parameter. The API indeed gives the next available value, which usually follows the order of listing in the UI. I wasn't able to check if the ranking is based on data richness.
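The difference between the two tests described above can be sketched in Python. This is only an illustration with hypothetical function names: it models the two candidate behaviours and checks whether a given group can tell them apart. Linf = 63 for Lutjanus campechanus is the value quoted above; the other species names and Linf values are made-up placeholders.

```python
from typing import Dict, List, Optional

Values = Dict[str, Optional[float]]


def first_available(ranked: List[str], values: Values) -> Optional[float]:
    """Hypothesis A: return the first available value in rank order."""
    return next((values[n] for n in ranked if values.get(n) is not None), None)


def first_species_only(ranked: List[str], values: Values) -> Optional[float]:
    """Hypothesis B: look at the first-ranked species only."""
    return values.get(ranked[0]) if ranked else None


def distinguishes(ranked: List[str], values: Values) -> bool:
    """True if this group makes the two hypotheses give different results."""
    return first_available(ranked, values) != first_species_only(ranked, values)


# Placeholder group with red snapper listed last.
group = ["species_1", "species_2", "Lutjanus campechanus"]

# First test: the first species has a Linf value (placeholder numbers).
exp1 = {"species_1": 40.0, "species_2": 55.0, "Lutjanus campechanus": 63.0}

# Second test: the first species has no Linf value.
exp2 = {"species_1": None, "species_2": 55.0, "Lutjanus campechanus": 63.0}
```

Here `distinguishes(group, exp1)` is False, because both hypotheses report the first species' value, so the first test could not separate them. `distinguishes(group, exp2)` is True: only the "first available value" behaviour returns the second species' value, which is what the later test observed.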

@agruss2 with this, we still haven't resolved the NA's for the species in your list despite having updated the version of the tables supposedly used by the API.

Thanks.

jhpoelen commented 6 years ago

@FIN-JBarile Thanks for your detailed description. As far as I can tell from your description, the API is working as expected. If so, please close this issue. If not, please provide the ordered species list used with the actual and expected Linf values.

agruss2 commented 6 years ago

First off, @FIN-JBarile, many thanks for having devoted significant time to running tests, and, @jhpoelen, thank you so much for all your help! It indeed turns out that the API is working as expected. I can, therefore, close the present issue. That being said, Skit is right that the issue of NA's for the 17 parameters I identified is not solved and must be solved within the coming three weeks (in addition to solving #172). I am going to follow up on all this in detail this weekend. Meanwhile, thanks a lot to you both once again.