Open atcooper1 opened 5 months ago
Biomass coefficients: The a and b values displayed in the database are the result of the script that generates the SQL code for the update handling numbers as doubles. So again, because the values were handled as doubles in the script, what you see is not exactly what you expected:

| | script | expected |
|---|---|---|
| a | 0.00954992976039648 | 0.00955 |
| b | 3.049999952316284 | 3.05 |
Would you like the values of 'a' and 'b' to be rounded? You might also recall that the values we ingested in 2022 were not the latest available on the Fishbase pages, as the most recent version was not publicly accessible on the website.
Rarity: Unfortunately, since rarity statistics are computed metrics, it is difficult to determine whether they have been affected by the rounding issue, as we have no reference for comparison.
I think rounding would be helpful, especially when copying a's and b's for superseded species.
a's rounded to 5 decimals? Is that consistently the case, though? b's rounded to 2 decimals?
Yes, a's to 5 decimals, b's to 2. Thanks, Bene
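As a rough illustration of the rounding agreed above (a's to 5 decimals, b's to 2), here is a minimal Python sketch. The helper name `round_literal` and the choice of half-up rounding are assumptions made for the example, not taken from the actual update script. It goes through `Decimal` so that the rounded value is emitted as exact decimal text rather than as another double.

```python
from decimal import Decimal, ROUND_HALF_UP

# Precision agreed in the thread: 5 decimals for a, 2 decimals for b.
A_PLACES = Decimal("0.00001")
B_PLACES = Decimal("0.01")


def round_literal(value: float, quantum: Decimal) -> str:
    """Round value to the precision of quantum and return plain decimal text.

    Hypothetical helper; half-up rounding is an assumption, not a thread decision.
    """
    # repr() yields the shortest text that round-trips the double, so Decimal
    # starts from e.g. 3.049999952316284 instead of the full binary expansion.
    rounded = Decimal(repr(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    # The "f" format avoids scientific notation for small coefficients.
    return format(rounded, "f")


a_literal = round_literal(0.00954992976039648, A_PLACES)
b_literal = round_literal(3.049999952316284, B_PLACES)
print(a_literal, b_literal)  # 0.00955 3.05
```

Keeping the result as a string means it can be written into the generated SQL without passing through floating point again.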
From conversation 06/06/2024:
Bene: After examining the Rfishbase package more closely, it appears that an updated version of the database from May 2023 is available. If this update is indeed available (I need to look at the data), and considering that I've planned to re-ingest rounded biomass coefficients in the DB, I assume you would prefer the latest version to be ingested.
Toni: Yes, that would be great if possible, please.
Decision: update the biomass coefficients to the latest Fishbase release and apply the rounding as agreed.
SQL update was applied (ref https://github.com/aodn/nrmn-application/pull/1374). During testing, Toni identified discrepancies between the updated values and the Fishbase website. The values in the update came from the fb_parquet_2023-05 release in the repo https://github.com/cboettig/rfishbase_board/, the same repo used for the last update. This repo was thought to be the source of the Fishbase dataset. However, a web search turned up another Fishbase data source with a more recent release (release 24.07) whose values agree with the FB website, specifically: https://huggingface.co/api/datasets/cboettig/fishbase/tree/main/data/fb/v24.07/parquet
The update script will be regenerated.
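To catch this kind of mismatch before an update is applied, a comparison along the following lines could be run against whichever release is treated as the reference. This is only a sketch: the function name, the dictionary layout, and the sample species keys and values are all made up for illustration, and loading the actual parquet files is left out.

```python
from decimal import Decimal


def find_discrepancies(ingested, reference):
    """Return {species: {coefficient: (ingested, reference)}} for each mismatch.

    Both arguments map a species name to {"a": "...", "b": "..."} with values
    given as decimal strings, so "3.05" and "3.050" compare as equal.
    """
    mismatches = {}
    for species, coeffs in ingested.items():
        ref = reference.get(species)
        if ref is None:
            # Species absent from the reference release: report every coefficient.
            mismatches[species] = {key: (value, None) for key, value in coeffs.items()}
            continue
        diff = {
            key: (value, ref.get(key))
            for key, value in coeffs.items()
            if ref.get(key) is None or Decimal(value) != Decimal(ref[key])
        }
        if diff:
            mismatches[species] = diff
    return mismatches


# Illustrative placeholder data, not real Fishbase values.
ingested = {
    "species_x": {"a": "0.00955", "b": "3.05"},
    "species_y": {"a": "0.01200", "b": "2.98"},
}
reference = {
    "species_x": {"a": "0.00955", "b": "3.050"},
    "species_y": {"a": "0.01150", "b": "2.98"},
}
result = find_discrepancies(ingested, reference)
print(result)  # {'species_y': {'a': ('0.01200', '0.01150')}}
```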
The a and b values, plus trait values, may be affected by the same rounding/double-precision issue that is currently affecting lat/long, e.g. Zoramia leptacanthus. Some a and b values also don't seem to reflect what is on Fishbase; it might be worth checking the FB file that was ingested in 2022?
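One way to screen for values hit by the double-precision issue is to flag any stored number that carries more decimals than its field is expected to have. The sketch below assumes the limits discussed in this thread (5 decimals for a, 2 for b). The function name is hypothetical, and limits for trait values or lat/long would need to be agreed separately.

```python
import math
from decimal import Decimal


def exceeds_precision(value: float, max_decimals: int) -> bool:
    """True if the shortest decimal form of value has more than max_decimals decimals."""
    if not math.isfinite(value):
        # NaN and infinities have no decimal places to count.
        return False
    # normalize() drops trailing zeros, so 3.0 counts as zero decimals.
    exponent = Decimal(repr(value)).normalize().as_tuple().exponent
    return -exponent > max_decimals


# The two values from the table in this issue, plus a correctly rounded one.
checks = {
    "a (script)": (0.00954992976039648, 5),
    "b (script)": (3.049999952316284, 2),
    "b (expected)": (3.05, 2),
}
flagged = [name for name, (value, limit) in checks.items() if exceeds_precision(value, limit)]
print(flagged)  # ['a (script)', 'b (script)']
```

A flagged value only shows that rounding is needed. It does not say whether the rounded value matches Fishbase, which still needs a check against the source file.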