osmose-model / osmose-web-api

Web service that generates Osmose configuration files from data sources like Fishbase and SeaLifeBase. Used by https://www.config.osmose-model.org .
MIT License

add the schema documentation to ropensci/fishbaseapi #27

Closed jhpoelen closed 8 years ago

jhpoelen commented 8 years ago

https://github.com/ropensci/fishbaseapi/

jhpoelen commented 8 years ago

@sckott would you be interested in getting a description of what all the fishbase table columns mean? It seems that some fishbase members are willing to share this openly. I was hoping we can put it in a location where other fishbaseapi users can also see it. If this information is already available, please do let me know, so that we can avoid double work.

sckott commented 8 years ago

We don't have that information yet. I think we've asked for it. Do you know who can share it?

Once we have it, we already have an API route for that information, http://ropensci.github.io/fishbaseapidocs/#docs-by-table - but it only lists field names for now

the data for the /docs/<table> route comes from https://github.com/ropensci/fishbaseapi/tree/master/docs/docs-sources

jhpoelen commented 8 years ago

@agruss2 has the information in the form of an Excel sheet that contains a schema query from the Microsoft Access fishbase database. This Excel sheet was provided by @FIN-JBarile through a shared Dropbox folder. I have attached the most recent version of that Excel sheet to this issue.

@sckott what do you think is the most efficient way to merge this into https://github.com/ropensci/fishbaseapi/tree/master/docs/docs-sources ?

FishBase_DataDictionary.xls.zip

jhpoelen commented 8 years ago

@sckott a quick glance at this data dictionary tells me that only a few tables are included, specific to this project . . . perhaps a good first step towards documenting all of the tables?

sckott commented 8 years ago

@jhpoelen thanks. right, only a few tables. i can just merge them together, hopefully the column names are the same. and I'll get it up on the public API soon

sckott commented 8 years ago

those routes are updated for the sheets available in that Excel file

sckott commented 8 years ago

e.g.,

curl -v 'https://fishbase.ropensci.org/docs/popchar/' |  jq .data\[].description
#> "Record number; internal auto counter\n"
#> "Code number for internal use"
#> "Code number for internal use"
#> "Reference from which info on maximum size;  weight or age was entered."
#> "Sex of the fishes that the data in this record refer to"
#> "Original publication containing info on maximum size;  weight or age"
#> "Weight of heaviest individual recorded from a stock"
#> "Type of wet weight that Wmax is referring to"
#> "Length of longest individual recorded from a stock"
#> "Type of length measurement used for Lmax. Width is used for rays."
#> "Age of oldest fish reported from a stock."
#> "Area from where the specimen was collected"
#> "Three-digit UN numerical country or area code"
#> "Desription of length or weight measurement if nit in choice list"
#> "Fill-in if max weight and max length refer to the same fish"
#> "Fill-in if max length and max age refer to the same fish"
#> "Code number of person who entered the data"
#> "Date when the record was first entered"
#> "Code number of person who modified the data"
#> "Date when the record was modified"
#> "Code number of person who checked the data"
#> "Date when the record was checked by an expert"
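
For scripted use, the same extraction as the `jq .data\[].description` filter can be sketched in Python. This is a minimal sketch that assumes only what the command above shows: the response body is JSON with a top-level `data` array whose entries carry a `description` field. The function name `extract_descriptions` and the sample payload are illustrative and not part of fishbaseapi.

```python
import json


def extract_descriptions(body: str) -> list:
    """Return the description of every entry under "data",
    mirroring the jq filter .data[].description."""
    payload = json.loads(body)
    # .get() yields None for a missing field, much as jq yields null
    return [entry.get("description") for entry in payload["data"]]


# Hypothetical sample shaped like the response above; the two
# descriptions are copied from the popchar output.
sample = json.dumps({"data": [
    {"description": "Code number for internal use"},
    {"description": "Age of oldest fish reported from a stock."},
]})

for text in extract_descriptions(sample):
    print(text)
```

The body is passed in as a string so the parsing step can be tried without a network call; any HTTP client can supply it.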

jhpoelen commented 8 years ago

wow! that was fast. I was able to reproduce your example. When I tried another table (popgrowth), I noticed a server error (see below). Is this expected?

$ curl -v 'https://fishbase.ropensci.org/docs/popgrowth/' |  jq .data\[].description
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 52.32.222.224...
* Connected to fishbase.ropensci.org (52.32.222.224) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate: fishbase.ropensci.org
* Server certificate: Let's Encrypt Authority X1
* Server certificate: DST Root CA X3
> GET /docs/popgrowth/ HTTP/1.1
> Host: fishbase.ropensci.org
> User-Agent: curl/7.43.0
> Accept: */*
> 
< HTTP/1.1 500 Internal Server Error
< Content-Length: 4730
< Content-Type: text/plain
< Date: Wed, 16 Mar 2016 01:16:45 GMT
< Server: Caddy
< Status: 500 Internal Server Error
< 
{ [3923 bytes data]
100  4730  100  4730    0     0  11409      0 --:--:-- --:--:-- --:--:-- 11650
* Connection #0 to host fishbase.ropensci.org left intact
parse error: Invalid numeric literal at line 1, column 14
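
The `parse error` on the last line is reported by jq rather than by the server: the 500 response has `Content-Type: text/plain`, so jq was handed a body that is not JSON. A client can tell the two failures apart by checking the status code before parsing. The following is a minimal sketch under that reading of the transcript; the function name and error messages are hypothetical, and it takes the status and body as arguments instead of making a request.

```python
import json


def parse_docs_body(status: int, body: str) -> list:
    """Return the descriptions from a /docs/<table> response body,
    failing with a readable message on server errors or non-JSON bodies."""
    if status != 200:
        # Report the server error directly instead of trying to parse it
        raise RuntimeError(f"server returned HTTP {status}: {body[:80]!r}")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as err:
        raise RuntimeError(f"response is not JSON: {err}") from err
    return [entry.get("description") for entry in payload["data"]]
```

With this check in place, a response like the one above surfaces as an HTTP 500 error rather than as a JSON parse failure.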

sckott commented 8 years ago

thanks, i'll look

sckott commented 8 years ago

think it's fixed

jhpoelen commented 8 years ago

@sckott thanks, looks great! Hopefully, we can get to more complete documentation of the tables as the information is released . . . for now, I am closing this issue.

curl -v 'https://fishbase.ropensci.org/docs/popgrowth/' |  jq .data\[].description
"Record number; internal auto counter"
"Code number for internal use"
"Code number for internal use"
"Code number of ecosystem for internal use."
"Reference from which growth parameters were extracted"
"Sex of fish to which data in this record refer"
"Type of data on which the estimates are based"
"Reference containing raw data used for estimation of parameters"
"Asymptotic length;  a parameter of the von Bertalanffy growth equation"
"Number of specimens in the growth study."
"Coefficient of determination for the growth study."
"Standard error of Loo"
"Standard deviation of Loo"
"95% lower confidence limit of Loo"
"95% upper confidence limit of Loo"
"Assumed distribution of Loo (normal;  log-normal)"
"Observed or converted total length;  except for scombroids where fork length is used."
"Rate at which the asymptotic size is approached"
"Standard error of K"
"Standard deviation of K"
"95% lower confidence limit of K"
"95% upper confidence limit of K"
"Assumed distribution of K (normal;  log-normal)"
"Arbitrary parameter to position a growth curve relative to origin of coordinate system"
"Standard error of to"
"Standard deviation of to"
"95% lower confidence limit of to"
"95% upper confidence limit of to"
"Length measurement;  e.g. TL (total length)."
"Method used to estimate Linfinity;  K and  to"
"Asymptotic weight computed using a L/W relationship or as given in ref. for growth."
"Is the L inf. more than 30% below or above the maximum length stated in the SPECIES table? Note that length type was not considered."
"Do Linf. & K fall within the 95% confidence interval defined by the ellipse?"
"Gross check on growth estimates using ratio of log K to log Loo."
"Source of Winfinity"
"Exponent of L/W relationship used in computation of Winfinity."
"Amplitude of seasonal growth oscillation."
"Maximum age reported for a specimen of this population;  in years."
"Reference used for tmax."
"Mean or median age at first maturity."
"Component of mortality not caused by the fishery (units;  1/year)"
"Method used to estimate natural mortality"
"Is the estimate of natural mortality doubtful?"
"Reference containing estimate of natural mortality (M)"
"Number of specimens in the mortality study."
"Coefficient of determination for the mortality study."
"Standard error of M"
"Standard deviation of M"
"95% lower confidence limit of M"
"95% upper confidence limit of M"
"Assumed distribution of M (normal;  log-normal)"
"Mean length at first maturity"
"The reproductive load ;  the fraction Lm/Linfinity"
"Sex of fish  to which the maturity data refer."
"Type of length measurement"
"Reference containing data on mean length at maturity (Lm)"
"Mean length at first maturity for males"
"The reproductive load for males"
"Mean length at first maturity for females"
"The reproductive load for females"
"Locality where the data were collected"
"Start of sampling period. Put publication year;  if sampling year is not given."
"End of sampling period."
"Whether year pertains to sampling year or publication year."
"3-digit UN numerical country or area code"
"Distinguishes growth in nature from growth in captivity."
"Mean annual water temperature for the indicated locality in C"
"Difference between summer and winter temperature for the locality."
"Reference providing temperature data for the locality."
"Intrinsic rate of population increase"
"Remarks on the quality of data"
"NA"
"Code number of person who entered the data"
"Date when the record was first entered"
"Code number of person who modified the data"
"Date when the record was modified"
"Code number of person who checked the data"
"Date when the record was checked by an expert"

sckott commented 8 years ago

great