iobis / pyobis

OBIS Python client
https://iobis.github.io/pyobis
MIT License

notebook about sampling bias? #107

Open 7yl4r opened 1 year ago

7yl4r commented 1 year ago

A notebook that illustrates the concept of sampling bias could be a helpful addition here. It would present the issues and leave solving them to be handled elsewhere.

Things that could be shown:

ayushanand18 commented 1 year ago

This analysis is going to be really interesting and very helpful for the community.

Spatial and taxonomic coverage bias is very prevalent, although it's hard to find specific taxa with prominent bias, since those are often the less-studied ones. Yet while reading a research paper, I found that Pseudodiaptomus pelagicus offers some interesting insights. The paper, titled "Effects of temperature on reproduction and survival of the calanoid copepod Pseudodiaptomus pelagicus", suggests that the best temperature for their survival and growth is 26-30 °C, while the SST distribution in OBIS data shows the majority of records between 20 and 25 °C. I believe this anomaly might be due to sampling bias.
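
A rough sketch of how that comparison could be checked: compute the share of records whose SST falls in each temperature range. The helper names `sst_share` and `fetch_sst` are made up for illustration. The fetch step assumes a recent pyobis release in which `occurrences.search(...).execute()` returns a pandas DataFrame, and that the returned records carry an `sst` column; neither is confirmed in this thread.

```python
def sst_share(sst_values, low, high):
    """Fraction of non-missing SST values (deg C) with low <= sst <= high."""
    valid = [v for v in sst_values if v is not None]
    if not valid:
        return 0.0
    return sum(low <= v <= high for v in valid) / len(valid)


def fetch_sst(scientific_name):
    """Hypothetical helper: pull SST values for one species from OBIS.

    Assumes pyobis is installed, that execute() returns a DataFrame,
    and that an `sst` column is present. Needs network access.
    """
    from pyobis import occurrences  # imported lazily; only needed for live data

    df = occurrences.search(scientificname=scientific_name).execute()
    return df["sst"].dropna().tolist()


if __name__ == "__main__":
    # Toy values standing in for real records, not actual OBIS data.
    sample = [21.0, 23.5, 24.0, 27.0, None]
    print("share in 20-25 C:", sst_share(sample, 20, 25))
    print("share in 26-30 C:", sst_share(sample, 26, 30))
```

With real data, `sst_share(fetch_sst("Pseudodiaptomus pelagicus"), 26, 30)` would give the share of records inside the range reported in the paper. A low share alone would not prove sampling bias, only flag the mismatch for a closer look.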

7yl4r commented 1 year ago

I like your approach of finding well-studied taxonomic patterns from literature, attempting to reproduce the analysis with OBIS data, and seeing how they differ.

ayushanand18 commented 1 year ago

One more interesting species is Ocythoe tuberculata from the monotypic family Ocythoidae, an exotic pelagic octopod family. However, most of its non-museum records in OBIS are found very close to the coast of Australia.

Ref:

ayushanand18 commented 1 year ago

One more pelagic species I can think of is the common dolphin, Delphinus delphis. But looking at the occurrence records off the west coast of the US, they appear much closer to the shore than to the open ocean.
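
The "closer to shore than expected" impression for pelagic species could be put into numbers with something like the sketch below. It assumes each occurrence record carries a shore distance in metres (OBIS records appear to expose this as `shoredistance`, but treat that field name and its unit as assumptions). The function name `shore_summary` and the 50 km nearshore cutoff are arbitrary choices for illustration.

```python
from statistics import median


def shore_summary(distances_m, nearshore_km=50):
    """Summarise how close a set of records sits to the shore.

    distances_m: shore distances in metres (None = missing).
    nearshore_km: illustrative cutoff, not an established standard.
    """
    valid = [d for d in distances_m if d is not None]
    if not valid:
        return None
    threshold_m = nearshore_km * 1000
    return {
        "n": len(valid),
        "median_km": median(valid) / 1000,
        "nearshore_share": sum(d <= threshold_m for d in valid) / len(valid),
    }


if __name__ == "__main__":
    # Toy distances, not real OBIS data.
    print(shore_summary([1000, 5000, 20000, 200000]))
```

A high `nearshore_share` for a species known to be pelagic would be consistent with sampling effort concentrated near the coast, though it could also reflect real habitat use.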

7yl4r commented 1 year ago

@ayushanand18: Over on the obisindicators project we have been chatting with folks working on the "America the Beautiful" effort. obisindicators is of interest, but it is critically important to include details about the temporal, spatial, and taxonomic sampling biases in this region.

A notebook that gets at visualization of these biases for the US would be very useful for this.

MathewBiddle commented 1 year ago

If this notebook is built, can we put it in the IOOS Code Lab? I'd like it to be represented well over there.

7yl4r commented 1 year ago

I have decided that the topic of analyzing sampling bias is too large to deal with in a single notebook, and it will likely require a lot of libraries that we don't want to depend on here. I have created a new repo to house the notebooks addressing this issue: https://github.com/marinebon/dwc-bias-analysis

In this repo we will import pyobis, pygbif, py-dwc-viz, and others to create sampling bias reports for a spatio-temporal-taxonomic region of interest, starting with US waters as relevant to the America the Beautiful initiative noted above.
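
As a starting point for such a report, record counts can be tallied along the three axes (time, space, taxonomy). This is only a minimal sketch and not the design of the new repo. It assumes records arrive as dicts using Darwin Core style keys (`year`, `decimalLatitude`, `decimalLongitude`, `class`); the function name `coverage_counts` and the 1-degree grid are illustrative choices.

```python
import math
from collections import Counter


def coverage_counts(records, cell_deg=1.0):
    """Count records per year, per grid cell, and per taxonomic class.

    records: iterable of dicts with Darwin Core style keys (assumed).
    cell_deg: grid cell size in degrees; cells are keyed by their
              south-west corner (lat, lon).
    """
    by_year, by_cell, by_taxon = Counter(), Counter(), Counter()
    for rec in records:
        by_year[rec.get("year")] += 1
        by_taxon[rec.get("class")] += 1
        lat, lon = rec.get("decimalLatitude"), rec.get("decimalLongitude")
        if lat is not None and lon is not None:
            cell = (
                math.floor(lat / cell_deg) * cell_deg,
                math.floor(lon / cell_deg) * cell_deg,
            )
            by_cell[cell] += 1
    return by_year, by_cell, by_taxon


if __name__ == "__main__":
    # Toy records, not real occurrence data.
    demo = [
        {"year": 2001, "decimalLatitude": 27.3, "decimalLongitude": -82.6, "class": "Mammalia"},
        {"year": 2001, "decimalLatitude": 27.9, "decimalLongitude": -82.1, "class": "Mammalia"},
        {"year": 2015, "decimalLatitude": 40.2, "decimalLongitude": -70.5, "class": "Cephalopoda"},
    ]
    years, cells, taxa = coverage_counts(demo)
    print(years.most_common(), cells.most_common(), taxa.most_common())
```

Strongly uneven counts along any axis (a few years, cells, or classes holding most of the records) would be the first thing to surface in a bias report for a region.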