pacificclimate / scip-backend

Backend that serves salmon population data to the scip webpage
GNU Affero General Public License v3.0
0 stars 0 forks source link

Allow user to specify a species on the region query #6

Closed corviday closed 10 months ago

corviday commented 11 months ago

In order to facilitate users being able to discover areas of interest to them if they don't already have a specific location in mind and know its official name, we would like to be able to provide a species to the region query and see only regions that contain that species. I'm not sure if this should be a species, a subgroup, or both.

Naively, this might involve merging all the conservation units for a species into a single area and returning the regions that overlap with it? But postGIS might offer an easier way.

rod-glover commented 11 months ago

Looking at the db, it looks as if you could do it quite simply by adding a many:many relationship between table Region and table Taxon (or maybe Population, or other table expressing the notion of subgroup). Implemented by a table RegionTaxon with suitable entries (which you'd have to supply). Then your region query is just needs to filter by the appropriate taxon.

corviday commented 11 months ago

At present, population is the canonical relationship between a taxon and a location; each population is a relationship between a conservation_unit (which is a spatial geometry) and a taxon. I don't want to add a formal relationship between region and taxon, as that would just be duplicating the information in population and you'd run into the usual issues of maintaining a non-normalized database.

At present, if the user selects a region to view, we look at what conservation_units overlap that region, and then what populations are in that conservation unit, and which taxons each population is, and report that to the user, which is reasonably straightforward. But running the whole thing the other way - starting with a taxon and getting back to region is trickier, or at least seems so to me.

rod-glover commented 11 months ago

My error, due to not understanding enough. (Aside: This suggests it would be good to add docstrings to each table class, describing these things.)

Given that, I think in outline it is:

species name -> taxon -> population -> conservation unit -> region

with a query something like

select distinct(region.*)
from 
  region 
  join conservation_unit cu on ST_overlap(region.boundary, cu.boundary)
  join population pop on (cu.id = pop.conservation_unit_id) 
  join taxon on (taxon.id = pop.taxon_id)
where
  taxon.common_name = <species name>

You might have to make the that select a CTE (yielding region id's) and then join it to region to get the full set of region descriptors. But that's easy.

Also note that ST_overlap is just my own placeholder for whatever the PostGIS function ought to be. But it's the idea.

Ref. on spatial joins in PostGIS.

rod-glover commented 11 months ago

And of course render that as SQLAlchemy.

corviday commented 11 months ago

I'm not sure if this should be a species, a subgroup, or both.

Markus has indicated this should be a subgroup.