EnvironmentOntology / gaz

An open source gazetteer constructed on ontological principles

Marine Regions Gazetteer #29

Open kaiiam opened 5 years ago

kaiiam commented 5 years ago

Question migrated over from #23

Perhaps a lower-priority issue for now, but do you think there could be any scope to link GAZ to the Marine Regions Gazetteer? It's a pretty extensive resource which has term hierarchies for Large Marine Ecosystems of the World, Longhurst Provinces, etc. This could be quite a useful resource for the Oceans and Seas module.

cmungall commented 5 years ago

This looks like a pretty phenomenal resource, but it is very hard to automate anything in its current form. It looks like a catalog of HTML links to other resources, each with its own format and download URL, some requiring you to enter an email address, etc. I wonder if anyone else has tried mapping these to RDF, or even Wikidata. If it were all available in a single structured download it would be much easier.

kaiiam commented 5 years ago

@cmungall agreed, it would be a lot of work to do manually. They do have a webservices page with RESTful and SOAP services. Perhaps that could help to automate it.

cmungall commented 5 years ago

OK, we could get started with mapping their gazetteer types to ENVO:

curl -L -s http://marineregions.org/rest/getGazetteerTypes.json/ | jq -r '.[] | [.type, .description] | @tsv'

showing a subset here:

type	description
Town	The lowest administrative unit in Belgium and the Netherlands.
Arrondissement	The administrative unit in Belgium that lies between 'Gemeente' and Province.
Department	A high-level administrative unit that is used in a lot of countries.
Province (administrative)	A medium-level administrative unit that is used in a lot of countries.
Country	A high-level administrative unit that only has been used for the four divisions of the United Kingdom.
Continent	The highest subdivision of the world. In VLIMAR, their are 7 continents distinguished: Africa, Antarctica, Asia, Europe, North America, Oceania, South America.
Region	A high-level administrative unit that is used in a lot of countries.
Ward	The lowest administrative unit in the United Kingdom.
Commune	The lowest administrative unit in a lot of countries.
District	A medium-level administrative unit that is used in a lot of countries.
Canton	A medium-level administrative unit that is used in France.
Sub-Province	A medium-level administrative unit that is used in the Netherlands.
Nation	The highest administrative unit.
County	A medium-level administrative unit that is used in a lot of countries.
Unitary Authority	A medium-level administrative unit that is used in a lot of countries.
Borough	A medium-level administrative unit that is used in a lot of countries.
World	The world
Ocean	Very large sea between different continents.
Sea	A large surface salt water that covers a large part of the world
Dependent State	A state that is dependent on another nation.
Island	A land mass that is totally enclosed by water and that doesn't form a continent.
Former Nation	A nation that doesn't exist anymore.
Gulf	A large bay.
Basin	A depression, in the sea floor, more or less equidimensional in plan and of variable extent.
Current	A moving watermass.
Water mass	A large amount of water.
Strait	A narrow, natural connection between seas.
Sandbank	Shallow accumulation of sand that rises never above sealevel but alltough forms a substantial hazard for navigation.
Ridge	(a) An elongated narrow elevation of varying complexity having steep sides. (b) An elongated narrow elevation, often separating ocean BASINS. (c) The linked major mid-oceanic mountain systems of global extent. Also called MIDOCEANIC RIDGE.
Channel	A narrow, natural connection between seas.
Front	The dividing line between different water masses.
Bight	Round indentation of the sea in the continent.
Field
Ground
Sandbank System	The aggregate of adjoining sandbanks and swales.
Deep	A small depression in the seafloor.
Plateau	A flat or nearly flat elevation of considerable areal extent, dropping off abruptly on one or more sides.
Bay	Round indentation of the sea in the continent.
Island Group	Group of nearby lying islands that form a geographical entity.
Archipelago	Group of nearby lying islands that form a geographical entity.
Deelgemeente	The subdivision of a 'Gemeente' in Belgium and the Netherlands.
Lake	A basin filled with water that is totally enclosed by water.
River	A natural water current that always flows out in another river or stream but never in a sea.
Lagoon	A small lake that is separated from the sea by a long and narrow tongue of land.
Fjord	Narrow, deep inlet in a mountainous coast that came into being during the ice age.
General Region	Area with a specific name that can't be classified otherwise.
Stream	A river that flows out in the sea.
Estuary	River mouth under tidal influence.
Swale	Depression between two sandbanks.
Canal	Man-made waterway.
Fracture Zone	An extensive linear zone of irregular topography, mountainous or faulted, characterized by steepsided or asymmetrical ridges, clefts, troughs or escarpments.
Sound	A narrow, natural connection between seas.
Seamount(s)	A discrete (or group of) large isolated elevation(s), greater than 1,000m in relief above the sea floor, characteristically of conical form.
Canyon(s)	A relatively narrow, deep depression (or group of depressions) with steep sides, the bottom of which generally deepens continuously, developed characteristically on some continental slopes.
Trough	A long depression of the sea floor characteristically flat bottomed and steep sided and normally shallower than a trench.
Spur	A subordinate elevation or ridge protruding from a larger feature, such as a plateau or island foundation.
Slope	The deepening sea floor out from the shelfedge to the upper limit of the continental rise, or the point where there is a general decrease in steepness.
Hill(s)	An isolated (or group of) elevation(s), smaller than a SEAMOUNT.
Plain	An extensive, flat, gently sloping or nearly level undersea region.
Caldera	A collapsed or partially-collapsed seamount, commonly of annular shape.
Guyot	A seamount having a comparatively smooth flat top. Also called tablemount.
Abyssal Plain	An extensive, flat, gently sloping or nearly level region at abyssal depths.
Coast	Part of the land that is adjacent to the sea.
EEZ	In international maritime law, an exclusive economic zone (EEZ) is a seazone extending from a state's coast over which the state has special rights over the exploration and use of marine resources. Generally a state's EEZ extends 200 nautical miles out from its coast, except where resulting points would be closer to another country.
Inhabited Place	Place with permanent or temporary habitation.
Inlet	Small indentation of the sea in the continent.
Hole	A small local depression, often steep sided, in the sea floor.
General Sea Area	Sea area with a specific name that can't be classified otherwise.
Delta	Branching river mouth.
State	A high-level administrative unit that is used in a lot of countries.
Bank	An elevation of the sea floor, over which the depth of water is relatively shallow, but sufficient for safe surface navigation.
Seachannel	A continuously sloping elongated discrete depression found in fans or abyssal plains and customarily bordered by levees on one or both sides.
Escarpment	An elongated, characteristically linear, steep slope separating horizontal or gently sloping sectors of the sea floor in non-shelf areas. Also abbreviated to scarp.
Valley	A relatively shallow, wide depression, the bottom of which usually has a continuous gradient. This term is generally not used for features that have canyon-like characteristics for a significant portion of their extent. Also called SUBMARINE VALLEY or SEA VALLEY.
Knoll(s)	An elevation somewhat smaller than a SEAMOUNT and of rounded profile, characteristically isolated or as a cluster on the sea floor.
Fan	A relatively smooth, fan-like, depositional feature normally sloping away from the outer termination of a canyon or canyon system. Also called cone.
Seamount Chain	A linear or arcuate alignment of discrete seamounts, with their bases clearly separated.
Reef	A mass of rock or other indurated material lying at or near the sea surface that may constitute a hazard to surface navigation.
Rise	(a) A broad elevation that rises gently and generally smoothly from the sea floor. (b) The linked major mid-oceanic mountain systems of global extent. Also called midoceanic ridge.
Trench	A long narrow, characteristically very deep and asymmetrical depression of the sea floor, with relatively steep sides.
Continental Shelf	A zone adjacent to a continent (or around an island) and extending from the low water line to a depth at which there is usually a marked increase of slope towards oceanic depths.
Oil Field	Sea area where oil is drilled.
Gas Field	Sea area where gas is drilled.
Autonomous Region	Part of a country that has certain autonomy.
Apron	A gently dipping surface, underlain primarily by sediment, at the base of any steeper slope. ACUF defines it as 'a gentle slope with a generally smooth surface of the sea floor, characteristically found around groups of islands or seamounts.'
Borderland	A region adjacent to a continent, normally occupied by or bordering a shelf and sometimes emerging as islands, that is irregular or blocky in plan or profile, with depths well in excess of those typical of a shelf.
Continental Margin	The zone, generally consisting of shelf, slope and continental rise, separating the continent from the deep sea floor or abyssal plain. Occasionally a trench may be present in place of a continental rise.
Levee	A depositional natural embankment bordering a canyon, valley or seachannel on the ocean floor.
Moat	An annular depression that may not be continuous, located at the base of many SEAMOUNTS, oceanic islands and other isolated elevations.
Passage	A narrow break in a RIDGE or a RISE. Also called GAP.
Peak	A prominent elevation either pointed or of a very limited extent across the summit.
Pinnacle	Any high tower or spire-shaped pillar of rock, or coral, alone or cresting a summit.
Promontory	A major SPUR-like protrusion of the continental SLOPE extending to the deep seafloor. Characteristically, the crest deepens seaward.
Saddle	A broad pass or col, resembling in shape a riding saddle, in a RIDGE or between contiguous elevations.
Shelf Edge	The line along which there is marked increase of slope at the seaward margin of a CONTINENTAL (or island) SHELF. Also called SHELF BREAK.
Shoal	An offshore hazard to surface navigation with substantially less clearance than the surrounding area and composed of unconsolidated material.
Sill	A sea floor barrier of relatively shallow depth restricting water movement between BASINS.
Terrace	A relatively flat horizontal or gently inclined surface, sometimes long and narrow, which is bounded by a steeper ascending slope on one side and by a steeper descending slope on the opposite side.
Dike
Mud Flat
Salt Marsh
Wad
Polder
Harbour
Dock
Sluice
Continental Slope
Submarine lava tube
Cave
Natural Reserve
Marine Park
Cape
Sampling Station	Standard sampling sation
Peninsula
Flat	Nonspeciefic 'flat' marine area.
Cliffs	a significant vertical, or near vertical, rock exposure
Dunes
Former administrative division
Reservoir
National Park
National District
Mountain range	a series of mountains
Land basin
Gap
Seamount Province
Tablemount
Zone
Shelf	The flat or gently sloping region adjacent to a continent or around an island that extends from the low water line to a depth, generally about 200m, where there is a marked increase in downward slope.
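For anyone without jq at hand, the type listing could also be produced in Python. This is only a minimal sketch: it assumes the service returns a JSON array of objects with `type` and `description` keys (as the jq filter in the command implies), and the helper names `fetch_types` and `types_to_tsv` are made up for illustration. The sample data is inline so the example runs offline; that an empty description arrives as `null` is an assumption.

```python
import json
from urllib.request import urlopen

# URL taken from the curl command in this thread
TYPES_URL = "http://marineregions.org/rest/getGazetteerTypes.json/"


def fetch_types(url=TYPES_URL):
    """Download the list of gazetteer types (needs network access)."""
    with urlopen(url) as response:
        return json.load(response)


def types_to_tsv(types):
    """Render each record as 'type<TAB>description', roughly like jq's @tsv."""
    lines = []
    for record in types:
        # Missing or null values become an empty field
        fields = [record.get("type") or "", record.get("description") or ""]
        # Tabs and newlines inside a field would break the TSV layout
        cleaned = [f.replace("\t", " ").replace("\n", " ") for f in fields]
        lines.append("\t".join(cleaned))
    return "\n".join(lines)


# Offline sample shaped like the assumed response (values copied from the listing)
sample = json.loads(
    '[{"type": "Gulf", "description": "A large bay."},'
    ' {"type": "Field", "description": null}]'
)
print(types_to_tsv(sample))
```

Calling `types_to_tsv(fetch_types())` would run the same conversion against the live service.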

cmungall commented 5 years ago

For any of the types you can query:

http://marineregions.org/rest/getGazetteerRecordsByType.json/Archipelago/0

gives records like:


  {
    "MRGID": 2464,
    "gazetteerSource": "ASFA thesaurus",
    "placeType": "Archipelago",
    "latitude": 49.44915,
    "longitude": -2.349499,
    "minLatitude": 49.1649,
    "minLongitude": -2.679199,
    "maxLatitude": 49.7334,
    "maxLongitude": -2.019799,
    "precision": 39597.16,
    "preferredGazetteerName": "Channel Islands",
    "preferredGazetteerNameLang": "English",
    "status": "standard",
    "accepted": 2464
  }
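A rough Python sketch of how such records might be handled follows. The field names come from the record shown here; the helper names `records_url` and `summarize` are hypothetical, and reading the trailing `0` in the URL as a paging offset is an assumption that should be checked against the Marine Regions web service documentation.

```python
from urllib.parse import quote

# Base of the URL quoted in this comment
BASE = "http://marineregions.org/rest/getGazetteerRecordsByType.json"


def records_url(place_type, offset=0):
    """Build the query URL for one gazetteer type.

    The last path segment is assumed to be a paging offset.
    """
    # Percent-encode the type so names such as "Island Group" stay valid in a URL
    return f"{BASE}/{quote(place_type, safe='')}/{offset}"


def summarize(record):
    """Keep only the fields most useful for matching against GAZ."""
    return {
        "mrgid": record["MRGID"],
        "name": record["preferredGazetteerName"],
        "place_type": record["placeType"],
        # Bounding box as (min lon, min lat, max lon, max lat)
        "bbox": (
            record["minLongitude"],
            record["minLatitude"],
            record["maxLongitude"],
            record["maxLatitude"],
        ),
    }


# The Channel Islands record from this comment, trimmed to the fields used
record = {
    "MRGID": 2464,
    "placeType": "Archipelago",
    "minLatitude": 49.1649,
    "minLongitude": -2.679199,
    "maxLatitude": 49.7334,
    "maxLongitude": -2.019799,
    "preferredGazetteerName": "Channel Islands",
}

print(records_url("Archipelago"))
print(summarize(record))
```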

kaiiam commented 5 years ago

@cmungall that sounds like an interesting way to proceed. I'm not sure to what extent we have ENVO terms to map to such marineregions terms. For example, I wasn't able to find OBO equivalents for some of their higher-level categories such as Longhurst Province, so I suspect we would have to make new ENVO or GAZ terms.
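One way to get a first impression of the gap is an exact label comparison between the Marine Regions type names and a list of ontology labels. The sketch below is only illustrative: `placeholder_labels` is a hypothetical stand-in that has not been checked against ENVO or GAZ, and the function names are made up. Exact string matching will also miss synonyms and near matches, so anything reported as unmatched is a candidate for manual review, not proof that a new term is needed.

```python
import re


def normalize(label):
    """Lower-case, drop '(s)' plural markers and collapse whitespace."""
    label = re.sub(r"\(s\)", "", label.lower())
    return re.sub(r"\s+", " ", label).strip()


def split_by_match(type_names, ontology_labels):
    """Split type names into those with and without an exact label match."""
    known = {normalize(label) for label in ontology_labels}
    matched = [t for t in type_names if normalize(t) in known]
    unmatched = [t for t in type_names if normalize(t) not in known]
    return matched, unmatched


# Type names taken from the Marine Regions listing in this thread
types = ["Seamount(s)", "Archipelago", "Sandbank System"]
# Hypothetical labels for illustration only; not verified against ENVO or GAZ
placeholder_labels = ["seamount", "archipelago"]

matched, unmatched = split_by_match(types, placeholder_labels)
print("matched:", matched)
print("needs review:", unmatched)
```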

LennertSchepers commented 3 years ago

Hi all, we are from the MarineRegions team and are currently working on providing our MarineRegions gazetteer as linked open data. We can try to create a mapping to GAZ, but if there has already been an attempt, perhaps we could collaborate? @brittlnv @marc-portier

kaiiam commented 3 years ago

@cmungall tried to do some mappings a bit ago but I'm not sure how far he got.

@LennertSchepers the gaz files are available from https://github.com/EnvironmentOntology/gaz; see the readme for links to the obo and owl versions. If you're unfamiliar with processing obo/owl files, you could use the export command of the ROBOT command line tool to export the gaz obo or owl file to csv/tsv, which should make mapping easier. You could run the example commands below. Unfortunately my laptop isn't powerful enough to run the export on the whole obo file (but it works with a subset of it).

wget http://ontologies.berkeleybop.org/gaz.obo

robot export --input gaz.obo \
  --header "ID|LABEL|definition|hasExactSynonym|SubClass Of" \
  --export gaz_export.csv

This will produce a csv file like:

ID LABEL definition hasExactSynonym SubClass Of
GAZ:00000000 Chelan County A county located in the State of Washington. Its county seat is Wenatchee. obo:TEMP#located_in some GAZ:00167307
GAZ:00000001 Eastern Europe Floristic Province obo:TEMP#located_in some GAZ:01000024
GAZ:00000002 Wenatchee A populated place. GAZ:00003138|obo:TEMP#located_in some GAZ:00167654
GAZ:00000003 Wenatchee Valley College A two-year Community College located in Wenatchee, Washington. obo:TEMP#located_in some GAZ:00167654
GAZ:00000004 Alaminos Canyon   obo:TEMP#located_in some GAZ:00002853
GAZ:00000005 South Chamorro Seamount A serpentine mud volcano located in the Mariana forearc. obo:TEMP#located_in some 'Marianas Island Arc'
GAZ:00000006 Liancourt Rocks A group of small islands in the Sea of Japan (East Sea), whose ownership is disputed between Japan and South Korea. South Koreans currently occupy the islands. Dokdo {language: Korean} obo:TEMP#located_in some GAZ:00007658|obo:TEMP#located_in some GAZ:00027919
GAZ:00000007 Great Rift Valley A rift valley. A geographical and geological feature, approximately 6,000 km in length, that runs from northern Syria in Southwest Asia to central Mozambique in East Africa. Gregory Rift {alternative name} obo:TEMP#located_in some GAZ:00001105|obo:TEMP#partial_overlaps some GAZ:00000567|obo:TEMP#partial_overlaps some GAZ:00000581|obo:TEMP#partial_overlaps some GAZ:00001090|obo:TEMP#partial_overlaps some GAZ:00001100|obo:TEMP#partial_overlaps some GAZ:00001101|obo:TEMP#partial_overlaps some GAZ:00001102|obo:TEMP#partial_overlaps some GAZ:00001103|obo:TEMP#partial_overlaps some GAZ:00002478|obo:TEMP#partial_overlaps some GAZ:07000046

Hopefully this might give you enough to get started with.
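The "SubClass Of" column of the export packs several values into one cell, separated by `|`. A small Python sketch of how that cell could be split is given below. It assumes the format seen in the sample rows: plain IDs are named parent classes, and entries of the form `obo:TEMP#<relation> some <target>` are relations. The function name `parse_subclass_cell` is hypothetical.

```python
def parse_subclass_cell(cell):
    """Split a 'SubClass Of' cell into named parents and (relation, target) pairs."""
    parents, relations = [], []
    for part in (p.strip() for p in cell.split("|")):
        if not part:
            continue  # empty cell or stray separator
        if " some " in part:
            prop, target = part.split(" some ", 1)
            # Keep only the relation name after the '#', e.g. 'located_in'
            relations.append((prop.rsplit("#", 1)[-1], target.strip()))
        else:
            parents.append(part)
    return parents, relations


# Cell copied from the GAZ:00000002 (Wenatchee) row of the sample export
cell = "GAZ:00003138|obo:TEMP#located_in some GAZ:00167654"
print(parse_subclass_cell(cell))
```

Targets that the export renders as quoted labels instead of IDs (such as 'Marianas Island Arc') are returned unchanged, so they would still need resolving to IDs separately.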

You can also try the export with the owl file, which might work better, but you might have to modify the --header "ID|LABEL|definition|hasExactSynonym|SubClass Of" option to get everything.

If you think this is useful and you're unable to produce a good gaz csv/tsv export let me know and I can try on a cloud instance.

Cheers, Kai