nvkelso / natural-earth-vector

A global, public domain map dataset available at three scales and featuring tightly integrated vector and raster data.
https://www.naturalearthdata.com/

Exploring the addition of 'Ocean Basin' regions #697

Open jbusecke opened 2 years ago

jbusecke commented 2 years ago

Hi everyone, a big thanks for maintaining this very useful package.

I wanted to raise an issue about the possibility of creating a new vector set for large-scale ocean basins.

Rationale

I am an ocean-focused climate scientist, and for many general ocean science applications it is paramount to segment a global dataset into large ocean basins. Having a 'central' source of truth for the extent of these basins would make results 1) much easier to produce and 2) more reproducible.

Past efforts

A while ago I implemented functionality to achieve this task in cmip6_preprocessing, a package geared towards processing data from the widely used CMIP6 climate model intercomparison project (check out the docs for an example). The functionality combines the existing geography_marine_polys data with subsequent masking using the regionmask package.

Suggestions

I have since noticed that this functionality could help in much broader contexts, beyond this specific dataset. In fact the current code can readily be used to mask out observational or satellite data. I think it is thus appropriate to move some of this functionality upstream. Ideally I would like to have a geography_ocean_basin_polys or similar in this package.

Do folks here think that this is within the scope of this package?

Many thanks

cc @mathause

hmkhatri commented 2 years ago

Following up on the suggestion by @jbusecke, it would be useful to have large ocean basin regions. I have found that combining smaller existing regions into larger ocean basins does not always work. Some small oceanic regions are not covered by the existing vector set (see issue), and some manual tweaking is required to create an appropriate mask. Having a vector set of large ocean basins would really help.
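
A rough way to check for such uncovered areas is to label every grid cell by the basin of the region containing it and list the cells that get no label. The sketch below is only an illustration under stated assumptions: the rectangles, region names, and basin names are hypothetical toy values (not Natural Earth geometry), coordinates are treated as planar, and the hand-written point-in-polygon test stands in for what a package such as regionmask would do properly.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]   # (lon, lat), treated as planar for this toy example
Polygon = List[Point]         # ring is closed implicitly (last vertex joins the first)


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going east."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical toy regions and groupings, invented for illustration only.
REGIONS: Dict[str, Polygon] = {
    "Sea A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "Sea B": [(10, 0), (20, 0), (20, 10), (10, 10)],
    "Sea C": [(0, 10), (20, 10), (20, 20), (0, 20)],
}
BASIN_OF = {"Sea A": "Basin 1", "Sea B": "Basin 1", "Sea C": "Basin 2"}


def basin_mask(lons: List[float], lats: List[float]) -> List[List[Optional[str]]]:
    """Return a lat-by-lon grid of basin labels; None marks uncovered cells."""
    mask = []
    for lat in lats:
        row = []
        for lon in lons:
            label = None
            for name, polygon in REGIONS.items():
                if point_in_polygon((lon, lat), polygon):
                    label = BASIN_OF[name]
                    break
            row.append(label)
        mask.append(row)
    return mask


lons, lats = [5, 15, 25], [5, 15]
mask = basin_mask(lons, lats)
# Cells that no region covers are the ones needing manual tweaking.
uncovered = [
    (lon, lat)
    for j, lat in enumerate(lats)
    for i, lon in enumerate(lons)
    if mask[j][i] is None
]
print(uncovered)
```

On a real ocean grid the list of uncovered cells would first have to be filtered with a land mask, since land cells are expected to fall outside every marine polygon.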

nvkelso commented 2 years ago

Cool idea! The existing ocean basins are quite old and could use an update.

Do you have shapefiles to contribute?

jbusecke commented 2 years ago

I do not, but the approach of merging the marine regions has been very successful for me so far. Could we go that route, either a) as an automatic step, or b) by creating the 'fused' shapefiles once and maintaining them separately?

nvkelso commented 2 years ago

@jbusecke I can see adding the region groupings you have in https://github.com/jbusecke/cmip6_preprocessing/blob/209041a965984c2dc283dd98188def1dea4c17b3/cmip6_preprocessing/regionmask.py#L7-L136 as a new column in Natural Earth. Then either you merge based on that column on your side, or Natural Earth also makes the larger region groupings available as a separate download, though there is also a case for the 7 oceans (Atlantic Ocean versus North and South Atlantic Ocean). Merging by a maintained property or switching to ID-based merging would be more future-proof, for sure.
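
To make the last point concrete, here is a minimal sketch contrasting merging by hard-coded feature names with merging by a column maintained in the dataset. Everything in it is an assumption for illustration: the rows, the ne_id numbers, and the oceanbasin values are invented, and only attribute grouping is shown (no geometry is merged).

```python
from collections import defaultdict

# Hypothetical attribute rows; ne_id and oceanbasin values are invented.
FEATURES = [
    {"ne_id": 1, "name": "Denmark Strait", "oceanbasin": "North Atlantic Ocean"},
    {"ne_id": 2, "name": "Iceland Sea", "oceanbasin": "North Atlantic Ocean"},
    {"ne_id": 3, "name": "Scotia Sea", "oceanbasin": "South Atlantic Ocean"},
]

# A name list kept on the user's side, written before "Iceland Sea" existed.
OLD_NAME_LISTS = {
    "North Atlantic Ocean": ["Denmark Strait"],
    "South Atlantic Ocean": ["Scotia Sea"],
}


def group_by_names(features, name_lists):
    """Group ne_ids using name lists maintained outside the dataset."""
    groups = defaultdict(list)
    for basin, names in name_lists.items():
        for feature in features:
            if feature["name"] in names:
                groups[basin].append(feature["ne_id"])
    return dict(groups)


def group_by_column(features, column):
    """Group ne_ids using a property maintained inside the dataset."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature[column]].append(feature["ne_id"])
    return dict(groups)


by_name = group_by_names(FEATURES, OLD_NAME_LISTS)
by_column = group_by_column(FEATURES, "oceanbasin")

print(by_name)    # the later-added feature (ne_id 2) is silently dropped
print(by_column)  # the later-added feature is picked up automatically
```

The name-based grouping breaks whenever a feature is added, renamed, or split, while the column-based grouping follows the dataset as long as the column is kept up to date.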

Are there any other areas besides this one (below, between Iceland and Greenland) that are "missing"? What should it be called?

[Image: the area between Iceland and Greenland]

Are there any other sub-basins in common scientific usage that should be delineated?

hmkhatri commented 2 years ago

@nvkelso I have not looked at other regions yet. I will check the global mask to see if there are any missing areas.

jbusecke commented 2 years ago

Thanks for the answer @nvkelso. Could you explain this a bit more: "Merging by a maintained property or switching to ID-based merging would be more future-proof, for sure." I am not very familiar with shapefiles in general and might be missing something obvious.

jbusecke commented 2 years ago

Hi @nvkelso, I just wanted to ping this issue again. If I can get some clarification on the question above, I hope I can get to work on this soon. Thanks again.

nvkelso commented 2 years ago

@jbusecke @hmkhatri From the update in #735 (commit 204b50046f908df927a31aa52b6d3e32f79b9709):

        - Over 175 row changes to polygon geometry and attributes.
        - Restored or remastered missing features, including: Bab el Mandeb, Baia de
          Maputo, Bering Strait, Denmark Strait, Drake Passage, Great Barrier
          Reef (again), Karskiye Strait, Korea Strait, Luzon Strait, Makassar Strait,
          Puget Sound (again), Ross Sea (again), Scotia Sea, Sea of Japan (again),
          St. Helena Bay, Strait of Dover, Strait of Florida (again), Strait of
          Georgia (again), Strait of Gibraltar, Strait of Hormuz, Strait of Juan
          de Fuca (again), and Yucatan Channel.
        - Added new features primarily in Europe, South East Asia, and Southern Ocean,
          including Aru Sea, Bay of Kiel, Bothnia Bay, Bothnian Sea, Celtic Sea,
          Ceram Sea, Cooperation Sea, Cosmonauts Sea, D'Urville Sea, Fehmarnbelt,
          Iceland Sea, Lazarev Sea, Liaodong Wan, Malacca Strait, Mawson Sea, Natuna
          Sea, Riiser-Larsen Sea, Selat Sumba, Selat Sunda, Somov Sea, Sound Sea,
          Storebælt, Strait of Sicily, Teluk Berau, and Teluk Bone.
        - Mediterranean Sea was reworked back to two separate polygons, one each
          for east and west basins, with separate ne_id and min_label values.
        - Added oceanbasin and subbasin columns.
        - See also: https://legacy.iho.int/mtg_docs/com_wg/S-23WG/S-23WG_Misc/Draft_2002/Draft_2002.htm

Public beta in:

nvkelso commented 2 years ago

Tips:

Ocean basins:

mapshaper -i 10m_physical/ne_10m_geography_marine_polys.shp -dissolve oceanbasin -o 10m_physical/ne_10m_oceanbasin.shp

Sub-basins with ocean basins (e.g., with bays and gulfs merged into their parent water body):

mapshaper -i 10m_physical/ne_10m_geography_marine_polys.shp -dissolve subbasin copy-fields=oceanbasin -o 10m_physical/ne_10m_subbasin.shp

NOTE: You'd need to do some remapping magic inline in Mapshaper to dissolve the northern and southern portions of the Atlantic and Pacific into single oceans.
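
For anyone doing that remapping in Python instead, the idea can be sketched as a small relabelling step applied before the dissolve. This is only a sketch: it assumes the oceanbasin values are spelled like "North Atlantic Ocean" and "South Pacific Ocean", which should be checked against the actual column, and the sample values below are illustrative rather than read from the shapefile.

```python
import re
from collections import defaultdict


def to_single_ocean(basin: str) -> str:
    """Drop a leading 'North ' or 'South ' so both halves share one label.

    The trailing space in the pattern keeps 'Southern Ocean' untouched.
    """
    return re.sub(r"^(North|South) ", "", basin)


# Illustrative oceanbasin values (assumed spelling, not read from the data).
basins = [
    "North Atlantic Ocean",
    "South Atlantic Ocean",
    "North Pacific Ocean",
    "South Pacific Ocean",
    "Indian Ocean",
    "Southern Ocean",
]

# Map each merged label to the original labels it would absorb in a dissolve.
merged = defaultdict(list)
for basin in basins:
    merged[to_single_ocean(basin)].append(basin)

for ocean, parts in merged.items():
    print(ocean, "<-", parts)
```

The merged label would then be written to a new field and used as the dissolve key in whichever GIS tool does the geometry work.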

jbusecke commented 2 years ago

Thank you so much for the tips and your patience @nvkelso. Just giving that a go now.

"NOTE: You'd need to do some remapping magic inline in Mapshaper to dissolve the northern and southern portions of the Atlantic and Pacific into single oceans."

I honestly think that is trivial enough for the user to split, so having one region each for the Atlantic, Pacific, etc. is already a big advance in my book.

Is there any advice on making a PR that avoids noob pitfalls with a repo as large as this? My current idea is to create the large ocean basins and add them as 10m_physical/ne_10m_geography_ocean_basins.*. Should I document how these were generated (in the PR, maybe), or is there a way to formalize that in some sort of CI step?