anki-geo / ultimate-geography

Geography flashcard deck for Anki
https://ankiweb.net/shared/info/2109889812

Inclusion rules for seas and related water bodies #346

Closed axelboc closed 3 years ago

axelboc commented 4 years ago

As agreed in https://github.com/axelboc/anki-ultimate-geography/issues/137#issuecomment-620764010, and since discussion has started again on this topic in #137, I'm opening this issue to discuss inclusion rules for water bodies (oceans, seas, gulfs, straits, etc.). I'm leaving lakes and rivers out for now (see why down below).

Like for political entities (#306 + #312), we need to identify which types of water bodies to include, and then, for each type of entity:

  1. find a source article on Wikipedia that lists them as exhaustively as possible;
  2. come up with precise, practical inclusion rules to decide which ones and how many to include (with maps only this time -- no capitals and flags to worry about... 😄);
  3. document the source article and inclusion rules in CONTRIBUTING.md.

What to focus on for now

Oceans

Like for continents, this one's easy: https://en.wikipedia.org/wiki/Ocean#Oceanic_divisions. Since we're not a historical deck, there's no debate that the world has five oceans and all five of them are to be included ... which they already are! They even have brand new maps #325! ✨

Marginal seas

Wikipedia's list of seas seems like a good reference. The problem is that it lists gulfs, bays, straits, channels, etc. within the same Marginal seas heading, without further categorisation. Apparently the terms sea, gulf, bay, sound, etc. are used inconsistently (as you've rightly pointed out in https://github.com/axelboc/anki-ultimate-geography/issues/137#issuecomment-650528659, @aplaice), so I don't think we'll be able to list them independently and apply different criteria to them.

The good news is that the list already excludes lakes with "Sea" in their names, like the Sea of Galilee or the Aral Sea. Also, we might be able to separate out straits, channels, passages, and so on. If that's indeed the case, then I think using the surface area as the sole inclusion criterion for the remaining seas, gulfs, etc. would work pretty well.

Straits, channels, passages, etc.

Same list as above. As @aplaice mentioned, straits between continents would be the easy pick. Also, since I can only see two channels, the English Channel and the Mozambique Channel, I reckon we can include them as well. The remaining bodies are more debatable.

What to ignore for now

Lakes

I'd be inclined to leave these alone for now, for the following reasons:

Rivers, deltas, fjords, etc.

Since we don't have any in the deck yet, let's leave them out of the scope of this issue.

The-Wap commented 3 years ago

Sort of modified repost of https://github.com/axelboc/anki-ultimate-geography/issues/137#issuecomment-632699020 as I didn't get any feedback there:

Regarding the bodies of water, I am currently using the "Rivers, Lakes, Seas, and Oceans" deck mentioned in https://github.com/axelboc/anki-ultimate-geography/issues/137#issuecomment-533351257. I deleted the cards that were already present in the UG deck, translated the rest into German and also modified it to match the UG card style (although I also changed the UG design a bit to e.g. get the corresponding Wiki articles embedded in an iFrame). Most of the data in the deck is on your "ignored for now" list (e.g. biggest lakes and longest rivers), though.

The mentioned "Rivers, Lakes, Seas, and Oceans" deck has really good maps of the water bodies. The deck itself is an export of the "Rivers, Lakes and Seas" deck by azrael42 on Memrise. From the description of the Anki deck: "Most of the location maps were created by azrael42 and are based on the work of the following Wikipedia Commons users: Africa: Sting (Eric Gaba), Middle East, Central America and Australia: Виктор В, Europe: Alexrk2, New Zealand, Sulawesi: NordNordWest, Japan: Chumwa, all remaining maps (!): de.wikipedia.org/wiki/Benutzer:Uwe_Dedering Please see here for more information on the original deck."

Maybe there is a chance to "just include" this deck into UG, with the appropriate attribution (I have no clue about the license stuff)? If this is the case and you, at some point in time, plan to include rivers and lakes into the UG deck, I can provide you with my translated and slightly modified privately used deck (.apkg), to avoid doing the translation work twice (at least into German). I can extract a dual-language deck (with both English and German names included). Happy to get some feedback on that.

axelboc commented 3 years ago

Thanks @Sir-Casm, we'll keep an eye on the "Rivers, Lakes and Seas" deck. Some of the maps look like they match ours pretty closely, so it might be a good source for us. As mentioned, though, our priority is to first define inclusion criteria for seas, straits, etc. Once we've done that, we'll see which notes need to be added to the deck and we'll work on sourcing or creating maps for them. Thanks for bumping the issue, I admit that I had forgotten about it. 😅 I'll try to get things moving asap.

axelboc commented 3 years ago

Alright, let's get the ball rolling!

physical-entities-v1.xlsx

Observations

aplaice commented 3 years ago

Wow, that's a huge amount of work!

I think it broadly makes sense. To the extent that I have doubts it's because of how arbitrarily named and ambiguously defined most water bodies are, and I'm not sure if any better choices are possible, given the data.

Straits

  • I've listed every strait, channel and passage listed on the List of seas, but I haven't picked inclusion criteria for them yet.

Alternatively, we could use the (category) list of International Straits, since it already includes the Straits of Gibraltar and the Bering Strait, and already constrains on one of the criteria that might be interesting (the strait being international (though that criterion definitely isn't sufficient)). Slightly contrary to the name, it also includes "channels" such as the English Channel, the Mozambique Channel or the Drake Passage (under Antarctica), claiming that they're actually straits. The list is far more extensive, though, and we probably aren't interested in most of the items...

As yet another alternative, a wikidata query that lists all the "straits" (according to Wikidata, the English Channel etc. are also straits) that "belong" to more than one country, could be used. I'll try to write one (for comparison, even if we don't end up using it).

I'm a bit hesitant to just edit the List of Seas, to add in the Bering Strait and the Straits of Gibraltar, since it doesn't seem to be too reliable a source in this regard. If it missed even them, it might also have missed many other straits, some of which it might make sense to add, but which we just wouldn't think of ourselves. OTOH we shouldn't let the perfect be the enemy of the good, so if all of the alternatives turn out unfeasible/impractical, just editing the List to add in the straits we want isn't too bad.

Regarding the inclusion criteria, I'm a bit stumped. Arguably, the narrowness (and hence — usually — small area) of straits makes them more notable, since it increases their geopolitical importance. I can't think of anything other than my (old) suggestion of including intercontinental straits, but that's also not ideal (e.g. would we add both the Bosporus and the Dardanelles?). The extensive straits (such as the English or Mozambique channels) could be included based on the same surface area criterion as all the other seas.

Observations

Rather than removing this minimum requirement, it may be wiser to edit the articles to improve their infoboxes.

Yes, that makes sense. (Assuming that a reliable source is found (see below). mapshaper allows obtaining the area of the features it plots, so I could get the area from our maps, but that goes against Wikipedia's "no original research"...)

I assume you mean the seas of the Southern Ocean? The very largest are probably worth including, since they really are quite huge, even if they're very far from any populated territories.

  • Some areas given by Wikipedia don't seem right -- those near the inclusion criterion may need to be reviewed:

Yes, it seems that they're going by the Australian Hydrographic Service's definition.

  • the Cook Inlet is said to have an area of 100,000 km2, which seems very approximate, since it looks much smaller than, say, the White Sea, which Wikipedia says has an area of 90,000 km2.

Yes, Wikipedia's data is dubious.

I wonder what Indonesians think of all the many Mediterranean seas. :D

On that note, the Ionian and Balearic Seas are also missing infoboxes. :)

Additional comments

My main issue is that I have very serious doubts about Wikipedia's listed surface areas. They don't seem to provide sources and they don't specify which (of the often many) definitions they're following.

OTOH assuming that they're approximately correct and that we verify the edge-cases, it probably doesn't matter.

axelboc commented 3 years ago

I can't think of anything other than my (old) suggestion of including intercontinental straits, but that's also not ideal (e.g. would we add both the Bosporus and the Dardanelles?). The extensive straits (such as the English or Mozambique channels) could be included based on the same surface area criterion as all the other seas.

It's a good option. Another idea could be to include only straits that connect two marginal seas (and/or oceans) that both pass the inclusion criterion I suggested of 100,000 km2. This would exclude the Bosporus and the Dardanelles since they don't connect the Black Sea and the Mediterranean Sea directly.

Note that the English Channel doesn't pass the 100,000 km2 limit, which is why I didn't include it as a marginal sea. I think we'll need a different criterion for channels.

Alternatively, we could use the (category) list of International Straits.

You're totally right, it includes a lot more straits, channels and passages. 💯

I'm a bit hesitant to just edit the List of Seas, to add in the Bering Strait and the Straits of Gibraltar, since it doesn't seem to be too reliable a source in this regard. If it missed even them, it might also have missed many other straits, some of which it might make sense to add, but which we just wouldn't think of, ourselves.

Yeah, it's not a good sign at all. It makes me wonder whether this List of Seas is exhaustive even for larger seas. I'd be interested in comparing it with the IHO's Limits of Oceans and Seas of 1953, also to check whether the list (and our sublist) includes seas not delimited by the IHO.

My main issue is that I have very serious doubts about Wikipedia's listed surface areas. They don't seem to provide sources and they don't specify which (of the often many) definitions they're following.

Yeah, the lack of sources is quite appalling. The fact that some of the numbers seem really approximate and inaccurate doesn't help, that's for sure. 😞 It's also a shame that quite a few large seas are missing area information altogether.

mapshaper allows obtaining the area of the features it plots, so I could get the area from our maps, but that goes against Wikipedia's "no original research"...

If our own "original research" is more exhaustive, precise and reliable than Wikipedia, I don't see a problem with it. As long as we're methodical about it and the research is well documented and fully reproducible, I don't think anybody would mind.

I've just found a paper introducing a digital map of the limits of oceans and seas based on the IHO's Limits of Oceans and Seas. Perhaps you had come across it before? I also found a site to download the map's shapefile, which we could totally pass through mapshaper to compute all the areas we need. What do you think?

axelboc commented 3 years ago

Or perhaps we may as well use the 2002 draft of the IHO's Limits of Oceans and Seas. After all, we already include the Southern Ocean, which does not exist in the 1953 version. 🤷‍♂️ I think we've talked about this before, but we could definitely highlight contentious areas between the 1953 and 2002 versions (like the limits of the East China Sea), a bit like we do for countries.

aplaice commented 3 years ago

Yeah, the lack of sources is quite appalling. The fact that some of the numbers seem really approximate and inaccurate doesn't help, that's for sure. 😞 It's also a shame that quite a few large seas are missing area information altogether.

I've just spent some time looking at the Sea of Japan and the East China Sea, and part of the problem is that their infobox templates are not for water bodies, but for East Asian (or Chinese) items of interest. (I think that this can be easily remedied, since (some) infoboxes can AFAICT be nested/embedded, though following the guidelines didn't seem to do anything, at least when previewing changes — it's possible that template changes don't apply when only previewing, that these particular infobox templates don't support embedding or that I did something wrong — I'll play around in a Wikipedia sandbox to check.)

The Sea of Japan and the East China Sea do both contain surface areas in Wikidata, but while adding/cross-referencing (respectively) the sources for them, it turned out that various sources provide areas varying from 978,000 km² to 1,048,950 km² (for the Sea of Japan) and from 750,000 km² to 1,249,000 km² (for the East China Sea). Some of the sources even noted that multiple authorities provide different values...

I've just found a paper introducing a digital map of the limits of oceans and seas based on the IHO's Limits of Oceans and Seas. Perhaps you had come across it before? I also found a site to download the map's shapefile, which we could totally pass through mapshaper to compute all the areas we need. What do you think?

That looks great! I might have seen it before and rejected it due to the non-commercial license for the shapefiles (which would make the SVG derived from the data incompatible with the CC BY-SA or similar needed for Wikimedia), though I don't remember for sure. In any case, for calculating the areas it's perfect. I'll definitely have a look! (I'm not making any promises on when I'll do everything that I've promised to do, in this thread, though I'll try soon-ish:))

It's a good option. Another idea could be to include only straits that connect two marginal seas (and/or oceans) that both pass the inclusion criterion I suggested of 100,000 km2. This would exclude the Bosporus and the Dardanelles since they don't connect the Black Sea and the Mediterranean Sea directly.

That's a great and simple solution!

Note that the English Channel doesn't pass the 100,000 km2 limit, which is why I didn't include it as a marginal sea. I think we'll need a different criterion for channels.

I hadn't realised. :O Your suggested criterion of connecting two sufficiently large seas would work in this case, though — both the Celtic Sea and the North Sea have an area greater than 100,000 km².


Or perhaps we may as well use the 2002 draft of the IHO's Limits of Oceans and Seas. After all, we already include the Southern Ocean, which does not exist in the 1953 version. 🤷‍♂️ I think we've talked about this before, but we could definitely highlight contentious areas between the 1953 and 2002 versions (like the limits of the East China Sea), a bit like we do for countries.

It's a really neat idea and it'd look great, but I'm not sure whether there are sufficiently good shapefiles for the 2002 version, though. (I had manually input the data from the 2002 draft for the Bering Strait map, with QGIS, and it took a while. Experience would speed things up considerably, but I still wouldn't relish doing it for all of the seas.) There aren't really sufficiently good, appropriately licensed shapefiles even for the published 1953 version — I had to patch the mostly great Natural Earth data with alternative sources, in a couple of cases, and the Baltic Sea is still subtly wrong (it's been on my to-do-list for a while).

axelboc commented 3 years ago

The good news is that a lot of the boundaries didn't change between the 1953 and 2002 versions, so I think the 1953 shapefile should cover a good 80% of our need.

axelboc commented 3 years ago

That looks great! I might have seen it before and rejected it due to the non-commercial license for the shapefiles (which would make the SVG derived from the data incompatible with the CC BY-SA or similar needed for Wikimedia), though I don't remember for sure. In any case, for calculating the areas it's perfect.

From what I gather, the shapefile is available for non-commercial use, so we can reference it in this repo, if not on Wikimedia. 👍

axelboc commented 3 years ago

Since Wikipedia's List of seas is so unreliable, here is the list of all the seas (including straits) described in the IHO's 2002 draft:

IHO seas.xlsx

The draft often uses the seas' local names, so I've normalized them all to English based on Wikipedia. Only very few seas don't have articles on Wikipedia (that I could find): Aru Sea (relatively large sea off the coast of Papua New Guinea), Central Baltic Sea (portion of the Baltic Sea), Sound Sea (tiny sea off the coast of Estonia), Tryoshnikova Gulf (tiny gulf off the coast of Antarctica). Also, Wikipedia has an article for the Northwest Passage (i.e. the sea route in Northern Canada), but not for what the IHO calls the Northwestern Passages (i.e. the combination of all the waterways in the region).

Regardless, I think this is a much better list than what I had before. It looks a lot more complete, especially when it comes to straits. If you agree, @aplaice, I'll update physical-entities with this list, separating straits from seas, and we can go from here.

I've cross-checked the list with the IHO's 1953 document and identified the seas that appear in both versions. The next step will be for me to check which seas have changed boundaries between the 1953 and 2002 versions. This will inform us on which areas need to be calculated from scratch, and which can be extracted from the shapefile I mentioned previously.

Note that the 1953 version includes one sea that is absent from the 2002 version: the Sea of Japan. This is due to the naming dispute between Japan and Korea. I think it should still be included in the list, though.

Note also that a number of seas from Wikipedia's List of seas are absent from the IHO's document. The most significant ones are the Cook Inlet (the area of which Wikipedia most likely overestimates), the Argentine Sea (which lacks international recognition), and the Levantine Sea (which some sources characterise as a lake). I don't think any of these are worth keeping in physical-entities.

aplaice commented 3 years ago

Regardless, I think this is a much better list than what I had before. It looks a lot more complete, especially when it comes to straits. If you agree, @aplaice, I'll update physical-entities with this list, separating straits from seas, and we can go from here.

I think that it makes sense to replace Wikipedia's List of seas with the 2002 IHO draft. I'm far less confident about replacing the previous criterion of "existence of infobox" + "area in infobox > 100,000 km²", with the criterion of our own calculated (based on the 2002 draft) area being greater than 100,000 km² or a wholesale editing of Wikipedia's infoboxes.

Firstly, updating the shapefile will take a considerable amount of work to do even semi-properly (assuming that the comparison document is correct, I'd estimate approx. half a day), for something that will be of limited general use: due to the non-commercial license of the original source it won't be useful for anything Wikipedia-related, and since it'll be done only semi-properly, it won't be of much use in an academic environment. (Doing it actually properly, in the way the source 1953-based article did, would be an immense undertaking.) That said, an automated parsing of the text of the IHO 2002 draft might be feasible (as you've probably noticed, I like automated solutions, since they don't leave me second-guessing whether I made a mistake :)).

Secondly, the 2002 draft, due to its very nature, isn't really an authoritative, indisputable source, so I'd be uncomfortable editing Wikipedia to replace/provide an area calculated based on it. (In cases like the extent of the South China Sea, it even gets a bit political...)

Thirdly, purely for our purposes (since it wouldn't be easily/cleanly "upstreamable"), I feel like it's huge overkill. For the edge cases, it makes sense to calculate the areas, to double-check Wikipedia's dubious values, but otherwise even large errors in the values are unlikely to change whether a sea passes the criterion.
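
To illustrate that last point with the figures quoted earlier in this thread, here is a minimal Python sketch that checks whether the 100,000 km² threshold falls anywhere inside the range of areas reported for a sea. The function name and data layout are assumptions made for illustration; only the two area ranges and the threshold come from the discussion.

```python
THRESHOLD_KM2 = 100_000  # inclusion threshold proposed in this thread

# (lowest, highest) area in km² found across sources, as quoted earlier in this thread
reported_ranges = {
    "Sea of Japan": (978_000, 1_048_950),
    "East China Sea": (750_000, 1_249_000),
}

def classify(low, high, threshold=THRESHOLD_KM2):
    """Return 'include', 'exclude' or 'review' for a range of reported areas."""
    if low > threshold:
        return "include"   # even the smallest estimate passes
    if high <= threshold:
        return "exclude"   # even the largest estimate fails
    return "review"        # estimates straddle the threshold: worth measuring ourselves

decisions = {sea: classify(low, high) for sea, (low, high) in reported_ranges.items()}
for sea, decision in decisions.items():
    print(f"{sea}: {decision}")
```

Both seas come out as clear inclusions despite the wide spread between sources, so only seas whose range straddles the threshold would need a careful measurement.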

I feel really bad discouraging/opposing your enthusiasm. :(

the Sea of Japan [...] I think it should still be included in the list, though.

Yes, definitely.

Note also that a number of seas from Wikipedia's List of seas are absent from the IHO's document. I don't think any of these are worth keeping in physical-entities.

Yes, I think that it's safe to exclude them.

axelboc commented 3 years ago

Oh, sorry, I wasn't suggesting creating a full-on shapefile with the 2002 data. Here is the process I have in mind:

  1. List the seas defined by the IHO in 1953 and 2002, and use the 2002 seas as our source list.
  2. Get the areas of the 1953 seas from the shapefile.
  3. Do a rough visual comparison of the outlines of the seas in the 1953 shapefile with the outlines of the seas on the maps of the 2002 draft. Indicate whether each sea's area has increased/decreased significantly in proportion to its size.
  4. Measure very roughly in QGIS all the seas that were added in the 2002 draft.
  5. Each sea should now have an area, whether precise or not. Apply the threshold of 100,000 km2 to the list and confirm that it's the one we want to use.
  6. Identify the seas that have, or may have, an area close to the 100,000 km2 threshold, taking into consideration the observations from step 3. Measure each of these seas more accurately on QGIS.
  7. For areas that remain very close to the threshold, compare them with areas provided on Wikipedia.
  8. Review the final inclusion list and discuss any contentious cases.

I'm currently half-way through step 3. Do you think the process makes sense? Am I going in the right direction?
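
Steps 5 to 7 could be sketched roughly as follows in Python. The sea names, the areas and the 20% margin are made-up placeholders, not values from the spreadsheet; the only figure taken from the discussion is the 100,000 km² threshold.

```python
THRESHOLD_KM2 = 100_000   # threshold from step 5
MARGIN = 0.20             # hypothetical: how close to the threshold counts as "borderline"

# Placeholder data: name -> rough area in km² (NOT real measurements)
rough_areas = {
    "Sea A": 450_000,
    "Sea B": 105_000,
    "Sea C": 92_000,
    "Sea D": 12_000,
}

def triage(areas, threshold=THRESHOLD_KM2, margin=MARGIN):
    """Split seas into clear passes, clear fails and borderline cases to re-measure."""
    low, high = threshold * (1 - margin), threshold * (1 + margin)
    result = {"include": [], "exclude": [], "remeasure": []}
    for name, area in areas.items():
        if low <= area <= high:
            result["remeasure"].append(name)   # step 6: measure more accurately
        elif area > threshold:
            result["include"].append(name)
        else:
            result["exclude"].append(name)
    return result

result = triage(rough_areas)
print(result)
```

Only the seas in the `remeasure` bucket would then need the more accurate measurement and the comparison with Wikipedia's values.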

aplaice commented 3 years ago

Yes, the process makes sense! Sorry for the confusion!

Regarding 2: it turns out to be even easier than expected. The areas can be trivially calculated and neatly outputted with

mapshaper  HRmLOS_1.1.shp -each 'area=this.originalArea' -o areas.csv

but it turns out that even this is not needed, since in this particular case, the shapefiles already contained the relevant areas (though for some reason, all the areas are out by a factor of 1.002258 compared to those calculated by mapshaper — I'm not yet sure which is correct, but I'll investigate by working with some other shapefiles with known areas etc.).

All that remains is consolidating the areas in the cases where a sea corresponds to more than one unit (e.g. the Mediterranean or the Baltic). I'll upload the data tomorrow.
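
A minimal sketch of that consolidation step in Python, assuming a CSV with one row per shapefile unit. The column names (`name`, `area`), the unit names and the mapping to parent seas are all hypothetical and would need to be adapted to the actual attribute table; the areas are placeholders in arbitrary units.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping from shapefile unit to the sea it belongs to.
PARENT_SEA = {
    "Unit A (western basin)": "Example Sea",
    "Unit B (eastern basin)": "Example Sea",
}

# Stand-in for the CSV exported from the shapefile (placeholder areas).
SAMPLE_CSV = """name,area
Unit A (western basin),850000
Unit B (eastern basin),1650000
Standalone Sea,300000
"""

def consolidate(csv_file, parent_sea=PARENT_SEA):
    """Sum unit areas per sea; units without a parent count as their own sea."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        sea = parent_sea.get(row["name"], row["name"])
        totals[sea] += float(row["area"])
    return dict(totals)

totals = consolidate(io.StringIO(SAMPLE_CSV))
print(totals)
```

With a real export, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle.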

Measure very roughly in QGIS all the seas that were added in the 2002 draft.

As a second (additional) sanity check, the areas of the seas as obtained from Natural Earth, could also be calculated. (In some cases, NE had subdivisions of seas that were not in the 1953 version.)

axelboc commented 3 years ago

Regarding 2: it turns out to be even easier than expected. The areas can be trivially calculated and neatly outputted with

Indeed, I created a calculated field with the area (rounded to the nearest integer) in QGIS, so I already have all the 1953 areas. I've also combined the areas of the Mediterranean and Baltic seas based on their 1953 boundaries (which are the same, overall, as the 2002 boundaries).

I'm now half-way into comparing the boundaries visually between the shapefile and the 2002 draft (which has maps with outlines for every sea, unlike the 1953 document... 😄 which should make creating the maps for the included seas much easier, by the way).

aplaice commented 3 years ago

Indeed, I created a calculated field with the area (rounded to the nearest integer) in QGIS, so I already have all the 1953 areas. I've also combined the areas of the Mediterranean and Baltic seas based on their 1953 boundaries (which are the same, overall, as the 2002 boundaries).

Great! :)

which should make creating the maps for the included seas much easier, by the way).

Yeah, I had used the maps from the 2002 draft (as multiple .docs from the IHO website; the single PDF you found is much nicer) as an additional, if partial, sanity check when making the maps.

As yet another alternative, a wikidata query that lists all the "straits" (according to Wikidata, the English Channel etc. are also straits) that "belong" to more than one country, could be used. I'll try to write one (for comparison, even if we don't end up using it).

Now, done here.

Query also here, in case GitHub mangles the long link:

```sql
SELECT DISTINCT ?strait ?straitLabel ?noCountries ?article
WHERE {
  {
    SELECT ?strait ?article (COUNT(DISTINCT ?country) AS ?noCountries)
    WHERE {
      ?strait wdt:P31 wd:Q37901;
              wdt:P17 ?country.
      ?article schema:about ?strait;
               schema:isPartOf .
    }
    GROUP BY ?strait ?article
  }
  FILTER(?noCountries > 1 )
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY (?straitLabel)
```

straits_wikidata.xlsx

I still can't think of any sensible criteria to narrow down the list from the 67 straits, though. (I'm currently annotating each strait by the seas it joins; using the criterion of seas > 100,000 km² will narrow things down considerably, but I'm not sure if by enough.)

aplaice commented 3 years ago

I've applied the criterion of connecting two water bodies each having an area > 100,000 km² to the list of "International" Straits (those bordering more than one country), from Wikidata, and it results in 18 matches:

straits_wikidata.xlsx

The areas are taken from your physical-entities-v1.xlsx, so a couple sea areas are still missing (due to lack of Wikipedia infobox), and hence some more straits might fit the criteria. (I gave the oceans (and parts of oceans) an arbitrary, large area of 999999 km², since they obviously pass.)
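
The criterion described above can be sketched in a few lines of Python. The strait-to-sea pairs and the sea areas below are placeholders for illustration, not rows from the spreadsheet; the 100,000 km² threshold and the 999999 km² stand-in for oceans are the ones mentioned in this comment.

```python
THRESHOLD_KM2 = 100_000
OCEAN_AREA_KM2 = 999_999   # arbitrary large value so that oceans always pass

# Placeholder areas in km² (None = area missing from the source)
areas = {
    "Ocean X": OCEAN_AREA_KM2,
    "Sea A": 250_000,
    "Sea B": 40_000,
    "Sea C": None,
}

# Placeholder straits: name -> the two water bodies it connects
straits = {
    "Strait 1": ("Ocean X", "Sea A"),
    "Strait 2": ("Sea A", "Sea B"),
    "Strait 3": ("Ocean X", "Sea C"),
}

def passes(body, threshold=THRESHOLD_KM2):
    """True/False if the area is known, None if it is missing."""
    area = areas.get(body)
    return None if area is None else area > threshold

matches, undecided = [], []
for strait, bodies in straits.items():
    checks = [passes(b) for b in bodies]
    if all(c is True for c in checks):
        matches.append(strait)
    elif any(c is False for c in checks):
        continue  # at least one connected body is too small
    else:
        undecided.append(strait)  # a missing area prevents a decision

print(matches, undecided)
```

Keeping the undecided straits in a separate list makes it easy to revisit them once the missing sea areas are filled in.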

All of the matching straits are "interesting", at least for me, but I feel that there are too many of them...

Slightly arbitrarily, I'd cut down on some of the many straits between the Caribbean Sea and Atlantic Ocean, replace the Beagle Channel with the Drake Passage or the Straits of Magellan and remove the Strait of Bonifacio, as well as perhaps the Balabac Strait.

axelboc commented 3 years ago

Awesome! Yeah, it feels like a lot, but maybe it's because the list contains some obscure entries. I feel like 12~15 would be a good target range.

Looking at your list and at the missing straits, I wonder whether the "more than one country" rule isn't a bit too arbitrary. Also, I'm concerned as to the recognition status and international relevance of some of the straits and channels returned by Wikidata.

What do you think about the straits, channels and passages listed in the IHO draft of 2002? I count 24 of them (listed below). I feel like it might be a more relevant starting list. That being said, it doesn't include the Strait of Magellan, which I had completely forgotten about. It's probably because it's not in international waters. 😞

What I like about this list is that:


Back to the problem of the criteria: one significant problem with the "connecting seas" criterion, as you've noticed, is granularity. In many cases, a strait may connect to a sea that is part of a larger sea. How do we decide which of the two seas to take into account?

Perhaps we should start again with your initial idea of including straits that connect two oceans, a criterion which we could very well extend to continental plates. How many straits in your list and out of the 24 defined by the IHO would this apply to? 🤔

In parallel, we could apply a simple area criterion in order to include the biggest channels and passages, like the Northwestern Passages, the Drake Passage, the Mozambique Channel and the English Channel.

A few interesting straits would probably still be left out. Assuming the data exists, we might be able to pick, out of the remaining straits, the 5 or so that get the most maritime traffic ... or something in this vein. 🤷‍♂️


Here is where I'm at, for info: IHO seas.xlsx. I've collated the areas of all the 1953 seas, compared them visually with the 2002 maps, and identified whether they had significantly/insignificantly increased/decreased.

axelboc commented 3 years ago

Ooh I just learnt about the maritime law concept of Transit passage. The article lists 5 straits as being covered by the transit passage provisions: the Strait of Gibraltar, Dover Strait, Strait of Hormuz, Bab-el-Mandeb and Strait of Malacca. It also mentions that the Danish Straits, the Turkish Straits and the Strait of Magellan are not covered by the provisions because they are already governed by "international conventions".

Maybe we could use this article to complement the IHO list, which would amount to adding the Turkish Straits, Bab-el-Mandeb and the Strait of Magellan?

EDIT: this article also explains the legal concept of international waterways in the context of straits.

aplaice commented 3 years ago

Looking at your list and at the missing straits, I wonder whether the "more than one country" rule isn't a bit too arbitrary.

Yeah, it is a bit arbitrary.

What do you think about the straits, channels and passages listed in the IHO draft of 2002? I count 24 of them (listed below). I feel like it might be a more relevant starting list.

Yeah, I think you might be right!


I've now added the information about whether the strait joins two oceans or separates two continents (or continental plates), for both the Wikidata straits and the IHO+transit passage ones.

However, the boundary between the Pacific and Indian Oceans is pretty much undefined, since the 2002 draft doesn't state to which the East China and Archipelagic Seas belong. The "best" that I could find was the borders of the oceans from the CIA factbook maps used on Wikipedia...

straits_wikidata.xlsx

I think that OR(connects oceans, separates continents, is transit passage) for the IHO+transit passage straits gives a relatively sensible result, and a relatively manageable number of 14. It excludes Denmark Strait, but AFAICT it's not really that important, interesting tidbit that it's between Europe and America, aside. (However, the fact that it adds the "Danish straits" instead, might confuse some people!) The other logical combinations would also work, though.

I'm slightly worried that we're getting inured to straits, and even 14 (or 10) will be too much...


Here is where I'm at, for info: IHO seas.xlsx. I've collated the areas of all the 1953 seas, compared them visually with the 2002 maps, and identified whether they had significantly/insignificantly increased/decreased.

Wow! It looks great!

Ooh I just learnt about the maritime law concept of Transit passage.

That's pretty cool! (As noted above, I've included this in the spreadsheet.)

axelboc commented 3 years ago

I've tried to find what the busiest straits might be, but I couldn't find a simple list. All I could find were maps of shipping routes. The busiest straits are somewhat identifiable: Taiwan, Malacca, Bab-el-Mandeb, Sicily, Bosphorus/Turkish, Gibraltar, English Channel, Dover, Danish. This is far from ideal, though:

I've also looked at oil choke points, which seem interesting from a geo-political standpoint: Hormuz, Malacca, Bab-el-Mandeb, Danish, Turkish. Not very useful, though, since they're all included in the transit passage list and the latter also has Magellan, Dover and Gibraltar.


The combination I like the most so far is OR(connects oceans, is transit passage), which gives 13 straits (or 14 with the Sunda Strait). I think it's good that the Denmark Strait is not included. That being said:

A lot of very subjective choices and feelings in all of this... but we're still brainstorming 😅

aplaice commented 3 years ago

I've tried to find what the busiest straits might be, but I couldn't find a simple list. [...]

Yeah, it's quite interesting, but I don't see a straightforward way of extracting something useful from this.


If I had to pick one, I would prefer to have the English Channel over the Strait of Dover.

Yes, definitely.

I don't feel strongly about Bass Strait and Drake Passage being included at all... maybe even the Northwestern Passages,

Yeah.

the Strait of Sicily, which I don't find very interesting.

TBH I'd also prefer if it weren't included.

I'm not sure how I feel about the Danish and Turkish straits.

I'd lean towards preferring them excluded. Their constituent parts could definitely be included, in the Country info, but that would, in effect, make the relevant cards about the Country info, since the "main answer" is rather uninteresting.

The exclusion of "multiple" straits would also have the mild benefit of excluding the Northwestern Passages.


I should probably take a brief break from this, since I've now added three more fields (is in IHO 2002, is "international" (bordering more than one country) and is a single strait), in the interest of finding a combination that would justify my preferences, which is bordering on the slightly crazy...

straits_wikidata.xlsx

If I were to make a decision now, I'd probably vote for ((connects oceans) OR (is transit passage)) AND (single strait), but I'm not quite happy about it.
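For anyone who wants to play with the combinations outside the spreadsheet, here is a minimal Python sketch of that rule as a predicate. The field names (`connects_oceans`, `is_transit_passage`, `is_single_strait`) and the placeholder rows are assumptions for illustration only; they are not the actual column names or data from straits_wikidata.xlsx.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strait:
    name: str
    connects_oceans: bool
    is_transit_passage: bool
    is_single_strait: bool  # False for collective entries such as the "Danish straits"


def include_strait(s: Strait) -> bool:
    """((connects oceans) OR (is transit passage)) AND (single strait)."""
    return (s.connects_oceans or s.is_transit_passage) and s.is_single_strait


# Placeholder rows, NOT real data from the spreadsheet.
candidates = [
    Strait("Strait A", connects_oceans=True, is_transit_passage=False, is_single_strait=True),
    Strait("Strait B", connects_oceans=False, is_transit_passage=True, is_single_strait=False),
    Strait("Strait C", connects_oceans=False, is_transit_passage=False, is_single_strait=True),
]

included = [s.name for s in candidates if include_strait(s)]
print(included)
```

Swapping the body of `include_strait` is enough to try the other logical combinations discussed above.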

axelboc commented 3 years ago

IHO seas.xlsx

I've finished measuring the 2002 areas that needed to be measured. I used the area measuring tool in QGIS, along with a variety of equal-area projections. I think the result is sufficiently precise for our purpose.
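As a rough illustration of why an equal-area projection matters for this kind of measurement, the sketch below projects longitude/latitude vertices with the Lambert cylindrical equal-area projection on a sphere and then applies the planar shoelace formula. This is not what QGIS does internally. The spherical Earth radius and the function names are assumptions, and edges are treated as straight lines in the projected plane, so a real coastline would need densely sampled vertices.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean radius; spherical approximation (assumption)


def project_equal_area(lon_deg: float, lat_deg: float) -> tuple[float, float]:
    """Lambert cylindrical equal-area projection: x = R*lon, y = R*sin(lat)."""
    x = EARTH_RADIUS_KM * math.radians(lon_deg)
    y = EARTH_RADIUS_KM * math.sin(math.radians(lat_deg))
    return x, y


def polygon_area_km2(lonlat: list[tuple[float, float]]) -> float:
    """Planar shoelace area of the projected polygon, in km2."""
    pts = [project_equal_area(lon, lat) for lon, lat in lonlat]
    total = 0.0
    # Pair each vertex with the next one, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Sanity check: a 10-degree by 10-degree box starting at the equator.
box = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(round(polygon_area_km2(box)))
```

Because the projection preserves area, the planar result for the box matches the spherical formula R² · Δλ · (sin φ₂ − sin φ₁), which is the property that makes the planar measuring tool usable here.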

I've experimented with various area thresholds, and the results give quite high inclusion numbers. I've tried adding a second criterion to remove most of the Antarctic seas, which are quite numerous and, in my opinion, not very interesting. (I also had trouble measuring some of them because of significant changes in the coastline of Antarctica since 1953.)

Here are the results of my experiments:

| Criteria | Pass | In deck | Add | Remove |
| --- | --- | --- | --- | --- |
| >= 100,000 km2 | 77 | 26 | 51 | 1 |
| >= 150,000 km2 | 69 | 25 | 44 | 2 |
| >= 100,000 km2 OR >= 500,000 km2 if Antarctic | 68 | 26 | 42 | 1 |
| >= 175,000 km2 | 66 | 25 | 41 | 2 |
| >= 200,000 km2 | 62 | 22 | 40 | 5 |
| >= 150,000 km2 OR >= 500,000 km2 if Antarctic | 61 | 25 | 36 | 2 |
| >= 175,000 km2 OR >= 500,000 km2 if Antarctic | 60 | 25 | 35 | 2 |
| >= 200,000 km2 OR >= 500,000 km2 if Antarctic | 56 | 22 | 34 | 5 |

In my opinion, the 100,000 km2 threshold is way too inclusive, and the 200,000 km2 threshold removes too many seas from the deck (2 that get excluded in most cases: White Sea and Adriatic Sea; plus 3 more: Aegean Sea, Gulf of California, Bay of Biscay).

I've highlighted the criteria that I think are the best. My preference goes to >= 175,000 km2 OR >= 500,000 km2 if Antarctic.
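For reference, here is a hedged Python sketch of how the preferred rule and the four table columns could be computed. It reads the rule as "Antarctic seas need at least 500,000 km2, all other seas at least 175,000 km2", and reads "In deck" as the passing seas already in the deck; both readings are assumptions based on the numbers in the table. The `Sea` fields and the rows are placeholders, not data from IHO seas.xlsx.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sea:
    name: str
    area_km2: float
    is_antarctic: bool
    in_deck: bool


def passes(sea: Sea, threshold: float = 175_000, antarctic_threshold: float = 500_000) -> bool:
    """Antarctic seas must meet the higher threshold; all others the lower one."""
    limit = antarctic_threshold if sea.is_antarctic else threshold
    return sea.area_km2 >= limit


def summarise(seas: list[Sea]) -> dict[str, int]:
    """Counts matching the table columns: Pass, In deck, Add, Remove."""
    passing = [s for s in seas if passes(s)]
    return {
        "pass": len(passing),
        "in_deck": sum(s.in_deck for s in passing),
        "add": sum(not s.in_deck for s in passing),
        "remove": sum(s.in_deck and not passes(s) for s in seas),
    }


# Placeholder rows, NOT real measurements.
seas = [
    Sea("Sea A", 300_000, is_antarctic=False, in_deck=True),
    Sea("Sea B", 180_000, is_antarctic=False, in_deck=False),
    Sea("Sea C", 400_000, is_antarctic=True, in_deck=False),
    Sea("Sea D", 120_000, is_antarctic=False, in_deck=True),
]
summary = summarise(seas)
print(summary)
```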

aplaice commented 3 years ago

This is amazing (as expected :))!

Two things that slightly bother me, but I don't see any great solutions:

  1. That's quite a large number of new seas/notes (an over 100% increase).

  2. Some seas that are being excluded (subjectively) feel more interesting than some that are being included. For instance, I feel that the Adriatic Sea is slightly more noteworthy than the Ionian or the Tyrrhenian Sea, and that the White Sea is slightly more interesting than, say, the Laptev Sea.


Perhaps we could introduce the same higher threshold for the subdivisions of the Arctic Ocean that you suggested for the Antarctic (Southern Ocean) seas? At 500,000 km² it would AFAICT exclude Chukchi Sea and Iceland Sea.

I'd consider lowering the non-polar seas threshold to 125,000. Compared to 175,000 it'd include the Adriatic Sea (which I'd miss), the Gulf of Tonkin (which is vaguely historically important) and the Seram Sea (which I don't have a strong opinion about, either way). However, I'm not sure if it doesn't bloat the deck too much, so 175,000 might indeed be better (it's definitely better than 100,000, 150,000 or 200,000).

axelboc commented 3 years ago

IHO seas.xlsx

Here is another idea: instead of applying the 500,000 km2 threshold to polar seas, we apply it to seas introduced in the IHO's 2002 draft. Seas that were already defined in 1953 can then be held to the lower threshold of 125,000 km2.

image
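Expressed as code, the idea is a single threshold switch on when the sea was defined. This is only a sketch: the function and parameter names are hypothetical, and the only inputs assumed are the measured area and whether the sea already appears in the 1953 edition.

```python
def passes_area_rule(area_km2: float, defined_in_1953: bool) -> bool:
    """125,000 km2 for seas already defined in 1953; 500,000 km2 for 2002 additions."""
    threshold = 125_000 if defined_in_1953 else 500_000
    return area_km2 >= threshold


# Placeholder areas, NOT real measurements.
print(passes_area_rule(140_000, defined_in_1953=True))
print(passes_area_rule(140_000, defined_in_1953=False))
```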

aplaice commented 3 years ago

That seems more-or-less perfect!

axelboc commented 3 years ago

physical-entities.xlsx

I've had another look at the straits and I may have found a decent set of criteria:

is transit OR (area <= 50,000 AND >= 2 states)

The source list I've used combines the straits of the IHO draft of 2002 with the transit passage straits, but excludes the "collective straits" (Turkish straits and Danish straits). Here is the result:

image

image

The "to be removed" count includes Denmark Strait even though it's not part of the source list. The other strait to be removed is the English Channel, which my western bias disagrees with. 😄 The only way I see to keep the English Channel would be to consider "channels" separately from straits and to apply only the >= 2 states criterion to them. This would include the Mozambique Channel while still excluding the Bristol Channel.
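Here is a small Python sketch of the strait criteria with the separate "channel" variant. The `kind` field, the other field names, and the assumption that the 50,000 figure is in km2 are all hypothetical; the rows are placeholders rather than values from physical-entities.xlsx.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    name: str
    kind: str  # "strait" or "channel" (hypothetical categorisation)
    is_transit: bool
    area: float  # same unit as the 50,000 threshold; km2 assumed
    bordering_states: int


def include(p: Passage) -> bool:
    """Channels: >= 2 states. Straits: is transit OR (area <= 50,000 AND >= 2 states)."""
    if p.kind == "channel":
        return p.bordering_states >= 2
    return p.is_transit or (p.area <= 50_000 and p.bordering_states >= 2)


# Placeholder rows, NOT real data.
passages = [
    Passage("Strait A", "strait", is_transit=True, area=90_000, bordering_states=1),
    Passage("Strait B", "strait", is_transit=False, area=30_000, bordering_states=2),
    Passage("Strait C", "strait", is_transit=False, area=80_000, bordering_states=2),
    Passage("Channel D", "channel", is_transit=False, area=80_000, bordering_states=2),
    Passage("Channel E", "channel", is_transit=False, area=10_000, bordering_states=1),
]

kept = [p.name for p in passages if include(p)]
print(kept)
```

Treating channels as their own branch keeps the strait rule untouched, so a large channel bordered by two states passes while a channel within a single state does not.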

aplaice commented 3 years ago

That looks reasonable!

It'd be a shame to exclude the English Channel, while including the Mozambique channel doesn't feel like an issue, so I'd vote for the separate "channel" criterion.