Atomic `Polygon` model

amaury1093 commented 5 years ago

From @amaurymartiny on October 10, 2018 22:27

Proposed by @wtokumaru in #26:

Can we separate the geojson data from the Territory and have it as a separate Polygon (or whatever) Entity that different territories can point to (like how they point to a Nation/PoliticalEntity)? We need this in order to avoid having like 20 Belaruses or Romes or Alsace-Lorraines. Even the notion of a territory being something that trades hands, or an administrative border/subdivision retained between social changes, inherently cannot be limited to a single continuous time range...

This model would have the following fields:

id
name
geometry (geojson field)
references

I personally like this idea. I had in mind to create a new Territory each time a PoliticalEntity changes borders, thinking that some geojson will be duplicated, but that's okay b/c db storage is cheap. But if we want a clean solution, this is definitely the way to go.

We will create a small collection of all known atomic territories, and future mappers can just pick and assemble them to build Territories.

Note: This corresponds exactly to CIDOC's http://www.cidoc-crm.org/Entity/E94-Space-Primitive/Version-6.2.

Problems:

Once we have a small collection of Polygons, how to efficiently search through them?
- We'll have a name field, but names can be duplicated, imprecise, etc.
- If the search algorithm is not performant, mappers will create duplicate polygons, and we'll have polygons that are 99% similar.
If I want to create the Territory of France today in 2018, probably I'll find the Polygons of the 101 departments in our collection. So UX-wise, I basically need to search/click on 101 items? (possibly more, if the department is not the most atomic polygon)
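One cheap way to catch the "99% similar" polygons before they are saved could be a bounding-box comparison. The sketch below is only an assumption about how such a check might look: the names `bbox`, `bbox_iou` and `likely_duplicates`, the 0.95 threshold, and the plain-coordinate representation are all hypothetical. It is a coarse prefilter, not a real geometry comparison, so any hit would still need to be confirmed against the actual shapes.

```python
from typing import Dict, List, Tuple

Ring = List[Tuple[float, float]]          # exterior ring as (lng, lat) pairs
BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bbox(ring: Ring) -> BBox:
    """Axis-aligned bounding box of a ring."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def bbox_iou(a: BBox, b: BBox) -> float:
    """Intersection-over-union of two bounding boxes, in [0, 1]."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def likely_duplicates(new_ring: Ring, existing: Dict[str, Ring],
                      threshold: float = 0.95) -> List[str]:
    """Ids of stored rings whose bounding box nearly matches new_ring's box."""
    new_box = bbox(new_ring)
    return [pid for pid, ring in existing.items()
            if bbox_iou(new_box, bbox(ring)) >= threshold]
```

A mapper-facing form could run this on submit and show the candidates it returns, so the mapper can reuse an existing polygon instead of uploading a near copy.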

Copied from original issue: chronoscio/backend#53

amaury1093 commented 5 years ago

From @dwaxe on October 10, 2018 23:47

The database schema for this would look something like:

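from django.contrib.gis.db import models


class SpacePrimitive(models.Model):
    # Field types and options are assumptions; the thread only names the fields.
    name = models.CharField(max_length=255)
    geom = models.GeometryField()
    references = models.TextField(blank=True)


class PoliticalEntity(models.Model):
    name = models.CharField(max_length=255)
    ...
    owned_geometries = models.ManyToManyField(SpacePrimitive, through='OwnershipPeriod')


class OwnershipPeriod(models.Model):
    # One row per span of time during which an entity owned a polygon.
    start_date = models.DateField()
    end_date = models.DateField()
    polygon = models.ForeignKey(SpacePrimitive, on_delete=models.CASCADE)
    political_entity = models.ForeignKey(PoliticalEntity, on_delete=models.CASCADE)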
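To make the role of OwnershipPeriod concrete, here is a minimal in-memory sketch of two lookups such a through-table would allow: which polygons an entity owns on a given date, and how often one polygon changed hands in a date range. It uses plain dataclasses instead of Django so it runs on its own. The function names, the exclusive `end_date`, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class OwnershipPeriod:
    """In-memory stand-in for one OwnershipPeriod row."""
    polygon_id: str
    political_entity: str
    start_date: date
    end_date: date  # assumed exclusive: ownership ends the day before


def owned_polygons(periods: List[OwnershipPeriod], entity: str, on: date) -> List[str]:
    """Sorted ids of the polygons that `entity` owns on the date `on`."""
    return sorted({
        p.polygon_id
        for p in periods
        if p.political_entity == entity and p.start_date <= on < p.end_date
    })


def count_hand_changes(periods: List[OwnershipPeriod], polygon_id: str,
                       start: date, end: date) -> int:
    """Times `polygon_id` passes to a different owner within [start, end]."""
    history = sorted(
        (p for p in periods if p.polygon_id == polygon_id),
        key=lambda p: p.start_date,
    )
    return sum(
        1
        for prev, cur in zip(history, history[1:])
        if cur.political_entity != prev.political_entity
        and start <= cur.start_date <= end
    )


# Made-up data: polygon "p1" goes A -> B -> A, polygon "p2" stays with A.
EXAMPLE_PERIODS = [
    OwnershipPeriod("p1", "A", date(1800, 1, 1), date(1850, 1, 1)),
    OwnershipPeriod("p1", "B", date(1850, 1, 1), date(1900, 1, 1)),
    OwnershipPeriod("p1", "A", date(1900, 1, 1), date(1950, 1, 1)),
    OwnershipPeriod("p2", "A", date(1800, 1, 1), date(1950, 1, 1)),
]
```

In the Django version these would presumably become filtered queries on OwnershipPeriod rather than list comprehensions.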

amaury1093 commented 5 years ago

From @whirish on October 11, 2018 3:1

I like this idea, my only question would be how practical it would be. In cases where a SpacePrimitive is divided for a peace treaty, we would have to separate it into multiple SpacePrimitives, and then update all existing references to it. I suppose most land acquisitions aren't this arbitrary but I'm sure it's happened and could be a pain on the backend.

owned_geometries = ManyToManyField(GeoFeature, through='OwnershipPeriod')

I assume this is supposed to be ManyToManyField(SpacePrimitive, through='OwnershipPeriod')? Thanks for the reference schema.

amaury1093 commented 5 years ago

From @ataalik on October 11, 2018 11:24

So for this to be more optimized, territories like Rome, Belarus etc. need to change hands as a whole without changing their overall shape? Is that the most frequent thing to happen in history? I feel like most of the time territories will be annexed by another neighboring territory. In that case are we going to create a union of the two shapes? What if not the whole territory gets annexed?

amaury1093 commented 5 years ago

From @wtokumaru on October 14, 2018 0:19

Any territorial change must either use existing borders or create new ones. With sufficiently clever stitching and caching I do not see there being anything that this would make more difficult than what we have currently in any situation.

amaury1093 commented 5 years ago

From @ataalik on October 14, 2018 1:14

Well, I guess you might be right. The only overhead this would create for us is building a system that somehow lets editors filter and choose from older borders. Conforming to CIDOC is also a plus.

amaury1093 commented 5 years ago

From @wtokumaru on October 14, 2018 2:43

It should also be possible to just submit new borders alongside the set of existing ones that they select. Or to just submit entirely new borders, in the same manner as we currently do.

amaury1093 commented 5 years ago

From @whirish on October 14, 2018 21:42

I'm thinking of war occupations: very small portions of ground would be gained and lost by both sides on a daily basis, and an atomic polygon will be created for each. Existing references to larger polygons encompassing these areas that would have been owned by a nation will have to be updated to point to the new subdivided polygons, which I suppose is not excessively difficult to implement, though it could potentially be a huge database operation. My issue lies more with mappers creating future territories: there will be massive amounts of incredibly small atomic polygons they will have to filter through and select to model a nation.

amaury1093 commented 5 years ago

The question I have is: do we accept overlapping SpacePrimitives?

Intuitively, I'd say no, for data consistency and no duplicates.
But practically I'd say yes.

If no (solution 1): That would be the ideal solution, and we could ask the data questions like "how many times has Alsace-Lorraine changed hands between 1850-1950?" However, we need to solve:

The war scenario described by @whirish just above. That's a HUGE db write operation that we need to make each time a mapper splits one atomic SpacePrimitive into 2 smaller ones. I really think it's a risky operation (and may happen often), so I'm not in favor of this, unless we can come up with some solution.
- One proposal: have some sort of hierarchy inside SpacePrimitives. When we split one SpacePrimitive into two, it creates 2 new smaller SpacePrimitive instances, and the 2 children have the same parent_id. We then won't need to update any space_primitive_ids inside OwnershipPeriod.
UX. If we have small atomic SpacePrimitives, the mappers need to select thousands of those to create a political entity.
- Proposal: when they click on an empty area of the map, it just adds the (unique) SpacePrimitive on the map. Repeat until the whole political entity is covered. If the added SpacePrimitive is too big, then they need to do the split operation. If it doesn't exist, then add a new shapefile or edit an existing one.
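The parent_id idea can be sketched roughly as follows. This is an in-memory illustration under assumptions, not the project's model: `PrimitiveStore`, `split` and `leaves` are hypothetical names, and geometry is left out entirely. The point it shows is that a split only inserts child rows, so anything already pointing at the parent id stays valid and can be resolved to the most atomic pieces on read.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SpacePrimitive:
    id: int
    name: str
    parent_id: Optional[int] = None  # None for a primitive that was never split off


class PrimitiveStore:
    """Hypothetical in-memory table of SpacePrimitives."""

    def __init__(self) -> None:
        self._items: Dict[int, SpacePrimitive] = {}
        self._next_id = 1

    def add(self, name: str, parent_id: Optional[int] = None) -> SpacePrimitive:
        prim = SpacePrimitive(self._next_id, name, parent_id)
        self._items[prim.id] = prim
        self._next_id += 1
        return prim

    def split(self, parent_id: int, child_names: List[str]) -> List[SpacePrimitive]:
        """Insert children under parent_id; the parent row is not modified."""
        if parent_id not in self._items:
            raise KeyError(parent_id)
        return [self.add(name, parent_id) for name in child_names]

    def leaves(self, prim_id: int) -> List[int]:
        """Ids of the most atomic descendants of prim_id (itself if unsplit)."""
        children = [p.id for p in self._items.values() if p.parent_id == prim_id]
        if not children:
            return [prim_id]
        result: List[int] = []
        for child_id in children:
            result.extend(self.leaves(child_id))
        return result
```

One open question this leaves is precedence: if a child gets its own ownership row for a period in which the parent is also owned, the child row would presumably have to win, and the thread does not settle that.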

If yes (solution 2): I think that this solution is "at least as good, but most probably better" than the current implementation:

right now we duplicate geojsons no matter what
if we implement this, we might have duplicate atomic SpacePrimitives. The better we make the search algorithm, the fewer duplicates we have.
UX proposal, so that we don't even need a search algorithm:
- when the mapper creates the polygon for a political entity, he can hover his mouse over an empty area on the map, right-click and select "Search for territories here", and it would show on the left pane all territories that cover that particular lat/lng point.
- The mapper can then just select the territory he wants to add to this political entity. Repeat until the entire political entity is covered.
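The "Search for territories here" lookup boils down to a point-in-polygon test against every stored shape. Below is a minimal sketch using the standard ray-casting technique on plain coordinate lists. The names `point_in_ring` and `territories_at` are hypothetical, and the code assumes simple rings in planar coordinates, so it ignores holes, multipolygons and the antimeridian. With GeoDjango a spatial lookup in the database would likely replace this.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (lng, lat)


def point_in_ring(point: Point, ring: List[Point]) -> bool:
    """Ray casting: cast a ray to the right and count the edges it crosses."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # odd number of crossings means inside
    return inside


def territories_at(point: Point, polygons: Dict[str, List[Point]]) -> List[str]:
    """Names of all stored polygons that cover the clicked point."""
    return [name for name, ring in polygons.items() if point_in_ring(point, ring)]
```

With overlapping SpacePrimitives allowed, the returned list can contain several candidates for one click, which is exactly what the left pane would show.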

So overall, I'm in favor of implementing this, no matter what, with a priority on solution 1 if we find a way to solve the SpacePrimitive splitting problem.

MiklerGM commented 5 years ago

My thoughts about atomic polygons: it's a good idea that lets us solve some problems with merging maps across years. But it's very expensive to add a new border (which is what you usually do).

The main drawback of this proposal is the extremely large data overhead for the client to download.

This approach is not easy to combine with vector tiles.

Using SpacePrimitives without any duplicated data makes it almost impossible to draw additional layers, for example when a city has grown and you want to extend its borders. Drawing an additional layer for the expansion of a religion is also impossible.

By my calculations this approach will not save the client's bandwidth. You still need to load the whole world map no matter what, but now with tons of additional useless inner borders. It would be possible not to fetch any additional data after changing the year, but it will cost us lots of RAM (so it's not suitable for mobile devices).

Right now we don't have any UX for mappers in our roadmap (for the next 3-5 months it's not a top priority). Which kind of UI are you talking about?

In Chronist we stored geodata separately from props for the same reason: to avoid duplicated data. We stored each unique territory and combined it with the right props on demand.

We used ShapeFiles (one per year for the whole world) with all necessary data in the dbf as a buffer between mappers and the database.
This is also not suitable for use with vector tiles magic.

MiklerGM commented 5 years ago

Thinking about atomic polygons from a different perspective.

We can create a list of atomic polygons from shp files. It can be done with Python bindings for GIS software. Each mapper will be able to download a shapefile with all atomic polygons and edit it in his own copy of QGIS/ArcGIS.
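The idea of deriving atomic pieces from per-year maps can be illustrated without any GIS library by replacing geometry with grid cells. This is a deliberately simplified analogy under assumptions: each yearly map is a dict from territory name to a set of cell ids, and `atomic_pieces` is a hypothetical name. A real implementation would run an overlay operation on the polygon geometries instead, and would also split a piece whose cells are not contiguous.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Hashable, List, Set

Cell = Hashable
YearlyMap = Dict[str, Set[Cell]]  # territory name -> cells it covers that year


def atomic_pieces(yearly_maps: Dict[int, YearlyMap]) -> List[FrozenSet[Cell]]:
    """Group cells that sit in the same territory in every yearly map."""
    # One cell -> territory lookup per year, in chronological order.
    lookups = []
    for year in sorted(yearly_maps):
        lookup = {}
        for territory, cells in yearly_maps[year].items():
            for cell in cells:
                lookup[cell] = territory
        lookups.append(lookup)

    # Cells with an identical ownership history form one atomic piece.
    groups = defaultdict(set)
    for lookup in lookups:
        for cell in lookup:
            signature = tuple(other.get(cell) for other in lookups)
            groups[signature].add(cell)
    return [frozenset(cells) for cells in groups.values()]
```

For two yearly maps where one cell moves from territory X to territory Y, the result is three pieces: the part that stayed X, the part that stayed Y, and the cell that changed sides.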

MiklerGM commented 5 years ago

I was wondering how our map would look if we drew every territory border. Keep in mind that we did not split countries into regions (the number of maps would be multiplied). It covers only the last 200 years. These are simplified maps: we keep only 5% of all the data. As a result some lines are doubled, because we are storing whole territories, not atomic polygons, and simplifying the maps can shift the borders.

amaury1093 commented 5 years ago

Replaced with #11, cf the model structure there

chronhq / backend

Atomic `Polygon` model #1