cccs-web / soc-maps

Web mapping application in support of social analysis.

Clarification request: "How does GeoGig store data?" #18

Open cccs-ip opened 9 years ago

cccs-ip commented 9 years ago

CCCS and Kartoza Pty have settled on the 'GeoGig' platform (formerly GeoGit) for distributed version control of the vector datasets (shapefile data).

GeoGig is written in Java and is available under an open-source BSD license. The software allows users to import raw geospatial data (currently from shapefiles, PostGIS or SpatiaLite).

Without direct experience of the software (that is, informed only by the public GeoGig documentation), it is difficult to grasp what happens during the GeoGig 'import' process.

The Boundless article on 'exploring' states: "Unlike a version control system like Git, the content of the current working tree cannot be explored directly as a normal folder containing files, so they are stored instead in a database which holds the complete structure of the repository. This means that any files stored in the same directory that contains the .geogig directory will be ignored by GeoGig." Boundless' illustrations further suggest that GeoGig behaves in the same way as a database.

Based on this understanding, CCCS' questions are as follows:

  1. Is GeoGig linked to its own database back end or to PostgreSQL? How does it version-control changes to its own database? Does it keep a parallel database?
  2. To what extent can (or does) the GeoGig system act as the database? Can we use GeoGig for web applications in the same way as PostgreSQL?
  3. What happens if we import both a PostgreSQL database and shapefiles? Or what if we wish to import different PostgreSQL databases acquired from different teams (some of which may contain duplicate data)? Is each of these "objects" a unique GeoGig entity, or are changes to the objects managed in situ (with each object remaining a separate and distinct entity)? Or does GeoGig restructure the data as part of its 'import' process (as happens when importing data into PostgreSQL), so that neither 'source' file object is relevant to GeoGig after the initial import?
  4. After import, do the source files disappear, or do the original object entities remain completely independent of the database?
  5. Are the challenges of importing multiple, different PostgreSQL databases into GeoGig the same as they would be for merging databases within PostgreSQL (e.g. conflicting schema names)?
  6. Does GeoGig allow us to re-export its data as a single PostgreSQL database, or does it keep each file entity separate?

Please help us to clarify our understanding.