sofwerx / cdb2-concept

CDB modernization

Versioning and update - and a bit on date/time stamps #2

Open cnreediii opened 4 years ago

cnreediii commented 4 years ago

Tracey Birch

We had some discussion yesterday about versioning and change tracking. Cataloging came up a little (especially as it relates to current CDB versioning), but there were no giant AHAs. One possible line of investigation is how Git stores things on the repository side.

Greg Peele

feature-level and collection-level revisioning via deltas is incredibly useful when you need that audit trail. I prototyped ideas for this in LTF 1.0 to do minimal deltas for changes to feature type, attribute values, relationships to other features, and relative move/rotate/scale of an entire feature. I didn't quite go all the way with it to allow for vertex-level deltas though, and never did think of a good solution for continuous time-varying deltas.

Git may provide some insights but I suspect the final solution needs to be geometry and spatial aware.
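To make the minimal-delta idea above concrete, here is a small Python sketch. It is not taken from LTF 1.0; the `Feature` and `FeatureDelta` types, their fields, and the restriction to a 2D translation (no rotate/scale, no relationships, no vertex-level changes) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class Feature:
    """Hypothetical feature: a type, free-form attributes, and a 2D anchor point."""
    feature_type: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    position: Tuple[float, float] = (0.0, 0.0)


@dataclass
class FeatureDelta:
    """Stores only what changed; defaults mean 'no change'."""
    new_type: Optional[str] = None
    set_attributes: Dict[str, Any] = field(default_factory=dict)
    removed_attributes: Tuple[str, ...] = ()
    translate: Tuple[float, float] = (0.0, 0.0)


def diff_features(old: Feature, new: Feature) -> FeatureDelta:
    """Compute a minimal delta that turns `old` into `new`."""
    set_attrs = {
        key: value
        for key, value in new.attributes.items()
        if key not in old.attributes or old.attributes[key] != value
    }
    removed = tuple(sorted(k for k in old.attributes if k not in new.attributes))
    dx = new.position[0] - old.position[0]
    dy = new.position[1] - old.position[1]
    return FeatureDelta(
        new_type=new.feature_type if new.feature_type != old.feature_type else None,
        set_attributes=set_attrs,
        removed_attributes=removed,
        translate=(dx, dy),
    )


def apply_delta(feature: Feature, delta: FeatureDelta) -> Feature:
    """Return a new feature with the delta applied; the input is not modified."""
    attrs = {
        k: v for k, v in feature.attributes.items()
        if k not in delta.removed_attributes
    }
    attrs.update(delta.set_attributes)
    return Feature(
        feature_type=delta.new_type or feature.feature_type,
        attributes=attrs,
        position=(
            feature.position[0] + delta.translate[0],
            feature.position[1] + delta.translate[1],
        ),
    )
```

A whole-feature move is stored as a single offset rather than as rewritten geometry, which is the property that keeps the audit trail small. Vertex-level and continuous time-varying deltas, which the message above flags as unsolved, are deliberately out of scope here.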

Tracey Birch

Yes, we decided it is a hard problem and were looking at catalogs to simplify and speed up versioning by entire CDB and/or entire GPKG files.

Greg Peele

one piece of the prototype I did was a catalog-level revision history as well, which defined your revision IDs and timestamps. then the deltas were indexed to a particular revision. I used something closer to the SVN model (incrementally increasing revision IDs), but the Git model handles splits and merges way better.
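As a rough illustration of a catalog-level revision history with deltas indexed to a revision, the Python sketch below uses Git-style parent pointers instead of an incrementing counter, so that splits and merges can be represented. The `RevisionCatalog` class, its methods, and the use of SHA-1 over the revision contents as an ID are assumptions for this sketch, not the design of the prototype described above.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Revision:
    rev_id: str
    parents: Tuple[str, ...]  # zero parents = root, two or more = merge
    timestamp: datetime
    message: str


class RevisionCatalog:
    """Hypothetical in-memory catalog: revisions plus deltas keyed by revision ID."""

    def __init__(self) -> None:
        self.revisions: Dict[str, Revision] = {}
        self.deltas: Dict[str, List[str]] = {}

    def commit(
        self,
        parents: Sequence[str],
        message: str,
        deltas: Sequence[str],
        timestamp: Optional[datetime] = None,
    ) -> str:
        """Record a revision and its deltas; the ID is a hash of the contents."""
        ts = timestamp or datetime.now(timezone.utc)
        digest = hashlib.sha1()
        for part in (*parents, ts.isoformat(), message, *deltas):
            digest.update(part.encode("utf-8"))
            digest.update(b"\0")  # separator so adjacent parts cannot run together
        rev_id = digest.hexdigest()
        self.revisions[rev_id] = Revision(rev_id, tuple(parents), ts, message)
        self.deltas[rev_id] = list(deltas)
        return rev_id

    def history(self, rev_id: str) -> List[str]:
        """Return rev_id and all of its ancestors, each listed once."""
        seen, stack, order = set(), [rev_id], []
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            order.append(current)
            stack.extend(self.revisions[current].parents)
        return order
```

With an SVN-style counter, two editors branching from the same revision would both want the next number. With parent pointers, each branch gets its own ID and a later merge revision simply lists both as parents.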

of course, anything that goes in a central catalog is then a data loss risk if writes fail, and can introduce write concurrency performance hits once you get down to that level. journaled filesystems like ext3/ext4 and NTFS have to deal with similar issues.

"it is a hard problem" - accurate.

Tracey Birch

I thought the journaling capabilities of SQLite might be something to investigate as well.
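One way to explore that: the sketch below uses Python's standard `sqlite3` module to request write-ahead logging and to record a revision together with its deltas in a single transaction, so a failed write leaves no half-recorded revision behind. The table names, columns, and helper functions are hypothetical, and this is a plain SQLite file rather than a real GeoPackage schema.

```python
import sqlite3
from typing import Iterable, Optional, Tuple


def open_catalog(path: str) -> Tuple[sqlite3.Connection, str]:
    """Open (or create) a hypothetical catalog database and request WAL journaling.

    Returns the connection and the journal mode SQLite actually reports,
    which may differ from the one requested (for example on in-memory databases).
    """
    conn = sqlite3.connect(path)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS revision ("
        " id INTEGER PRIMARY KEY,"
        " created_utc TEXT NOT NULL,"
        " note TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS delta ("
        " revision_id INTEGER NOT NULL REFERENCES revision(id),"
        " feature_id TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn, mode


def record_revision(
    conn: sqlite3.Connection,
    created_utc: str,
    note: str,
    deltas: Iterable[Tuple[str, Optional[str]]],
) -> int:
    """Insert one revision row and its delta rows atomically."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if any statement raises.
    with conn:
        cursor = conn.execute(
            "INSERT INTO revision (created_utc, note) VALUES (?, ?)",
            (created_utc, note),
        )
        rev_id = cursor.lastrowid
        conn.executemany(
            "INSERT INTO delta (revision_id, feature_id, payload) VALUES (?, ?, ?)",
            [(rev_id, feature_id, payload) for feature_id, payload in deltas],
        )
    return rev_id
```

If any delta row is rejected, the revision row inserted just before it is rolled back too. That addresses the partial-write risk for a single catalog file; it does not by itself address write concurrency across many editors.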

Greg Peele

I also realized there were two different types of revisioning time variance that could be relevant, which complicated things. the first is actual time variance in the simulated history due to temporal changes. the second is editing decisions that are correcting errors or making changes that have nothing to do with the simulated timeline.

in the GEOINT case that distinction may not be important, but in the M&S case it can be.

if you really want your brain to hurt, there is a game out there called Achron, which is a military real-time strategy game with multiplayer time travel. describing the simulation timeline in that game is a mind screw. as far as I know the DoD hasn't implemented time travel yet though, so I think we can ignore that one.

in any case, I ended up settling on using "revisions" to describe changes outside the simulated timeline and "events" to describe changes that are a result of time variance inside the simulated world.
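One possible reading of the revision/event split, sketched in Python: every change record carries the editing revision that authored it, and an optional simulation time. A record with no simulation time is treated as a "revision" (a correction to the baseline data); a record with one is treated as an "event" that takes effect at that point on the simulated timeline. The `Change` type and `value_at` function are hypothetical and are not how the prototype modeled this.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Change:
    revision: int               # editing revision that introduced this record
    sim_time: Optional[float]   # None = baseline correction, number = simulated event time
    attribute: str
    value: Any


def value_at(
    changes: List[Change], attribute: str, revision: int, sim_time: float
) -> Any:
    """Value of `attribute` as seen at a given editing revision and simulation time."""
    visible = [
        c for c in changes
        if c.attribute == attribute
        and c.revision <= revision
        and (c.sim_time is None or c.sim_time <= sim_time)
    ]
    if not visible:
        return None
    # Events override the baseline; among events the latest simulation time wins;
    # remaining ties go to the newest editing revision.
    best = max(
        visible,
        key=lambda c: (c.sim_time is not None, c.sim_time or 0.0, c.revision),
    )
    return best.value


changes = [
    Change(revision=1, sim_time=None, attribute="lanes", value=2),
    Change(revision=2, sim_time=None, attribute="lanes", value=4),         # data correction
    Change(revision=1, sim_time=None, attribute="status", value="intact"),
    Change(revision=1, sim_time=100.0, attribute="status", value="destroyed"),  # simulated event
]
```

Querying `lanes` at revision 1 versus revision 2 shows the effect of an edit outside the simulated timeline, while querying `status` before and after simulation time 100 shows a change inside it. The two axes stay independent, which is the distinction the terminology is meant to capture.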

terminology is hard on some of these things.

Tracey Birch

I like that terminology, at least makes some distinction

We also talked about changes that were real-world events vs corrections or improvements to the data

Carl Reed

WRT time/temporal, the participants in the recent VTP2 Metadata activity identified a number of date/time metadata elements associated with the life and provenance of geo-content. I need to re-read the Engineering Report on that topic.