mbloch / mapshaper

Tools for editing Shapefile, GeoJSON, TopoJSON and CSV files
http://mapshaper.org

msx/snapshot format without history #606

Open indus opened 11 months ago

indus commented 11 months ago

Is there a way to use the msx snapshot format without the history, so that only the last state is written? I would like to use it because I think it offers a good ratio of speed to size for temporary storage of intermediate processing steps.

Shapefiles and NDJSON files are bigger. Only gz compression makes them smaller, but it also makes them much slower to write and read.

-clean only-arcs helps to get rid of the data history when using a filter, but I don't know how to do the same for -simplify.

mbloch commented 11 months ago

What mapshaper is saving is not exactly a history state. When you apply simplification, it's like adding a filter over the original vertices that can be removed or updated. When you export a snapshot after applying simplification, you're saving all the original vertices plus simplification threshold data for each vertex. The threshold data lets mapshaper change the amount of simplification on-the-fly, without recomputing anything.
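As a rough illustration of the idea described above, here is a minimal Python sketch of a per-vertex threshold filter. It is a conceptual model only, not mapshaper's actual code or data layout: the function name `visible_vertices`, the sample coordinates, the threshold values, and the use of infinite thresholds for endpoints are all assumptions made for the example.

```python
# Conceptual model only: NOT mapshaper's real implementation or file format.
# All names and values here are hypothetical.
from typing import List, Tuple

Vertex = Tuple[float, float]


def visible_vertices(
    vertices: List[Vertex], thresholds: List[float], cutoff: float
) -> List[Vertex]:
    """Return the vertices whose stored threshold is at least `cutoff`.

    The original vertices are never modified, so the cutoff can be
    changed at any time without recomputing the thresholds.
    """
    return [v for v, t in zip(vertices, thresholds) if t >= cutoff]


# One path with a precomputed threshold per vertex.
path = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 1.5), (4.0, 0.0)]
# Assumption: endpoints get an infinite threshold so they are always kept.
thresholds = [float("inf"), 0.1, 0.8, 1.5, float("inf")]

unsimplified = visible_vertices(path, thresholds, 0.0)  # all 5 vertices
light = visible_vertices(path, thresholds, 0.5)         # drops (1.0, 0.1)
heavy = visible_vertices(path, thresholds, 1.0)         # also drops (2.0, 0.0)
```

Because the filter only reads the stored thresholds, moving between `light` and `heavy` is a cheap pass over the data, at the cost of keeping every original vertex plus one number per vertex.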

I suspect that very few users expect to be able to undo simplification after re-importing a snapshot, so I'm tempted just to bake in the current simplification amount when you export a snapshot. I would leave the simplification data in the temporary snapshots that get made when you click "create a snapshot" under the ribbon menu, but remove the data when you export a snapshot.
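To make the difference between the two kinds of snapshot concrete, here is a small Python sketch, assuming a simplified in-memory model. The function `make_snapshot`, the dictionary keys, and the `bake` flag are hypothetical and do not reflect the real msx format.

```python
# Conceptual model only: the dictionary layout below is hypothetical and
# is NOT the msx file format.
from typing import Dict, List, Tuple

Vertex = Tuple[float, float]


def make_snapshot(
    vertices: List[Vertex],
    thresholds: List[float],
    cutoff: float,
    bake: bool,
) -> Dict[str, object]:
    """Build a snapshot of one path.

    bake=False models a temporary snapshot: every original vertex and its
    threshold are kept, so simplification can still be adjusted later.
    bake=True models an exported snapshot: only the vertices that survive
    the current cutoff are kept and the threshold data is dropped.
    """
    if not bake:
        return {
            "vertices": list(vertices),
            "thresholds": list(thresholds),
            "cutoff": cutoff,
        }
    kept = [v for v, t in zip(vertices, thresholds) if t >= cutoff]
    return {"vertices": kept}


path = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 1.5), (4.0, 0.0)]
thresholds = [float("inf"), 0.1, 0.8, 1.5, float("inf")]

temporary = make_snapshot(path, thresholds, cutoff=1.0, bake=False)
exported = make_snapshot(path, thresholds, cutoff=1.0, bake=True)
```

In this model the exported snapshot is smaller because it stores three vertices and no thresholds, but the simplification can no longer be undone after re-import.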

I propose to also automatically remove the path data that you're currently deleting using -clean only-arcs. That data isn't recoverable anyway. Its only purpose is to enable paths removed by -filter to be displayed as ghosted lines.

Does this sound reasonable?

mbloch commented 11 months ago

You asked for a way to remove the data... yes, there is. Some commands require that simplification be baked in before they can work. You can use -snap interval=0 for this.

indus commented 11 months ago

Your suggested changes sound like they would indeed help. Just to clarify, I would use the msx format programmatically. I assume that ... -o tmp.msx would behave as an 'export' and not as a 'temporary snapshot'.

Again - Thanks for your great work.

I will try baking in the simplification with -snap interval=0 for now.

mbloch commented 11 months ago

Yes, -o tmp.msx would behave like an export.

mbloch commented 11 months ago

Just published v0.6.45, which includes the update that removes simplification data from snapshot files. This version doesn't clean up paths that were removed by -filter, -dissolve and several other commands. I'm thinking about removing those paths immediately after the commands are run, as opposed to when a snapshot is exported. As I mentioned earlier, I'm retaining them in order to display a ghosted image of the removed paths. I haven't found those images to be particularly useful. (If someone wants to compare the edited layer and the original layer, they can always use the + option to create a new output layer rather than replacing the original layer).

indus commented 11 months ago

Thank you. I hope nobody will miss this easy way to check the outcome of these operations.