mbloch / mapshaper

Tools for editing Shapefile, GeoJSON, TopoJSON and CSV files
http://mapshaper.org

`run` command that imports the return value #608

Closed indus closed 10 months ago

indus commented 11 months ago

I wonder if there is a way to run a required command that returns, e.g., GeoJSON that gets imported directly into the mapshaper context.

I think this would be handy to act like a simple plugin system.

I'm thinking of using 3rd-party geo libraries like h3-js polygonToCells, geojson-buffer or turf...

Maybe it could just be an extension of the -run command and look like this:

mapshaper somePoly.shp -require 'myH3wraper.js' -run 'poly2Cells(target)' input -o poly_h3.json

The input option after -run would make mapshaper use the return value (an Array) as new inputs instead of running it as a command. Alternatively, it could be a new command named '-add' or '-import'.

mbloch commented 11 months ago

You can do something like this already using the -each command. There's a getter/setter named this.geojson that is available in -each expressions. The getter returns a GeoJSON Feature and the setter expects a GeoJSON Feature. (see https://github.com/mbloch/mapshaper/blob/master/src/expressions/mapshaper-each-geojson.mjs). Once the this.geojson setter is updated to accept feature collections (and nulls too), you could go:

mapshaper somePoly.shp -require 'myH3wraper.js' -each 'this.geojson = poly2Cells(this.geojson)' -o poly_h3.json
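The per-feature contract described above (Feature in, Feature out) can be sketched without h3-js. The module below is a minimal illustration, not code from the thread: the function name toBBox and the file name bbox_module.js are hypothetical, and it assumes Polygon or MultiPolygon input.

```javascript
// bbox_module.js (hypothetical file name), meant to be loaded with -require.
// Sketch only: assumes each feature has Polygon or MultiPolygon geometry.

// Flatten arbitrarily nested coordinate arrays into a list of [x, y] positions.
function positions(coords) {
  return typeof coords[0] === "number" ? [coords] : coords.flatMap(positions);
}

// Return a new Feature whose geometry is the bounding rectangle of the input.
function toBBox(feat) {
  let x0 = Infinity, y0 = Infinity, x1 = -Infinity, y1 = -Infinity;
  for (const [x, y] of positions(feat.geometry.coordinates)) {
    if (x < x0) x0 = x;
    if (x > x1) x1 = x;
    if (y < y0) y0 = y;
    if (y > y1) y1 = y;
  }
  return {
    type: "Feature",
    properties: feat.properties, // attributes are carried over unchanged
    geometry: {
      type: "Polygon",
      coordinates: [[[x0, y0], [x0, y1], [x1, y1], [x1, y0], [x0, y0]]]
    }
  };
}

module.exports = { toBBox };
```

Following the pattern of the command above, this would presumably be called as -require bbox_module.js -each 'this.geojson = toBBox(this.geojson)'.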

You might want to process the somePoly layer as a single FeatureCollection rather than feature-by-feature... let me think about how the syntax for that might work.

I wonder if there could be a new way to use the -run command (or maybe add a new command)... it could work like this:

-run 'target.geojson = poly2Cells(target.geojson)'

indus commented 11 months ago

I think I have used this feature in -each before, but I didn't think of it this time, maybe because of its current "one in, one out" nature. A modified -run, or a new -add, -import or maybe -create command, could be designed so that a target to start with is optional. But I understand that this would increase the API surface and duplicate functionality. I'll see how far I get with the -each command and multipart geometries for 1:n relations. Thank you for the hint.

indus commented 10 months ago

I can confirm that the -each command with the geojson getter/setter is definitely an option. I tested the h3 case and came up with this contraption (ms_h3.js)...

const h3 = require("h3-js");

let offset;

module.exports = {
    h3: function (target, res = 3, keep) {
        offset = 0;
        if (typeof keep === 'string') keep = `"${keep}"`;
        return `
        -explode
        -each this.geojson=h3_each(this.geojson,${res},${keep})
        -explode
        -clean
        -each h3_postex(this)`;
    },
    h3_each: function (feat, res, keep) {
        const hexagons = h3.polygonToCells(feat.geometry.coordinates, res, true);
        const coordinates = hexagons.map(h3Idx => [h3.cellToBoundary(h3Idx, true)]);

        const properties = { $offset: offset, _h3: hexagons };

        if (keep) {
            let props = feat.properties;
            if (typeof keep === 'string') {
                props = Object.fromEntries(
                    keep.split(",")
                        .filter(key => key in props)
                        .map(key => [key, props[key]])
                );
            }

            Object.assign(properties, props);
        }

        offset += hexagons.length;

        return {
            type: "Feature",
            geometry: { type: "MultiPolygon", coordinates },
            properties
        };
    },
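    // Runs after the second -explode, when every hexagon is its own feature
    // but still carries the full "_"-prefixed arrays of its parent feature.
    h3_postex: function (feat) {
        Object.keys(feat.properties).forEach((_key) => {
            if (_key.startsWith('_')) {
                let key = _key.slice(1);
                // Position of this hexagon within its parent's array:
                // layer-wide feature id minus the parent's starting offset.
                let i = feat.id - (feat.properties.$offset || 0);
                feat.properties[key] = feat.properties[_key][i];
                delete feat.properties[_key];
            }
        });
        delete feat.properties.$offset;
    }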
};

... that can be called like this:

mapshaper ne_110m_admin_0_countries.shp `
-filter "['Germany','Italy'].includes(NAME)" `
-require ms_h3.js `
-run "h3(target,5,'NAME,ISO_A3')" `
-o ger_ita_h3_res5.json "format=geojson"

Once the this.geojson setter is updated to accept feature collections (and nulls too) ...

This would make it much easier.

mbloch commented 10 months ago

You found a smart solution... seems like we could make this sort of thing much easier...

This morning I published an update (v0.4.66) that improves the this.geojson setter... now you can assign a null or a FeatureCollection, as well as a Feature or bare geometry.

I'll keep thinking about extending the -run command or something similar.
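To make the widened setter concrete, here is a small sketch of a per-feature function whose return type varies. It is an illustration under assumptions, not code from mapshaper: the name splitParts is hypothetical, and the thread does not say exactly what mapshaper does with a null (for example, whether the record is kept with empty geometry).

```javascript
// Sketch of a function for use as: -each 'this.geojson = splitParts(this.geojson)'
// Assumes the setter accepts null, a Feature or a FeatureCollection,
// as described for v0.4.66.
function splitParts(feat) {
  const geom = feat && feat.geometry;
  if (!geom) return null;                        // null: no geometry to output
  if (geom.type !== "MultiPolygon") return feat; // Feature: passed through as-is
  // FeatureCollection: one Polygon feature per part of the MultiPolygon
  return {
    type: "FeatureCollection",
    features: geom.coordinates.map((rings, part) => ({
      type: "Feature",
      properties: { ...feat.properties, part }, // part = index within the parent
      geometry: { type: "Polygon", coordinates: rings }
    }))
  };
}

module.exports = { splitParts };
```

Returning a FeatureCollection is what turns the "one in, one out" behaviour of -each into a 1:n relation without the explode-and-offset workaround.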

indus commented 10 months ago

👏 that is a great improvement.

It cuts down the complexity significantly:

const h3 = require("h3-js");

module.exports = {
    h3: function (feat, res = 3, keep) {
        const hexagons = h3.polygonToCells(feat.geometry.coordinates, res, true);

        let props;
        if (keep) {
            props = feat.properties;
            if (typeof keep === 'string')
                props = Object.fromEntries(
                    keep.split(",")
                        .filter(key => key in props)
                        .map(key => [key, props[key]]));

        }

        const features = hexagons.map(h3Idx => {
            const coordinates = [h3.cellToBoundary(h3Idx, true)];
            const properties = Object.assign({ h3: h3Idx }, props);

            return {
                type: "Feature",
                geometry: { type: "Polygon", coordinates },
                properties
            }
        });

        return {
            type: "FeatureCollection",
            features
        };
    }
};

mapshaper `
"ne_110m_admin_0_countries.shp" `
-filter "['Germany','Italy'].includes(NAME)" `
-explode `
-require ms_h3.js `
-each "this.geojson=h3(this.geojson,5,'NAME,ISO_A3')" `
-clean `
-o ger_ita_h3_res5.json "format=geojson"

I think this change is fantastic and opens up many new possibilities. If you are still thinking about using an additional or extended function to create geometries without existing input, I have a few more thoughts on this.

The question of whether this functionality should be a variation of the -run command is linked to how this "run" is to be understood. At first I thought it would mean "run the custom function I give to you", but what the -run command does at the moment is more like "run what my custom function returns to you". If the latter is the intention of the -run command, it should not allow data as a return value.

There might be another place where new geometry could be created: the import command -i. If -i accepted a function that returns any of the allowed input formats, you could not only create geometry but also load data from external sources using a promise. This is already possible in programmatic use, but for command-line use it would allow something like this:

mapshaper -require ms_import.js -i fetch("https://github.com/nvkelso/.../ne_50m_admin_0_countries.geojson") -o -

or

mapshaper -require ms_create.js -i circleLatLngRadius([10,20],30) -o -

mbloch commented 10 months ago

I'm not quite ready to support function calls as arguments to the -i command, but I've added some new things to the -run command that let you accomplish the same thing. (See the wiki, https://github.com/mbloch/mapshaper/wiki/Command-Reference#-run)

There's a new io object that can be passed to an external function. It has an io.addInputFile(<filename>, <data>) method that lets you import a dataset and reference it in a -i command using a filename. This is a bit similar to the way that data gets imported in mapshaper's Node API function applyCommands().

The target object, which you can also pass to an external function, has a new getter, target.geojson, which returns the target layer as a FeatureCollection. You could then edit the layer's data in your external script and then re-import it.

Finally, I added the ability to put JSON-formatted data (GeoJSON, TopoJSON or an array of JSON records) directly into the command line, like this: -i [{"foo":"bar"}]. If your JSON contains spaces, you would have to quote it: -i '[{"foo": "bar"}]'.

These updates should make the -run command much more versatile.
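As a rough sketch of how these pieces might fit together, the external function below reads target.geojson, derives a new dataset, registers it through io.addInputFile() and returns a command string for -run to execute. The function and file names are hypothetical, and passing the data as a JSON string is an assumption; the wiki is the authority on what the method accepts.

```javascript
// centers.js (hypothetical), loaded with -require and called from -run.
// Assumptions: target.geojson is a FeatureCollection, io.addInputFile(name, data)
// registers data under a file name, and the string returned to -run is
// executed as mapshaper commands.

// Call fn on every [x, y] position inside nested coordinate arrays.
function eachPosition(coords, fn) {
  if (typeof coords[0] === "number") fn(coords);
  else coords.forEach(c => eachPosition(c, fn));
}

// Build one Point per input feature at the center of its bounding box.
function bboxCenters(target, io, name = "centers.json") {
  const features = [];
  for (const f of target.geojson.features) {
    if (!f.geometry || !f.geometry.coordinates) continue; // skip empty geometry
    let x0 = Infinity, y0 = Infinity, x1 = -Infinity, y1 = -Infinity;
    eachPosition(f.geometry.coordinates, ([x, y]) => {
      x0 = Math.min(x0, x); x1 = Math.max(x1, x);
      y0 = Math.min(y0, y); y1 = Math.max(y1, y);
    });
    if (x0 === Infinity) continue; // no positions found
    features.push({
      type: "Feature",
      properties: f.properties,
      geometry: { type: "Point", coordinates: [(x0 + x1) / 2, (y0 + y1) / 2] }
    });
  }
  // Data is passed as a JSON string here (an assumption, see above).
  io.addInputFile(name, JSON.stringify({ type: "FeatureCollection", features }));
  return `-i ${name}`; // command for -run to execute next
}

module.exports = { bboxCenters };
```

On the command line this would presumably look something like -require centers.js -run 'bboxCenters(target, io)', which covers the sequence require module, run script, add to io, import by name.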

indus commented 10 months ago

This all sounds great. I'm currently working with -each and returning FeatureCollections, and it works really well for my current use case, as I benefit from a per-feature approach. But I will test the new functionality asap.

indus commented 10 months ago

I've tried the new options of the -run command and I think they give all the flexibility one could think of. The 3-4 steps necessary (require module, run script, add to io, import by name) may feel a bit clumsy/verbose, but it is doable. :+1: Thank you.

mbloch commented 9 months ago

Hi, I've made some updates to the -run command that you might want to know about. There's a new interpolation syntax, where you can put calls to external functions inside curly braces and use the output in an -i command (for example). The examples in the wiki show how it works (https://github.com/mbloch/mapshaper/wiki/Command-Reference#-run).

This isn't any less verbose than my earlier attempts... I guess I'm going for versatility more than conciseness here. It would be nice to find a less kludgy syntax. Maybe the { } operator could convert data to a temp file for -i to use, to avoid having to call io.ifile() (or equivalent).

indus commented 9 months ago

Thanks for the proactive information. I haven't tested it yet, but here is what I expect from looking at the docs:

-run '-i {io.ifile("voronoi.json", voronoi(target.geojson, target.bbox))}'

I'm going to test my assumptions, and see which hold true...
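The voronoi function in the expression above is not shown in the thread. As a stand-in with the same two-argument shape (a FeatureCollection plus a bounding box, returning a FeatureCollection), the hypothetical function below splits the extent into four rectangles and counts the points in each. It assumes that target.bbox is an [xmin, ymin, xmax, ymax] array and that the layer holds Point features.

```javascript
// Hypothetical stand-in for voronoi(fc, bbox): same inputs, simpler output.
// Assumes bbox is [xmin, ymin, xmax, ymax] and fc contains Point features.
function quadrants(fc, bbox) {
  const [xmin, ymin, xmax, ymax] = bbox;
  const xmid = (xmin + xmax) / 2;
  const ymid = (ymin + ymax) / 2;
  // Four rectangles covering the extent, each as [x0, y0, x1, y1].
  const cells = [
    [xmin, ymin, xmid, ymid], [xmid, ymin, xmax, ymid],
    [xmin, ymid, xmid, ymax], [xmid, ymid, xmax, ymax]
  ];
  const points = fc.features
    .filter(f => f.geometry && f.geometry.type === "Point")
    .map(f => f.geometry.coordinates);
  return {
    type: "FeatureCollection",
    features: cells.map(([x0, y0, x1, y1], id) => ({
      type: "Feature",
      properties: {
        id,
        // Points lying on a shared edge are counted in every cell touching it.
        count: points.filter(([x, y]) => x >= x0 && x <= x1 && y >= y0 && y <= y1).length
      },
      geometry: {
        type: "Polygon",
        coordinates: [[[x0, y0], [x0, y1], [x1, y1], [x1, y0], [x0, y0]]]
      }
    }))
  };
}

module.exports = { quadrants };
```

Whether such a function can be dropped into the curly-brace expression exactly as written above is one of the assumptions that still had to be tested at this point in the thread.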

indus commented 9 months ago

I fail to test right at the start #613 😢

mbloch commented 9 months ago

#613 should be resolved now. To your questions...

indus commented 9 months ago

This all sounds very reasonable. I had time to make my first steps with the new syntax and it works quite well for me. 👍