indus commented 11 months ago

I wonder if there is a way to run a required command that returns, e.g., GeoJSON that gets imported directly into the mapshaper context.

I think this would be handy to act like a simple plugin system.

I'm thinking of using 3rd-party geo libraries like h3-js polygonToCells, geojson-buffer or turf...

maybe it could just be an extension of the run command and look like this: mapshaper somePoly.shp -require 'myH3wraper.js' -run 'poly2Cells(target)' input -o poly_h3.json

The input option after -run would make mapshaper not run the return value as a command but use the return value (Array) as new inputs. Or it could be a new command named '-add' or '-import'

mbloch commented 11 months ago

You can do something like this already using the -each command. There's a getter/setter named this.geojson that is available in -each expressions. The getter returns a GeoJSON Feature and the setter expects a GeoJSON Feature. (see https://github.com/mbloch/mapshaper/blob/master/src/expressions/mapshaper-each-geojson.mjs). Once the this.geojson setter is updated to accept feature collections (and nulls too), you could go:

mapshaper somePoly.shp -require 'myH3wraper.js' -each 'this.geojson = poly2Cells(this.geojson)' -o poly_h3.json
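For readers new to this pattern, here is a minimal sketch of a module that could be loaded with -require and called from an -each expression. The file name bbox_module.js and the function bboxOf are hypothetical. The function takes a GeoJSON Feature and returns a Feature, the shape the this.geojson getter and setter are described as using. It assumes single-part Polygon input.

```javascript
// bbox_module.js (hypothetical file name)
// Replaces a Polygon feature with the rectangle of its bounding box.
function bboxOf(feature) {
  const ring = feature.geometry.coordinates[0]; // outer ring of a Polygon
  const xs = ring.map(p => p[0]);
  const ys = ring.map(p => p[1]);
  const xmin = Math.min(...xs), xmax = Math.max(...xs);
  const ymin = Math.min(...ys), ymax = Math.max(...ys);
  return {
    type: "Feature",
    properties: feature.properties, // attributes are passed through unchanged
    geometry: {
      type: "Polygon",
      // closed ring: first and last positions are identical
      coordinates: [[[xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax], [xmin, ymin]]]
    }
  };
}

module.exports = { bboxOf };
```

Under those assumptions it might be called as: mapshaper somePoly.shp -explode -require bbox_module.js -each 'this.geojson = bboxOf(this.geojson)' -o bbox.json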

You might want to process the somePoly layer as a single FeatureCollection rather than feature-by-feature... let me think about how the syntax for that might work.

I wonder if there could be a new way to use the -run command (or maybe add a new command)... it could work like this:

-run 'target.geojson = poly2Cells(target.geojson)'

indus commented 11 months ago

I think I have used this feature of -each before, but didn't think of it now, maybe because of its current "one in, one out" nature. A modified run, add, import or maybe create command could be designed to make the initial target optional. But I understand that this would increase the API surface and duplicate functionality. I'll see how far I get with the each command and multipart geometries for 1:n relations. Thank you for the hint.

indus commented 10 months ago

I can confirm that the -each command with the geojson getter/setter is definitely an option. I tested the h3 case and came up with this contraption (_msh3.js)...

const h3 = require("h3-js");

let offset;

module.exports = {
    h3: function (target, res = 3, keep) {
        offset = 0;
        if (typeof keep === 'string') keep = `"${keep}"`;
        return `
        -explode
        -each this.geojson=h3_each(this.geojson,${res},${keep})
        -explode
        -clean
        -each h3_postex(this)`;
    },
    h3_each: function (feat, res, keep) {
        const hexagons = h3.polygonToCells(feat.geometry.coordinates, res, true);
        const coordinates = hexagons.map(h3Idx => [h3.cellToBoundary(h3Idx, true)]);

        const properties = { $offset: offset, _h3: hexagons };

        if (keep) {
            let props = feat.properties;
            if (typeof keep === 'string') {
                props = Object.fromEntries(
                    keep.split(",")
                        .filter(key => key in props)
                        .map(key => [key, props[key]])
                );
            }

            Object.assign(properties, props);
        }

        offset += hexagons.length;

        return {
            type: "Feature",
            geometry: { type: "MultiPolygon", coordinates },
            properties
        };
    },
    h3_postex: function (feat) {
        Object.keys(feat.properties).forEach((_key) => {
            if (_key.startsWith('_')) {
                let key = _key.slice(1);
                let i = feat.id - (feat.properties.$offset || 0)
                feat.properties[key] = feat.properties[_key][i];
                delete feat.properties[_key];
            }
        });
        delete feat.properties.$offset;
    }
};

... that can be called like this:

mapshaper ne_110m_admin_0_countries.shp `
-filter "['Germany','Italy'].includes(NAME)" `
-require ms_h3.js `
-run "h3(target,5,'NAME,ISO_A3')" `
-o ger_ita_h3_res5.json "format=geojson"

Wrapping the -each command in a -run command allows for pre- and post-processing options.
Exporting multipart features from the -each command and splitting them with -explode afterwards allows 1:n relations.
It is possible (but not failsafe) to write properties to the individual output features (in this case the h3 index).
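The index bookkeeping behind the last point can be hard to follow, so here is a sketch that simulates it in plain JavaScript without mapshaper. The explode function is a hypothetical stand-in for -explode. It assumes each part becomes its own feature, gets a copy of the parent's properties, and receives a sequential id across the layer. That behaviour is inferred from the module above, not taken from mapshaper's source.

```javascript
// Hypothetical stand-in for -explode: one output feature per part,
// sequential ids, parent properties copied onto every part.
function explode(features) {
  const out = [];
  for (const f of features) {
    for (const part of f.parts) {
      out.push({ id: out.length, part, properties: { ...f.properties } });
    }
  }
  return out;
}

// Same logic as h3_postex: pick this part's value out of each "_"-prefixed array.
function postex(feat) {
  for (const _key of Object.keys(feat.properties)) {
    if (_key.startsWith("_")) {
      // position of this part within its parent = layer-wide id minus parent's offset
      const i = feat.id - (feat.properties.$offset || 0);
      feat.properties[_key.slice(1)] = feat.properties[_key][i];
      delete feat.properties[_key];
    }
  }
  delete feat.properties.$offset;
}

// Two multipart features as h3_each would return them (placeholder cell names).
const multipart = [
  { parts: ["a0", "a1"], properties: { $offset: 0, _h3: ["cellA0", "cellA1"] } },
  { parts: ["b0", "b1", "b2"], properties: { $offset: 2, _h3: ["cellB0", "cellB1", "cellB2"] } }
];

const single = explode(multipart);
single.forEach(postex);
```

The arithmetic only lines up while the ids still match the order in which the parts were generated. If a step between exploding and post-processing drops or reorders features, the wrong array element would be picked, which is presumably why the approach is described as not failsafe.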

Once the this.geojson setter is updated to accept feature collections (and nulls too) ...

This would make it much easier.

mbloch commented 10 months ago

You found a smart solution... seems like we could make this sort of thing much easier...

This morning I published an update (v0.4.66) that improves the this.geojson setter... now you can assign a null or a FeatureCollection, as well as a Feature or bare geometry.
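As a sketch of what the extended setter allows, the hypothetical function below returns either null or a FeatureCollection from a single input Feature. The name pointsOrNothing and the added part property are made up for illustration. What mapshaper does with an assigned null is not spelled out here, so the comment in the code marks it as an assumption.

```javascript
// Hypothetical helper for use in an -each expression.
// Turns a MultiPoint feature into one Point feature per position;
// anything else yields null.
function pointsOrNothing(feature) {
  const geom = feature.geometry;
  if (!geom || geom.type !== "MultiPoint") {
    // Assumption: assigning null to this.geojson discards the feature's data.
    return null;
  }
  return {
    type: "FeatureCollection",
    features: geom.coordinates.map((xy, i) => ({
      type: "Feature",
      // copy the parent's attributes and record the part index
      properties: { ...feature.properties, part: i },
      geometry: { type: "Point", coordinates: xy }
    }))
  };
}

module.exports = { pointsOrNothing };
```

It would be used the same way as a Feature-returning function: -each 'this.geojson = pointsOrNothing(this.geojson)'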

I'll keep thinking about extending the -run command or something similar.

indus commented 10 months ago

👏 that is a great improvement.

It cuts down the complexity significantly:

const h3 = require("h3-js");

module.exports = {
    h3: function (feat, res = 3, keep) {
        const hexagons = h3.polygonToCells(feat.geometry.coordinates, res, true);

        let props;
        if (keep) {
            props = feat.properties;
            if (typeof keep === 'string')
                props = Object.fromEntries(
                    keep.split(",")
                        .filter(key => key in props)
                        .map(key => [key, props[key]]));

        }

        const features = hexagons.map(h3Idx => {
            const coordinates = [h3.cellToBoundary(h3Idx, true)];
            const properties = Object.assign({ h3: h3Idx }, props);

            return {
                type: "Feature",
                geometry: { type: "Polygon", coordinates },
                properties
            }
        });

        return {
            type: "FeatureCollection",
            features
        };
    }
};

mapshaper `
"ne_110m_admin_0_countries.shp" `
-filter "['Germany','Italy'].includes(NAME)" `
-explode `
-require ms_h3.js `
-each "this.geojson=h3(this.geojson,5,'NAME,ISO_A3')" `
-clean `
-o ger_ita_h3_res5.json "format=geojson"

I think this change is fantastic and opens up many new possibilities. If you are still thinking about using an additional or extended function to create geometries without existing input, I have a few more thoughts on this.

The question of whether this functionality should be a variation of the -run command is linked to how this "run" is to be understood. At first I thought it would mean "run the custom function I give to you", but what the run command does at the moment is more like "run what my custom function returns to you". If the latter is the intention of the -run command, it should not allow data as a return value.

There might be another place where new geometry could be created: the import command -i. If -i accepted a function that returns any of the allowed input formats, you could not only create geometry but also load data from external sources using a promise. This is already possible in programmatic use, but for command-line use it would allow something like this:

mapshaper -require ms_import.js -i fetch("https://github.com/nvkelso/.../ne_50m_admin_0_countries.geojson") -o -

or

mapshaper -require ms_create.js -i circleLatLngRadius([10,20],30) -o -

mbloch commented 10 months ago

I'm not quite ready to support function calls as arguments to the -i command, but I've added some new things to the -run command that let you accomplish the same thing. (See the wiki, https://github.com/mbloch/mapshaper/wiki/Command-Reference#-run)

There's a new io object that can be passed to an external function. It has a io.addInputFile(<filename>, <data>) method that lets you import a dataset and reference it in a -i command using a filename. This is a bit similar to the way that data gets imported in mapshaper's Node api function applyCommands()

The target object, which you can also pass to an external function, has a new getter, target.geojson, which returns the target layer as a FeatureCollection. You could then edit the layer's data in your external script and then re-import it.

Finally, I added the ability to put JSON-formatted data (GeoJSON, TopoJSON or an array of JSON records) directly into the command line, like this: -i [{"foo":"bar"}]. If your JSON contains spaces, you would have to quote it: -i '[{"foo": "bar"}]'.

These updates should make the -run command much more versatile.
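A rough sketch of how an external function might use the io object follows. Only io.addInputFile(<filename>, <data>) comes from the description above. The file name ms_create.js, the function addSquare, and the choice to pass a parsed object rather than a JSON string are assumptions. The function returns a command string, relying on -run executing whatever the called function returns.

```javascript
// ms_create.js (hypothetical file name)
// Registers a generated dataset under a file name and returns a command
// string that imports it by that name.
function addSquare(io, name) {
  const data = {
    type: "FeatureCollection",
    features: [{
      type: "Feature",
      properties: { name: "unit square" },
      geometry: {
        type: "Polygon",
        coordinates: [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]
      }
    }]
  };
  // Assumption: a parsed GeoJSON object is accepted as <data>.
  io.addInputFile(name, data);
  return `-i ${name}`;
}

module.exports = { addSquare };
```

Under those assumptions, a call might look like: mapshaper -require ms_create.js -run 'addSquare(io, "square.json")' -o square_out.json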

indus commented 10 months ago

This all sounds great. I'm currently working with -each and returning FeatureCollections, and it works really well for my current use case as I benefit from a per-feature approach. But I will test the new functionality asap.

indus commented 10 months ago

I've tried the new options of the run command and think they give all the flexibility one could think of. The 3-4 steps necessary (require module, run script, add to io, import by name) may feel a bit clumsy/verbose, but it is doable. :+1: Thank you.

mbloch commented 9 months ago

Hi, I've made some updates to the -run command that you might want to know about. There's a new interpolation syntax, where you can put calls to external functions inside curly braces and use the output in an -i command (for example). The examples in the wiki show how it works (https://github.com/mbloch/mapshaper/wiki/Command-Reference#-run).

This isn't any less verbose than my earlier attempts... I guess I'm going for versatility more than conciseness here. It would be nice to find a less kludgy syntax. Maybe the { } operator could convert data to a temp file for -i to use, to avoid having to call io.ifile() (or equivalent).

indus commented 9 months ago

Thanks for the proactive information. I haven't tested it yet, but here are some of my expectations from looking at the docs:

-run '-i {io.ifile("voronoi.json", voronoi(target.geojson, target.bbox))}'

The string after the run command looks to me equivalent to the string that would get returned from a function added with "require" (so I would expect to be able to use the curly braces there as well).
As the string for run was always built from regular commands, I would expect that a direct mapshaper -i {io.ifile(...} is now a thing as well (only the fact that you haven't mentioned it makes me doubt).
The docs now only have examples with the curly brace pattern. I would expect that the old -run target.geojson=myTransformFn(target) and -run myGeneratorFn(io) -i mynew.json are still possible?

I'm going to test my assumptions, and see which hold true...

indus commented 9 months ago

My testing failed right at the start, see #613 😢

mbloch commented 9 months ago

613 should be resolved now. To your questions...

The { ... } syntax is for running snippets of code and interpolating the output into a string, similar to other template systems (like JavaScript's backtick-delimited template literals with ${...} interpolation, { ... } interpolation in Svelte and other web frameworks, mustache templates, etc.). It works on the command line but not currently inside command strings returned by functions, although it could work that way as well.
The { ... } syntax currently only works in -run, but I am thinking about ways to make input to other commands more dynamic, that was part of my motivation in adding it. So in the future it could be added to -i, for example.
The other syntax should continue to work, I tried to keep backwards compatibility. The examples in the docs are trying to present a consistent syntax that would work for all use cases, including both using output from external scripts and interpolation of simple data values.
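To make the general idea of { ... } interpolation concrete, here is a toy sketch in plain JavaScript. It is not mapshaper's implementation. The function interpolate, its scope argument, and the helper makeFile are all hypothetical, and it only handles braces that are not nested. Each { ... } span is evaluated as an expression and replaced with the result converted to a string.

```javascript
// Toy illustration only: evaluate each {expression} against a scope object
// and splice the stringified result back into the command string.
function interpolate(template, scope) {
  const names = Object.keys(scope);
  const values = Object.values(scope);
  return template.replace(/\{([^{}]*)\}/g, (_, expr) => {
    // Build a function whose parameters are the scope's names,
    // so the expression can refer to them directly.
    const fn = new Function(...names, `return (${expr});`);
    return String(fn(...values));
  });
}

// makeFile is a made-up helper standing in for a call that registers
// a dataset and returns the file name to import.
const cmd = interpolate("-i {makeFile('grid.json')} -o {name}_out.json", {
  makeFile: fname => fname,
  name: "grid"
});
```

The point of the pattern is that the command string stays ordinary text, and only the parts inside braces are computed before the commands are run.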

indus commented 9 months ago

This all sounds very reasonable. I had time to make my first steps with the new syntax and it works quite well for me. 👍

mbloch / mapshaper

`run` command that imports the return value #608
