JetBrains / lets-plot-kotlin

Grammar of Graphics for Kotlin
https://lets-plot.org/kotlin/
MIT License

geom_line data series is getting sorted on SVG export #39

Closed mlthlschr closed 3 years ago

mlthlschr commented 3 years ago

Currently, I am working with geospatial data. I have a problem if I want to plot the contour lines. The dataset might not be sorted by the x-axis, as the lines might contain curves going back or might even be circles.

When I create a single Feature from a map containing all coordinates of the line via geom_line(map), the plot exported via PlotSvgExport.buildSvgImageFromRawSpecs(plotSpecRaw) looks quite strange.

[Screenshot: 2020-09-14_15-59-23]

If I inspect the resulting .svg, I can see that the plotted points are sorted by their x-coordinate, even though they are not sorted in the lists stored in the map:

<path d="M616.0933693197439 278.4722567444842 L616.0933693197439 278.4722567444842 L616.1966374794138 276.7657731094514 L616.4902197569609 275.5389922473987 L616.6647073370405 279.4956226601789 L616.803585206857 274.6645176841994 L617.3199260049732 279.96274306534906 L617.5771072454518 273.76487838532194 L618.269518277375 273.5404928259668 L618.536591103999 280.47966867647483 L619.543554576172 273.66159811618854 L621.0407450590283 274.08468023399473 L621.7343430813053 282.2642678450211 L622.5383312052581 274.5764411095006 L623.7324434878537 275.1421233958681 L623.7553919677157 283.55396054615267 L624.679661779548 275.19612272438826 L624.9914445756585 284.2475635720184 L625.9081967819366 274.77356487192446 L626.1705216472037 284.54429774635355 L627.972768647538 285.60016811219975 L628.0847414030577 273.54363841793383 L628.8990167765878 286.4028183258197 L630.2122237566509 287.67625880186097 L630.7515130346874 288.0767975106428 L630.8927648852114 272.313711963885 L631.7430456324364 288.7090614934277 L632.9130224447581 289.1698907146638 L634.262630461948 270.51862748884014 L634.6357410922647 290.05852044170024 L635.3859189875075 269.9529452024435 L635.947365418484 269.46066006162437 L636.163397660479 291.3906786341977 L636.6259282298852 268.944258715841 L637.3472226934391 268.68317458365345 L638.4986033237074 268.6538157254399 L639.1150469739223 268.7177760951745 L640.4713812696282 269.01293747351156 L641.1756622050307 269.2079641746823 L642.5007390884566 269.679278702446 L643.8495557787246 270.51286057021935 L645.2537653515465 272.0500065051019 L646.2678507657838 273.40313531065476 L646.9278173950734 283.87952931338805 L646.9851885949029 282.57463291779277 L647.0263375933282 274.9402812454791 L647.1501802522107 284.27849522623 L647.4311012994731 281.26763946091523 L647.6835345785366 280.500639289472 L647.7432797590154 276.62893985945266 L648.0633714874857 279.31317832687637 L648.1464608113747 277.9679135011975 " fill="none" stroke-width="1.0" stroke="rgb(0,0,128)" 
stroke-opacity="1.0">
</path>

If I plot every single line on its own, the file gets so huge that I cannot see anything anymore.

Is there any way to force the SVG export not to sort the given Features by x-axis? What can I do instead?

mlthlschr commented 3 years ago

Okay, I think I found the error. I should have used geom_path instead of geom_line. Is there any possibility to improve the docs a little? :-)
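The difference follows ggplot2 semantics: a line layer connects points in order of their x value, while a path layer connects them in the order they appear in the data. The following is a minimal, library-free sketch of that ordering difference; `Pt`, `lineOrder`, `pathOrder` and `square` are hypothetical names used only for illustration and are not part of the Lets-Plot API.

```kotlin
// A point of a contour line in plot coordinates.
data class Pt(val x: Double, val y: Double)

// Line-style ordering: points are reordered by x before being connected.
// sortedBy is stable, so points with equal x keep their relative order.
fun lineOrder(points: List<Pt>): List<Pt> = points.sortedBy { it.x }

// Path-style ordering: points are connected exactly in data order.
fun pathOrder(points: List<Pt>): List<Pt> = points

// A closed square contour: right, up, back left, and down to the start.
val square = listOf(
    Pt(0.0, 0.0), Pt(1.0, 0.0), Pt(1.0, 1.0), Pt(0.0, 1.0), Pt(0.0, 0.0)
)
```

With `square` as input, `pathOrder` returns the closed outline unchanged, whereas `lineOrder` first visits all points with x = 0 and then all points with x = 1. That is the same kind of zigzag between upper and lower y values visible in the exported SVG path above.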

alshan commented 3 years ago

Yes, you absolutely should be using geom_path in this case. Our docs most certainly need improvements. Meanwhile, when in doubt, I would recommend checking ggplot2 documentation. Lets-plot-kotlin is conceptually a ggplot2 derivative.

I'm curious which libraries you are using for geospatial work. As far as I know, there are no Kotlin counterparts for Python's GeoPandas or Shapely.

alshan commented 3 years ago

> PlotSvgExport.buildSvgImageFromRawSpecs(plotSpecRaw)

If you are saving to a file, there is a suitable ggsave function.

mlthlschr commented 3 years ago

> Yes, you absolutely should be using geom_path in this case. Our docs most certainly need improvements. Meanwhile, when in doubt, I would recommend checking ggplot2 documentation. Lets-plot-kotlin is conceptually a ggplot2 derivative.

Looks like that's a good plan.

> I'm curious which libraries you are using for geospatial work. As far as I know, there are no Kotlin counterparts for Python's GeoPandas or Shapely.

Unfortunately, there are no Kotlin alternatives. I am using geotools; to display the data, I map the GIS features to the lets-plot format.

alshan commented 3 years ago

Interesting. As a matter of fact, Lets-Plot internally has quite robust support for geospatial data visualization, but these capabilities are not yet exposed in the Kotlin API. At the moment, only the Lets-Plot Python API supports geospatial data types (GeoDataFrame) and visualization (see geopandas support and the example notebooks).

We are planning to add similar features to the Kotlin API: map, map_join parameters in layers and some data structure which would play the role of geodataframe. This data structure could be as simple as just a container of a standard hashmap with a mandatory entry: "geometry" -> string , where the string is a WKT representation of a geometry like: "MULTIPOLYGON (((180.00000 -16.06713, 180.00000..."
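A minimal sketch of such a container might look like the following. This is not the Lets-Plot API: the class name `GeoTable` and its validation rules are assumptions made for illustration, based only on the description above (a standard map with a mandatory "geometry" entry holding WKT strings).

```kotlin
// Hypothetical container: column name -> column values,
// with a mandatory "geometry" column that holds WKT strings.
class GeoTable(val columns: Map<String, List<Any?>>) {
    init {
        val geometry = requireNotNull(columns["geometry"]) {
            "A \"geometry\" column is required"
        }
        require(geometry.all { it is String }) {
            "Every geometry must be a WKT string"
        }
        require(columns.values.all { it.size == geometry.size }) {
            "All columns must have the same number of rows"
        }
    }

    // The WKT strings, one per row.
    val geometries: List<String>
        get() = columns.getValue("geometry").map { it as String }
}
```

For example, `GeoTable(mapOf("continent" to listOf("Oceania"), "geometry" to listOf("POINT (1 2)")))` is accepted, while a map without a "geometry" key is rejected with an IllegalArgumentException. The sketch only checks that each geometry is a string; it does not parse or validate the WKT itself.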

Then geotools would work with lets-plot rather smoothly, I think, provided there is an adapter/converter between FeatureCollection (?) and the data structure above.

mlthlschr commented 3 years ago

> Interesting. As a matter of fact, Lets-Plot internally has quite robust support for geospatial data visualization, but these capabilities are not yet exposed in the Kotlin API. At the moment, only the Lets-Plot Python API supports geospatial data types (GeoDataFrame) and visualization (see geopandas support and the example notebooks).

Just checked geom_polygon and found the todo.

> We are planning to add similar features to the Kotlin API: map, map_join parameters in layers and some data structure which would play the role of geodataframe. This data structure could be as simple as just a container of a standard hashmap with a mandatory entry: "geometry" -> string , where the string is a WKT representation of a geometry like: "MULTIPOLYGON (((180.00000 -16.06713, 180.00000..."

Yeah, I guess that would be awesome. In that case, "only" a WKT parser would be needed, and not only geotools could be supported but also other tools that at least have WKT export. Geotools certainly supports WKT, at least at the Feature level.

> Then geotools would work with lets-plot rather smoothly, I think, provided there is an adapter/converter between FeatureCollection (?) and the data structure above.

When working with single Features based on the JTS Geometry type, not even an adapter would be needed, as they support WKT export. If I understood the geotools docs correctly, the JTS geometries are currently the only ones used.

The FeatureCollection indeed needs an adapter, but I guess that could be handled as well (maybe with some kind of extension function generating GEOMETRYCOLLECTION, MULTIPOLYGON or so).
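As a rough illustration of the extension-function idea, the sketch below wraps already exported WKT strings into a single GEOMETRYCOLLECTION. The function name `toGeometryCollectionWkt` is hypothetical, and the inputs are assumed to be valid WKT (for instance, strings exported from JTS geometries); nothing here calls geotools or JTS.

```kotlin
// Hypothetical helper: combines the WKT strings of individual geometries
// into one GEOMETRYCOLLECTION. Inputs are assumed to be valid WKT already;
// no parsing or validation is performed.
fun List<String>.toGeometryCollectionWkt(): String =
    if (isEmpty()) {
        "GEOMETRYCOLLECTION EMPTY"
    } else {
        joinToString(separator = ", ", prefix = "GEOMETRYCOLLECTION (", postfix = ")") { it.trim() }
    }
```

A real adapter would more likely build the collection from the geometry objects themselves; string concatenation is used here only because it keeps the sketch free of library dependencies.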

In any case, if I can help implementing this somehow, I would be glad to do so.

alshan commented 3 years ago

Thank you! I think a good first milestone would be reproducing the Natural Earth Python Jupyter notebook with the Kotlin kernel.

We need to figure out how to rewrite input cells 1, 2, 3. Could you try to work it out?

mlthlschr commented 3 years ago

Of course. However, I doubt that it is possible for me to reimplement geopandas, which is basically the only thing done in cells 1-3. Or do you mean by using geotools and any arbitrary geo-dataset I can find?

alshan commented 3 years ago

Yes, I meant using geotools of course. Would be nice to have a piece of code in Kotlin/Java producing the result similar to the table shown in Out[3]. In particular: columns "continent" and "geometry".

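To achieve this goal the code should:

* read continental boundaries from a shapefile or other geospatial format

* transform coordinates to decimal degrees (WGS84) if necessary

* convert geometries from internal format to WKT strings

This will also tell us which minimal geotools dependencies are required for this kind of task. I have zero experience with geotools, so this would help a lot.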

mlthlschr commented 3 years ago

Okay, that sounds like a plan! I am going to have a look in the next few days.

mlthlschr commented 3 years ago

> Yes, I meant using geotools of course. Would be nice to have a piece of code in Kotlin/Java producing the result similar to the table shown in Out[3]. In particular: columns "continent" and "geometry".
>
> To achieve this goal the code should:
>
> * read continental boundaries from a shapefile or other geospatial format
>
> * transform coordinates to decimal degrees (WGS84) if necessary
>
> * convert geometries from internal format to WKT strings

See example ex1 here. Quite ugly code, but should be sufficient to prove the point.

alshan commented 3 years ago

Thanx, this is exactly what was needed! Is there a way in geotools to explicitly convert the "the_geom" feature attribute to WKT form?

mlthlschr commented 3 years ago

Yes. The printed string is the WKT representation returned by the Geometry's toText() method, shortened to 50 characters. I updated the example to show it.

alshan commented 3 years ago

Hi, I've added 'SpatialDataset' type which consists of one 'geometry' column and optional 'data' columns. I've also added a converter method (as a Kotlin extension method for SimpleFeatureCollection) for transforming geotools feature collection to SpatialDataset.

SpatialDataset can be passed to a plot layer via the 'data' or 'map' parameter, as in the following use cases:

I was able to almost entirely reproduce the GeoPandas notebook in Kotlin notebook: https://nbviewer.jupyter.org/github/JetBrains/lets-plot-kotlin/blob/master/docs/examples/jupyter-notebooks-dev/geotools_naturalearth.ipynb

The only part I skipped was the 'South America' part where we need to filter 'worldFeatures' - drop everything which is not continent == "South America". Do you know how such filter is done in GeoTools?

Of course, this issue https://github.com/Kotlin/kotlin-jupyter/issues/107 is still a showstopper.

There are also simple JVM demos in the project: https://github.com/JetBrains/lets-plot-kotlin/tree/master/demo/geotools-batik/src/main/kotlin/naturalEarth

mlthlschr commented 3 years ago

That is looking fantastic! Got to try it soon.

Regarding the filtering: I have not done it specifically with geotools, but it should be easily possible with SimpleFeatureCollection.subCollection(Filter), as described here and here.
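Independently of geotools, the intent of the filter (keep only the rows whose "continent" equals "South America") can be sketched on a plain column-oriented map. This is an illustration under assumptions, not the geotools Filter API: `filterRows` is a hypothetical helper, and the table layout (column name mapped to a list of values) is assumed for the example.

```kotlin
// Keeps only the rows for which the value in `column` equals `value`.
// The table is column-oriented: column name -> list of values, and all
// columns are assumed to have the same length.
fun filterRows(
    table: Map<String, List<Any?>>,
    column: String,
    value: Any?
): Map<String, List<Any?>> {
    val keys = table.getValue(column)
    // Indices of the rows to keep.
    val keep = keys.indices.filter { keys[it] == value }
    return table.mapValues { (_, values) -> keep.map { values[it] } }
}
```

Calling `filterRows(world, "continent", "South America")` on such a table returns a table with the same columns and only the matching rows, which is the effect the notebook's 'South America' step needs.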

alshan commented 3 years ago

Thanx, the filtering worked like a charm - I've updated the naturalearth example.

alshan commented 3 years ago

@mlthlschr Hi, there is a public Kotlin Kernel release available (kotlin-jupyter-kernel 0.8.3.1) which fixes the issue https://github.com/Kotlin/kotlin-jupyter/issues/107.

Last week we released Lets-Plot Kotlin API v1.1.0, which includes support for GeoTools' SimpleFeatureCollection, ReferencedEnvelope and Geometry objects. Please check out this updated demo: https://nbviewer.jupyter.org/github/JetBrains/lets-plot-kotlin/blob/master/docs/examples/jupyter-notebooks/geotools_naturalearth.ipynb

I think this exhausts all the issues discussed in this thread, so I'm closing it now.

alshan commented 3 years ago

@mlthlschr Make sure you include the %useLatestDescriptors jupyter line magic because kotlin-jupyter-kernel 0.8.3.1 still doesn't have the latest Lets-Plot package bundled with it.