Closed theroggy closed 2 years ago
Thanks for raising the issue. After a quick look, I'm trying to break it down to isolate the problem:
import geopandas as gpd
from shapely import geometry
import topojson as tp
def visz_arcs(topo):
    gdf = gpd.GeoDataFrame(geometry=[geometry.LineString(arc) for arc in topo.to_dict()['arcs']])
    gdf.plot(cmap='Dark2')
pth = '/Users/mattijnvanhoek/Downloads/testcase_topojson/testcase_topojson.gpkg'
gdf = gpd.read_file(pth)
# when `shared_coords` set to `True`, a path is considered shared when
# all coordinates appear in both paths (`coords-connected`).
topo = tp.Topology(gdf, prequantize=False, shared_coords=True)
visz_arcs(topo)
# when `shared_coords` set to `False` a path is considered shared when
# coordinates are on the same path (`path-connected`).
# the path-connected strategy is more 'correct', but slower. Default is `True`.
topo = tp.Topology(gdf, prequantize=False, shared_coords=False)
visz_arcs(topo)
With shared_coords=True it seems that some linestrings overlap. These linestrings are not split and recognised as duplicate arcs.
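To make the difference between the two strategies concrete, here is a minimal pure-Python sketch. It is not topojson's implementation; the helper names (`shared_by_coords`, `shared_by_path`) are hypothetical and only illustrate why two lines that overlap geometrically can stay undetected when only identical coordinates are compared.

```python
def segments(path):
    """Return the consecutive coordinate pairs (segments) of a path."""
    return list(zip(path, path[1:]))


def shared_by_coords(a, b):
    """Segments of `a` whose two end coordinates also form a segment of `b`.

    Illustrative stand-in for a coordinate-based comparison (assumption).
    """
    b_segs = {frozenset(s) for s in segments(b)}
    return [s for s in segments(a) if frozenset(s) in b_segs]


def cross(o, p, q):
    """2D cross product of the vectors o->p and o->q (0 means collinear)."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def segments_overlap(s, t):
    """True when two segments are collinear and share more than a single point."""
    (p, q), (u, v) = s, t
    if cross(p, q, u) != 0 or cross(p, q, v) != 0:
        return False
    axis = 0 if p[0] != q[0] else 1  # project on an axis where `s` has extent
    lo1, hi1 = sorted((p[axis], q[axis]))
    lo2, hi2 = sorted((u[axis], v[axis]))
    return min(hi1, hi2) > max(lo1, lo2)


def shared_by_path(a, b):
    """Segments of `a` that geometrically overlap some segment of `b`.

    Illustrative stand-in for a path-based comparison (assumption).
    """
    return [s for s in segments(a)
            if any(segments_overlap(s, t) for t in segments(b))]


# Line `b` lies partly on top of line `a`, but they have no coordinate in common.
a = [(0, 0), (4, 0)]
b = [(1, 0), (3, 0), (3, 2)]

print(shared_by_coords(a, b))  # nothing found by comparing coordinates
print(shared_by_path(a, b))    # the overlap is found geometrically
```

In this toy case the coordinate-based check finds nothing while the geometric check reports the overlapping segment, which matches the kind of undetected overlap described above.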
The behaviour above is on the master version. I'm not sure whether the same happens in the latest release.
It would be great if we can isolate it even further. Maybe it is possible to bring it back to 2 or 3 linestrings?
Currently the following gives too much output:
from topojson.core.cut import Cut
cut = Cut(gdf, options={'prequantize':False, 'shared_coords':True})
cut.to_svg(separate=True, include_junctions=True)
from topojson.core.dedup import Dedup
dedup = Dedup(gdf, options={'prequantize':False, 'shared_coords':True})
dedup.to_svg(separate=True, include_junctions=True)
With shared_coords=False it seems OK, but what about here?
Hm, thinking about it, I think that is all right as well. A single point on a line is not a shared path and is therefore not seen as a junction. Ah, the green and pink arcs are probably separated because the start/end coordinate of the polygon lies there. With a perfect topology these two arcs would be combined again.
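Combining arcs that were only split at a polygon's start/end coordinate could look roughly like the sketch below. This is an assumption-laden illustration, not existing topojson functionality: the function name `merge_split_arcs` is hypothetical, it only joins arcs that meet end-to-start, and it treats a coordinate as a false split point when exactly two arc ends touch it.

```python
from collections import Counter


def merge_split_arcs(arcs):
    """Join arcs that meet end-to-start at a coordinate used by no other arc end.

    Simplified sketch: reversed arcs are not considered.
    """
    arcs = [list(arc) for arc in arcs]
    merged = True
    while merged:
        merged = False
        # Count how many arc ends touch each coordinate.
        ends = Counter()
        for arc in arcs:
            ends[arc[0]] += 1
            ends[arc[-1]] += 1
        for i, first in enumerate(arcs):
            for j, second in enumerate(arcs):
                if i != j and first[-1] == second[0] and ends[first[-1]] == 2:
                    # Drop the duplicated meeting coordinate when joining.
                    arcs[i] = first + second[1:]
                    del arcs[j]
                    merged = True
                    break
            if merged:
                break
    return arcs


green = [(0, 0), (1, 0), (2, 0)]
pink = [(2, 0), (3, 0), (4, 0)]   # continues where `green` stops
branch_up = [(4, 0), (4, 1)]      # (4, 0) is a real junction: three arc ends meet
branch_right = [(4, 0), (5, 0)]

print(merge_split_arcs([green, pink, branch_up, branch_right]))
```

Here the green and pink arcs are joined into one arc, while the two branches stay separate because three arc ends meet at (4, 0).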
You probably know this resource already, but I just found this really interesting explanation, which clarified, for me at least, what to expect: https://bost.ocks.org/mike/topology/
Yes. I use the same wording/phases to create a certain synergy, but the implementations are different.
Some answers:
I tried shared_coords=False as well, and it seemed better, but I still noticed issues. The performance is 3 times worse though, and the functional advantage offered by shared_coords=False is not relevant for my use case, so I focused on shared_coords=True for the moment.
I think the problem at least starts in the 'join' phase: there is an issue in how the junctions are determined, as there are junctions in the middle of lines. Possibly there are also missing junctions, which could explain the overlapping parts.
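For reference, the junction rule described in the article linked earlier in this thread can be sketched as follows: a coordinate is a junction when paths visiting it disagree about its neighbours, and the end points of open lines are always junctions. This is only a sketch of that idea in plain Python, with a hypothetical function name; as noted above, the implementation in this library is different.

```python
def find_junctions(paths):
    """Return coordinates whose neighbours differ between visits.

    End points of open lines are always reported as junctions.
    """
    seen = {}          # coordinate -> neighbour set from the first visit
    junctions = set()
    for path in paths:
        closed = len(path) > 2 and path[0] == path[-1]
        pts = path[:-1] if closed else path  # drop the repeated closing point
        n = len(pts)
        for i, pt in enumerate(pts):
            if closed:
                neighbours = frozenset((pts[i - 1], pts[(i + 1) % n]))
            elif i == 0 or i == n - 1:
                junctions.add(pt)
                continue
            else:
                neighbours = frozenset((pts[i - 1], pts[i + 1]))
            if pt in seen and seen[pt] != neighbours:
                junctions.add(pt)
            seen.setdefault(pt, neighbours)
    return junctions


# Two unit squares sharing the edge (1, 0)-(1, 1).
left = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
right = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)]

print(find_junctions([left, right]))
```

For the two squares, only the two end points of the shared edge come out as junctions. A junction in the middle of a straight line would mean that some other path visits that coordinate with different neighbours.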
I have been looking deeper into it based on your feedback, and it seems I misunderstood the impact of shared_coords=True. Both the overlapping pieces and the odd way the "red" line is split can be explained by this. I'm not sure if this behaviour is really wanted and/or "by design", but my data definitely needs shared_coords=True to get decent results.
As I stated before, I also saw issues when I tried shared_coords=True, but they might indeed be explained by the problem you raise here:
Ah, the green and pink arcs are separated because probably there is the start/end coordinate of the polygon. With a perfect topology these two arcs are combined again.
I'll have a look if I can add support to combine those arcs again...
I'm not sure, I might be doing or expecting something wrong, but it seems to me that the topologies (or the common lines) are not created correctly for the data I'm using.
This is a screenshot of the result I get when I visualize the input polygons + the lines created by topojson (in Topology.output["arcs"]).
Here is the file with the polygons as shown in the screenshot above + one with the topojson lines: testcase_topojson.zip
This is the script I ran: