Open cassiepearson-deca opened 1 year ago
Another test case is included in the tests
folder in the above linked repository. This test case involves two polygons that overlap each other. After running the difference
operation, the test fails, every time, throwing the exact same stack trace as above.
It seems to work quiet well for the combination of Triangle
+ f64
! No panics so far and all the panics for the cases below (which were f32
) are passing the tests. This is good since you can basically tessellate any (Multi)-Polygon, run the boolops on the resulting triangles and then reassemble the triangles into one (Multi)-Polygon result.
I know this doesn't prove anything, but I didn't run into panics for all operations on Triangle<f64>(..).to_polygon()
which were randomly generated. The random tests covered about 100.000.000 cases and I think that's good enough for me to trust it.
I ran my panic discovering test suite over night and it didn't panic once in 500 runs which amounts to
500 runs x 20 parallel tests x 100.000 attempts per test = 100.000.000 attempts total
Every test, a random boolop is chosen. Then it generates two random triangles and makes sure that they are intersecting before the actual boolop is run. This check is done since otherwise we wouldn't have anything interesting to do with the boolop.
Here is the code I used for testing:
#![allow(dead_code)]
#![allow(unused_results)]
#![allow(unused_must_use)]
#![allow(unused_variables)]
use geo::{BooleanOps, Intersects};
use geo_svg::ToSvg;
use rand::{thread_rng, Rng};
type T = f64;
const NUM_RUNS: usize = 100_000;
const RANGE: f32 = 10_000.0;
fn random_coord(rng: &mut impl Rng) -> geo::Coord<T> {
let range = rng.gen_range(1.0..RANGE);
let x = rng.gen_range(-range..range) as f64;
let y = rng.gen_range(-range..range) as f64;
geo::Coord { x, y }
}
fn random_coords(rng: &mut impl Rng) -> [Option<geo::Coord<T>>; 3] {
(0..3)
.map(|_| random_coord(rng))
.enumerate()
// check that points are unique and we don't have degenerate triangles (lines still
// possible)
.fold([None; 3], |mut vec, (i, p)| {
if !vec.contains(&Some(p)) {
vec[i].replace(p);
}
vec
})
}
fn random_poly(rng: &mut impl Rng) -> geo::Polygon<T> {
let mut cs = random_coords(rng);
while cs.contains(&None) {
cs = random_coords(rng);
}
let p = geo::Polygon::new(
geo::LineString::new(cs.map(|x| x.unwrap()).to_vec()),
vec![],
);
p
}
fn save_polys(ps: [&geo::Polygon<T>; 2], path: &str) {
if std::path::Path::new(path).exists() {}
let sps = serde_json::to_string(&ps).unwrap();
std::fs::write(path, sps);
}
fn test_prog(f: impl Fn(&geo::Polygon<T>, &geo::Polygon<T>) -> geo::MultiPolygon<T>, name: &str) {
let mut rng = thread_rng();
let mut make = || random_poly(&mut rng);
for i in 0..NUM_RUNS {
//let name = format!("{name}-{i}.html");
let mut p1 = make();
let mut p2 = make();
// make sure polys are intersecting
while !p1.intersects(&p2) {
p1 = make();
p2 = make();
}
//save_polys([&p1, &p2], &name);
f(&p1, &p2);
//std::fs::remove_file(name);
}
}
const FNS: [fn(&geo::Polygon<T>, &geo::Polygon<T>) -> geo::MultiPolygon<T>; 3] = [
BooleanOps::difference,
BooleanOps::union,
BooleanOps::intersection,
];
macro_rules! make_test_thread {
($name:ident) => {
paste! {
#[test]
fn [<test_thread_$name>]() {
test_prog(FNS.choose(&mut thread_rng()).unwrap(), "thread");
}
}
};
( $($name:ident),* ) => {
$(
make_test_thread!($name);
)*
};
}
use paste::paste;
use rand::seq::SliceRandom;
make_test_thread!(
t0, t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11, t12, t13, t14, t15, t16, t17, t18, t19, t20,
t21, t22, t23, t24, t25, t26, t27, t28, t29, t30, t31, t32, t33, t34, t35, t36, t37, t38, t39
);
Thanks for the analysis! I authored the sweep, and indeed there are unresolved floating point issues in it. The crux is that: when two lines (p1, q1) and (p2, q2) intersect at an interior point p, due to the limited precision, (p1, p, q1), and (p2, p, q2) are not necessarily collinear. In particular, the new segment (p1, p, q1) might spuriously intersect another segment, due to this rounding error. Unfortunately, I think the current implementation is not well suited to handle this adjustment well.
I think using algorithms related to monotonic chains is the way to address this properly; we've some partial progress in another PR on breaking polygon/multi-polygon into monotonic polygons, but yet to work on the bool. ops on it.
@rmanoka sounds great!
Btw. what do you think about making the algorithm falliable? It's a real bummer for my app that it'll always crash if the algo didn't succeed? This robs me the opportunity to
a) just ignore the result if its not that important b) or slightly nudge some points in the poly an tri again
We could still provide a crashing version and implement the approach above via new methods in the form of try_...
. This would also help to not introduce breaking changes and it is in line with the general falliable-naming-scheme of the rust ecosystem.
*edit:
I remember I already tried to do this a year ago or so, but I got stuck at some point where I couldn't change the return type directly to a Result
. It was at a point where the Iterator
trait requires the return type to be of shape Option<Item>
That being said, I know how to resolve this challenge nowadays and there actually come two ideas to mind. Let me know which one would be better in your opinion:
TryIterator
trait with a try_next()
method which returns Result<Option<Item>, BoolopError>
Item
in the Iterator
trait to Result<Item, BoolopError>
so that next()
returns Option<Result<Item, BoolopError>>
and then use the transpose()
method to get to the same result as in the last bulletIn either way, we can just use this throughout the whole algo code and unwrap()
at the end to get the same result in the infalliable boolop code.
@RobWalt @rmanoka In that case, do we have a plan to use f64 instead of f32 to handle the precision problem?
@ggunde-deca
In that case, do we have a plan to use f64 instead of f32 to handle the precision problem?
Not exactly. Although the operations work for triangles with f64, it unfortunately doesn't really generalize to arbitrary geometric shapes (polygons, multipolygons). The thinks rmanoka wrote still apply to a general case with f64 scalars, although panics might not happen as often as in the f32 case. Still, in my opinion, panics happen too often even for f64.
If you're looking for a solution for you app, feel free to track my branch (#1032) which implements falliable boolops via methods like try_intersection(...)
. This way you would be able to catch panics, then slightly offset vertices which cause trouble if it fails and then try again. This is at least a feasible solution for my use case.
Might not be a good general-purpose solution, but I've had success with fractionally truncating the polygons before the difference. My use case is taking a difference when the left
(subject) polygon and the right
(clip) polygon share a significant and identical (or nearly-identical) border, and full 64-bit floating point precision is unnecessary.
The thinking here is that if there's trouble with the calculation due to imperfect precision, then we might as well ensure that all the points are exactly representable and sufficiently distinct from all other points. This is done by scaling up by some resolution factor and rounding all fractional values to integers. I believe any line segment created by points with integer coordinates won't easily fall near (but not on) an integer coordinate if you assume the line isn't too long.
You probably also need to ensure the geometry doesn't contain features which are smaller than the fractional resolution, otherwise the geometry may become invalid.
/// Safely(?) difference two MultiPolygons by scaling them up by some resolution,
/// and rounding all fractional values to integers, which are precisely
/// representable (within the range −2^53 to 2^53 for f64). The difference
/// operation is then well-behaved and doesn't encounter arthmetic errors, and
/// we can scale the result back down to the original size.
fn maybe_safe_difference(left: &MultiPolygon, right: &MultiPolygon) -> MultiPolygon {
let resolution = 4096.0;
let scale_up = |point: Coord| {
coord! {
x: (point.x * resolution).round(),
y: (point.y * resolution).round(),
}
};
let scale_down = |point: Coord| {
coord! {
x: point.x / resolution,
y: point.y / resolution,
}
};
let left = left.map_coords(scale_up);
let right = right.map_coords(scale_up);
let mut diff = left.difference(&right);
diff.map_coords_in_place(scale_down);
diff
}
I dug into the test cases in @cassiepearson-deca's repo a bit. I think the first case, with polygons in boolean_difference_subject_multipolygon.json
is rather simple: a valid MultiPolygon shouldn't have overlapping polygons inside it, which this subject feature has. (btw those polygons looks pretty neat!) The second test case in boolean_difference_polygons.json
with just two polygons doesn't produce a panic in my testing (not on 0.23.1 or 0.26.0).
Update: This method does not always work. I've encountered panics when the right (clip) polygon has a line segment which exactly crosses (or should exactly cross) a point on the interior of the left (subject) multipolygon.
I've found these failure cases also hang indefinitely without panicking when compiling with optimizations. So using catch_unwind
can't save you in every case. Here's a minimum repro for geo 0.26.0
. The geometry is the same as the image in the last comment, but with additional x and y shifts.
//! Usually fails on the second test invocation, or on the second loop iteration.
//! Panics if running without optimizations (`cargo run`) and hangs
//! indefinitely if running with optimizations (`cargo run --release`).
use geo::BooleanOps;
use geo::MapCoordsInPlace;
use geo::{Coord, LineString, MultiPolygon, Polygon};
fn main() {
let geo1 = Polygon::new(
LineString(vec![
Coord { x: -1.0, y: 46.0 },
Coord { x: 8.0, y: 46.0 },
Coord { x: 8.0, y: 39.0 },
Coord { x: -1.0, y: 39.0 },
Coord { x: -1.0, y: 46.0 },
]),
vec![LineString(vec![
Coord { x: 2.0, y: 45.0 },
Coord { x: 7.0, y: 45.0 },
Coord { x: 7.0, y: 44.0 },
Coord { x: 5.0, y: 42.0 },
Coord { x: 5.0, y: 41.0 },
Coord { x: 0.0, y: 43.0 },
Coord { x: 2.0, y: 45.0 },
])],
);
let geo2 = Polygon::new(
LineString(vec![
Coord { x: 0.0, y: 42.0 },
Coord { x: 6.0, y: 44.0 },
Coord { x: 4.0, y: 40.0 },
Coord { x: 0.0, y: 42.0 },
]),
vec![],
);
let mut left = MultiPolygon::new(vec![geo1]);
let mut right = MultiPolygon::new(vec![geo2]);
let shift = |c: Coord| Coord {
x: c.x + 931230.,
y: c.y + 412600.,
};
left.map_coords_in_place(shift);
right.map_coords_in_place(shift);
for i in 0..10 {
println!("{} ", i);
left.difference(&right);
}
}
Turns out Clipper (Rust API in geo-clipper) implements all boolean ops with integers, like in my attempted solution, but has many more considerations for robustness. Worth considering if you're stuck on this.
FYI y'all there is a new "safe" (AFAIK) solution proposal in #1073 It's not perfect but it's something workable.
Edit: Nevermind, hit some edge cases again although the ones listed here worked well :(
There's yet another proposal implementation which looks promising here https://github.com/georust/geo/pull/1089
It's a complete new approach and uses Result
s instead of panicing. It also didn't really hit any of the Err
cases yet but there's always this safety net.
The BooleanOps::difference function panics about half the time when running the included test case due to a failure in the sweep algorithm. Similar datasets succeed without issue. The provided test case was reduced from a larger test case. I attempted to find individual polygons that failed but was unable to find a smaller input Multipolygon.
Test case: https://github.com/Deca-Technologies/geo_boolean_difference_panic
I expected to see the structures successfully difference'd returning the entire subject multipolygon. Instead, geo panics during the difference operation about 50% of the time. The code includes a for loop to run until the failure is hit.
Issue https://github.com/georust/geo/issues/913 may be related but I am not certain as it uses a different version of geo.
Backtrace: