georust / robust

Robust predicates for computational geometry
https://docs.rs/robust
Apache License 2.0

Incorporate num_traits crate to make Coord, and all algorithms, generic over an arbitrary numeric type #3

Closed frewsxcv closed 4 years ago

bluenote10 commented 4 years ago

I'm curious how that would work. I haven't looked into the robust predicates papers, but it looks like the method is tailored to double precision floats. If the interface supported arbitrary numeric types and internally converted to f64, would the result still make sense? If an f32 point lies perfectly on an f32 line segment, the upcasted f64 point doesn't necessarily, right?

Stoeoef commented 4 years ago

If an f32 point lies perfectly on an f32 line segment, the upcasted f64 point doesn't necessarily, right?

Actually the results should be the same. predicate.rs calculates the mathematically correct answer to an orientation query (left / right / on line). Since an f32 and that same f32 converted to f64 encode exactly the same value¹, the correct answer to the orientation query does not change.

To illustrate a bit further: there is nothing special about "f32 lines", they are just lines that can be exactly described by two f32 points. As it happens, all of these lines can also be exactly represented with f64 points. Since predicates.rs returns the correct answer for lines representable with f64 points, and their values have not changed during conversion, the result is also correct for the f32 input points.

I hope that makes some sense... I'm having trouble finding good words to describe this. Let me know if I should elaborate further; I'd be interested in someone validating this logic.

If the interface supported arbitrary numeric types and internally converted to f64, would the result still make sense?

If the argument above is correct: Yes, the result should still make sense.


¹: The IEEE 754 floating point standard guarantees that converting an f32 to f64 must be exact (Section 5.3). That's a vital cornerstone of the argument above!
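The footnote can be checked directly in Rust: a widening `as` cast from f32 to f64 is exact, so narrowing straight back must recover the identical bit pattern. A minimal sketch (the sample values here are my own choice):

```rust
fn main() {
    // Any f32 value widened to f64 is exact (IEEE 754 §5.3), so
    // narrowing it straight back must recover the identical bits.
    let samples: [f32; 5] = [0.1, 1.0, f32::MIN_POSITIVE, f32::MAX, -3.25e-20];
    for &x in &samples {
        let wide = x as f64;
        assert_eq!((wide as f32).to_bits(), x.to_bits());
    }
}
```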

bluenote10 commented 4 years ago

@Stoeoef I think you are absolutely right. I had a wrong mental model of what a "robust predicate" does. I thought that it would report "is on line" for all the floats that are the best approximation to being on the line. I did a small check, which enumerates a small "grid" of floats (using nextafter), evaluating orient2d for every possible point -- the other two points were the lower-left / upper-right corners of the grid:

[image: grid of orient2d signs, one colored cell per f32 point]

Colors represent greater than / equal to / less than zero. In my mental model, the blue line would make an attempt to connect the lower-left / upper-right corners without gaps, much like Bresenham's line algorithm, i.e., for every x there would be 1 or more y values which are on the line (or vice versa). Such a logic would obviously be quite different for float vs double. But that is nonsense. The key is, as you said:

predicate.rs calculates the mathematical correct answer to an orientation query (left / right / on line)
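The grid check described above can be sketched as follows (my reconstruction, not the original script): enumerate adjacent f32 values via the bit-pattern successor, then classify each grid point against the corner-to-corner diagonal, here with a naive non-robust determinant standing in for orient2d:

```rust
fn next_up(x: f32) -> f32 {
    // successor of a finite positive f32: the next bit pattern up
    f32::from_bits(x.to_bits() + 1)
}

fn orient2d_naive(a: (f32, f32), b: (f32, f32), c: (f32, f32)) -> f32 {
    // naive perp-product determinant -- NOT the robust predicate
    (a.0 - c.0) * (b.1 - c.1) - (a.1 - c.1) * (b.0 - c.0)
}

fn main() {
    let n = 16;
    // n adjacent f32 values starting at 0.5
    let mut grid = vec![0.5f32];
    for _ in 1..n {
        let last = *grid.last().unwrap();
        grid.push(next_up(last));
    }
    let ll = (grid[0], grid[0]);         // lower-left corner
    let ur = (grid[n - 1], grid[n - 1]); // upper-right corner
    let mut zeros = 0;
    for &x in &grid {
        for &y in &grid {
            if orient2d_naive(ll, ur, (x, y)) == 0.0 {
                zeros += 1;
            }
        }
    }
    // On this tiny grid the arithmetic happens to be exact, so only the
    // n points on the exact diagonal report "collinear" -- there is no
    // Bresenham-style gap filling, matching the picture described above.
    assert_eq!(zeros, n);
}
```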

urschrei commented 4 years ago

@frewsxcv Just getting started on this. Two questions:

  1. Where we're currently returning f64 should we retain that or switch to T: Float?
  2. Upcasting from f32 to f64 can fail, which implies that (almost) all functions now have to return Result instead… are we OK with that (although I don't really see a way around it)?
bluenote10 commented 4 years ago

I was also thinking about this. A few thoughts:

Should the choice be a feature flag?

Upcasting from f32 to f64 can fail

I wasn't aware of that. Under what circumstances does it actually fail? As noted in the other thread, I would try to avoid additional calling overhead, because on the fast path (= the majority of point-on-line checks) the function should be close to the native perp product. Note that the native perp product is a really fast operation (~10 cycles, 2.7 ns on my machine), and even on the fast path, just the two additional if branches (the adaptiveness check) seem to make orient2d roughly 5 times slower.
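The fast path and the two extra branches mentioned above can be sketched roughly after Shewchuk's predicates.c. This is an illustration under my reading of that code, not the crate's actual implementation; the exact adaptive fallback is stubbed out:

```rust
fn orient2d_fast(ax: f64, ay: f64, bx: f64, by: f64, cx: f64, cy: f64) -> f64 {
    // the native perp product: two subtractions, two multiplies, one subtract
    let detleft = (ax - cx) * (by - cy);
    let detright = (ay - cy) * (bx - cx);
    let det = detleft - detright;

    // Branch 1: if the two products have opposite signs (or one is zero),
    // the naive sign is already certain.
    let detsum = if detleft > 0.0 {
        if detright <= 0.0 {
            return det;
        }
        detleft + detright
    } else if detleft < 0.0 {
        if detright >= 0.0 {
            return det;
        }
        -detleft - detright
    } else {
        return det;
    };

    // Branch 2: the error-bound ("adaptiveness") check.
    let epsilon = 2f64.powi(-53); // half the ulp of 1.0
    let errbound = (3.0 + 16.0 * epsilon) * epsilon * detsum;
    if det >= errbound || -det >= errbound {
        return det;
    }
    // Slow path: a real implementation escalates to exact arithmetic here.
    unimplemented!("adaptive exact computation")
}

fn main() {
    // clearly counterclockwise, clearly clockwise, exactly collinear
    assert_eq!(orient2d_fast(0.0, 0.0, 1.0, 0.0, 0.5, 1.0), 1.0);
    assert!(orient2d_fast(0.0, 0.0, 1.0, 0.0, 0.5, -1.0) < 0.0);
    assert_eq!(orient2d_fast(0.0, 0.0, 1.0, 0.0, 0.5, 0.0), 0.0);
}
```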

urschrei commented 4 years ago

I wasn't aware of that. Under what circumstances does it actually fail?

https://rust-num.github.io/num/num/trait.ToPrimitive.html#method.to_f64

My mistake, it's actually Option, not Result. I don't know how it can fail and return None – possibly in the presence of NaN? The issue is more that if we use num-traits and upcast we either have to propagate the Option or unwrap, which I suspect we don't want to do.

frewsxcv commented 4 years ago

Hm. Is there anywhere in the code where we need to use f64 or need to use f32? If the user is using an f32 coordinate pair, my instinct is to say we should be doing all calculations with f32s and return an f32, never upcasting or downcasting. Similarly for f64 and any type implementing Float. Is this possible?

urschrei commented 4 years ago

Is this possible?

Oh! I got confused and thought that we were going to use f64 internally, but if the algorithms don't intrinsically require double-precision, then there's no problem.

frewsxcv commented 4 years ago

but if the algorithms don't intrinsically require double-precision, then there's no problem

For the sake of overcommunicating, I'm not familiar enough with the algorithm to know whether this is true or not. I hope it is though!

urschrei commented 4 years ago

I'm not familiar enough with the algorithm

Me neither. I'm just not sure how this is going to interact with the constants, which are defined as f64: we can't multiply a T: Float by an f64, e.g. here, without upcasting.

bluenote10 commented 4 years ago

The issue is more that if we use num-traits and upcast we either have to propagate the Option or unwrap, which I suspect we don't want to do.

To my knowledge upcasting IEEE 754 single precision to double precision can never fail. See for instance here:

Casting from an f32 to an f64 is perfect and lossless

That is one of the reasons why I'm not a fan of the Float trait. It often forces you to bubble up conversion errors that have basically no practical relevance :-/.

If the user is using an f32 coordinate pair, my instinct is to say we should be doing all calculations with f32s and return an f32, never upcasting or downcasting.

I'm pretty sure that Jonathan Shewchuk would cringe if he saw this on f64-capable hardware ;). The whole point of adaptive precision is to take the slow algorithm paths only when hitting the precision limit of the widest native floating point type. If we ran the algorithm in f32, we would constantly run into the slow paths, although we could simply produce a result with twice the precision at almost negligible overhead!

urschrei commented 4 years ago

To my knowledge upcasting IEEE 754 single precision to double precision can never fail. See for instance here […]

OK, so we can upcast and unwrap with confidence? I'm fine with that as long as we also write some careful tests.

What about downcasting on the way back out, though (as in orient2d above)?

bluenote10 commented 4 years ago

I'm not familiar enough with the algorithm to know whether this is true or not.

From my understanding, the SPLITTER and all the epsilons would have to be adapted to f32 (the splitter should split at the corresponding bit, and the machine epsilon is similarly 0.5 * ulp(1)). And the maximum sizes of the expansions would probably have to be doubled.
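Under that reading, both constants scale with the significand width p: the splitter is 2^ceil(p/2) + 1 and epsilon is 2^-p. A sketch (these formulas are my derivation from Shewchuk's exactinit, not code from the crate):

```rust
// For a p-bit significand: splitter = 2^ceil(p/2) + 1, epsilon = 2^-p.
fn constants(p: i32) -> (f64, f64) {
    let splitter = 2f64.powi((p + 1) / 2) + 1.0; // (p + 1) / 2 == ceil(p / 2)
    let epsilon = 2f64.powi(-p);
    (splitter, epsilon)
}

fn main() {
    let (split64, eps64) = constants(53); // f64: 53-bit significand
    assert_eq!(split64, 134217729.0);     // 2^27 + 1, as in predicates.c
    assert_eq!(eps64, f64::EPSILON / 2.0); // 0.5 * ulp(1.0)

    let (split32, eps32) = constants(24); // f32: 24-bit significand
    assert_eq!(split32, 4097.0);          // 2^12 + 1
    assert_eq!(eps32 as f32, f32::EPSILON / 2.0);
}
```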

bluenote10 commented 4 years ago

OK, so we can upcast and unwrap with confidence?

That's the problem with the trait: it hides properties like the perfect casting of single to double precision. For f32 and f64 everything is fine, but custom implementations may fail. It's such a rare use case that it would be a pity to butcher interfaces and degrade performance to support it.

What about:

bluenote10 commented 4 years ago

What about downcasting on the way back out, though (as in orient2d above)?

Tricky question. I guess I wouldn't even do that: the return value of orient2d has exactly the precision that is needed to make an exact comparison to zero. I'm not sure if it's a valid concern, but technically it would be possible to have a non-zero f64 which becomes zero when cast to f32, leading to a wrong point-on-line result. If we internally make the effort of computing it to a higher precision, it makes sense to offer the result without any loss.
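The concern is easy to demonstrate: any f64 smaller in magnitude than f32's smallest subnormal (about 1.4e-45) narrows to zero, so a downcast could turn a definite "left of line" verdict into a spurious "on the line". A minimal check:

```rust
fn main() {
    // A nonzero f64 below f32's subnormal range flushes to zero
    // when narrowed with `as`.
    let tiny: f64 = 1.0e-60; // well below f32::MIN_POSITIVE (~1.18e-38)
    assert!(tiny != 0.0);
    assert_eq!(tiny as f32, 0.0);
}
```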

Stoeoef commented 4 years ago

IMHO the output value of orient2d should always be f64. Downcasting the result can lose information and is not really helpful.

As it happens, the exact returned value is really not that important. More important is the information whether the points are oriented clockwise, counterclockwise, or collinear. Spade introduced its own return type for this problem (EdgeSideInfo, docs.rs). Maybe this would be a good part of the public interface? It is similar to std::cmp::Ordering, just for comparing three vertices instead of two numbers.
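A sketch of such a return type (the names here are mine, not Spade's actual EdgeSideInfo API): classify the sign of the predicate value, analogous to how std::cmp::Ordering classifies a two-number comparison.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq, Eq)]
enum Orientation {
    Clockwise,
    CounterClockwise,
    Collinear,
}

// Map the sign of an orient2d result onto the three possible answers.
fn classify(det: f64) -> Orientation {
    match det.partial_cmp(&0.0) {
        Some(Ordering::Greater) => Orientation::CounterClockwise,
        Some(Ordering::Less) => Orientation::Clockwise,
        _ => Orientation::Collinear, // exact zero (or NaN input)
    }
}

fn main() {
    assert_eq!(classify(2.5), Orientation::CounterClockwise);
    assert_eq!(classify(-1.0e-300), Orientation::Clockwise);
    assert_eq!(classify(0.0), Orientation::Collinear);
}
```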

bluenote10 commented 4 years ago

As it happens, the exact returned value is really not important.

Actually the value has its purpose, because it is twice the signed area of the triangle defined by the 3 points -- or at least an approximation of that.

A possible use case of this signed area is computing the distance of a point to a line (divide the result obtained from orient2d by the line segment length). In this case the adaptive nature of orient2d comes in handy: internally, the area is computed with increasing precision as its value gets closer to 0, so we get a better distance-to-line approximation if a point actually is close to a line.

This feature is probably required for rust-geo-booleanop, so it would be nice to keep it in the interface.
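That use case can be sketched as follows, with a naive determinant standing in for the robust orient2d: the determinant is twice the signed triangle area, and twice the area equals base times height, so dividing by the base length yields the signed distance.

```rust
// naive perp-product determinant -- a stand-in for the robust orient2d
fn orient2d_naive(a: (f64, f64), b: (f64, f64), p: (f64, f64)) -> f64 {
    (a.0 - p.0) * (b.1 - p.1) - (a.1 - p.1) * (b.0 - p.0)
}

// Signed distance from p to the infinite line through a and b:
// det = 2 * area = |ab| * height  =>  height = det / |ab|.
fn signed_distance(a: (f64, f64), b: (f64, f64), p: (f64, f64)) -> f64 {
    let len = ((b.0 - a.0).powi(2) + (b.1 - a.1).powi(2)).sqrt();
    orient2d_naive(a, b, p) / len
}

fn main() {
    // Point (0, 3) is 3 units above the x-axis (the line through (0,0)-(1,0)).
    let d = signed_distance((0.0, 0.0), (1.0, 0.0), (0.0, 3.0));
    assert_eq!(d, 3.0); // positive: the point lies to the left of a -> b
}
```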

urschrei commented 4 years ago

Just to clarify, @bluenote10 – you're OK with an f64 return type, though?

bluenote10 commented 4 years ago

@urschrei Yes, from my perspective f64 is the most reasonable return type.

urschrei commented 4 years ago

Closing, as https://github.com/georust/robust/pull/9 has been merged.