AngusJohnson / Clipper2

Polygon Clipping and Offsetting - C++, C# and Delphi
Boost Software License 1.0

FP exception in GetIntersectPoint() #317

Closed bahvalo closed 1 year ago

bahvalo commented 1 year ago

Executing the following code results in a floating point exception in the function Point64 GetIntersectPoint(const Active& e1, const Active& e2). Specifically, a double value that is cast to int64_t appears to be outside the range of int64_t.

The coordinates of the polygon vertices are within the range -4.6e18 ... 4.6e18. According to the documentation, that is a valid input.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fenv.h>
#include <glob.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

    Path64 a, b;
    ADD(a,          -7009388980LL,     1671079308836362LL);
    ADD(a,  -576460711222920704LL,     1671063793410802LL);
    ADD(a,  -864690986154904320LL,  -286977111392363424LL);
    ADD(a, -1152921411648951168LL,  -575625207815834176LL);
    ADD(a,  -864691057648438912LL,  -864273328105026304LL);
    ADD(a,  -576460762799345152LL, -1152921434302619648LL);
    ADD(a,         -12880478468LL, -1152921451122536832LL);
    ADD(a,   576460787720869760LL, -1152921504606846976LL);
    ADD(a,   864691268131862400LL,  -864273433561362176LL);
    ADD(a,  1152921504606846848LL,  -575625211080225088LL);
    ADD(a,   864691062448491520LL,  -286977010832613792LL);
    ADD(a,   576460654512679296LL,     1671130833237401LL);

    ADD(b,   -54234486065476976LL,  1151250415789406208LL);
    ADD(b,  -612617068314194048LL,  1152921504606846976LL);
    ADD(b,  -864691197085273984LL,   867615548203339008LL);
    ADD(b, -1152921504606846848LL,   578967376389736064LL);
    ADD(b,  -864691140504451968LL,   290319223463759488LL);
    ADD(b,  -576460711222912256LL,     1671063793410802LL);
    ADD(b,          -7009388980LL,     1671079308836362LL);
    ADD(b,   576460654512679296LL,     1671130833237401LL);
    ADD(b,   864690908098943872LL,   290319324023499392LL);
    ADD(b,  1100313577101746304LL,   576428327042791872LL);
    ADD(b,   785779376874207360LL,   863806873623168128LL);
    ADD(b,   487696647658790656LL,  1150382388220066304LL);

    const int m = 10;
    for(size_t i=0; i<a.size(); i++) { a[i].x /= m; a[i].y /= m; b[i].x /= m, b[i].y /= m; }

    Paths64 AA; AA.push_back(a);
    Paths64 BB; BB.push_back(b);
    Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);
}

Division by ten is not essential.

AngusJohnson commented 1 year ago

I couldn't find feenableexcept in MS Visual Studio C++. I tried feraiseexcept instead, but it wasn't raising overflow exceptions. Nevertheless, hopefully this is fixed now.

sergey-239 commented 1 year ago

Angus,

8236cc58f80948e7e740f20c3ce8211c04bfe0e6 fixes the issue. Don't you think it would be useful to include feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW ); in the CI tests so that FP errors can be caught? It might need to be #ifdef-ed, as it looks platform-dependent.

AngusJohnson commented 1 year ago

sergey, I'd certainly be happy to do that if I could find comparable code for other platforms (eg Windows).

Edit: I did try feraiseexcept but misunderstood how to use it :(.
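Editor's sketch: where feenableexcept is unavailable, one portable option is to poll the sticky exception flags with the standard <cfenv> functions instead of trapping. This is only an illustration; FpFlagsRaisedBy and DivisionByZeroIsFlagged are hypothetical names, not part of Clipper2 or its tests, and the approach assumes the compiler does not fold the floating-point work away at compile time (hence the volatile variables).

```cpp
#include <cfenv>

// Hypothetical helper (not part of Clipper2): runs fn and returns the subset
// of FE_INVALID / FE_DIVBYZERO / FE_OVERFLOW that was raised while it ran.
template <typename Fn>
int FpFlagsRaisedBy(Fn fn) {
    const int mask = FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW;
    std::feclearexcept(mask);        // start with the flags of interest cleared
    fn();                            // code under test
    return std::fetestexcept(mask);  // sticky flags that are now set
}

// Example use: a division by zero should leave FE_DIVBYZERO set.
inline bool DivisionByZeroIsFlagged() {
    volatile double zero = 0.0;      // volatile keeps the division at run time
    volatile double sink = 0.0;
    const int flags = FpFlagsRaisedBy([&] { sink = 1.0 / zero; });
    return (flags & FE_DIVBYZERO) != 0;
}
```

Unlike feenableexcept, this does not stop at the faulting instruction; it only reports afterwards that something went wrong, which may still be enough for a CI check.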

alexisnaveros commented 1 year ago

Hey, x86/amd64 SSE intrinsics provide a generic way to access the control register without inline assembly or other wrappers.

_MM_SET_EXCEPTION_STATE( _MM_EXCEPT_INVALID | _MM_EXCEPT_DIV_ZERO | _MM_EXCEPT_OVERFLOW );

That's just a macro calling _mm_getcsr() and _mm_setcsr() under the hood, to conveniently preserve the other flags (rounding mode, denormals, etc.).

Remember to #include <xmmintrin.h> and that's good to go on all x86/amd64 platforms.

sergey-239 commented 1 year ago

Hey, x86/amd64 SSE intrinsics provide a generic way to access the control register without inline assembly or other wrappers.

_MM_SET_EXCEPTION_STATE( _MM_EXCEPT_INVALID | _MM_EXCEPT_DIV_ZERO | _MM_EXCEPT_OVERFLOW );

Not exactly: the macro is _MM_SET_EXCEPTION_MASK, but MSVC has more appropriate functions for this. Also, it affects only the SSE control register (MXCSR), not the x87 FPU's.

Have been playing a lot with different settings and found the following:

1) gcc allows the use of either the FPU or the SSE instruction set for FP calculations (a mix of both is also available, but it is experimental and so not relevant at present). I did not find any specific MSVC flag that controls which instruction set is used for FP calculations, apart from /arch, which probably enables SSE instructions for FP as well.
2) The FPU uses 80-bit precision internally, while SSE uses 64-bit for double. This FPU feature is adjustable, but by default it is set to 80-bit mode. This could lead (and does, see 4) to different results from Clipper2 operations depending on the instruction set in use, the platform, etc. Even compiler optimisations may result in different output; see the next point.
3) Even the basic set of optimisations (just -O with gcc) eliminates the exception with the dataset in the OP when the calculations are done on the FPU (I am working with commit a2036d2).
4) The ConsoleDemo1 benchmark produces different output for the FPU and SSE instruction sets. SSE is approximately 4% faster than the FPU on my notebook.
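Editor's sketch: a quick way to see which evaluation mode a given build uses is the standard FLT_EVAL_METHOD macro from <cfloat>. The helper name below is hypothetical, and the comments about "typical" targets are general expectations rather than anything measured in this thread.

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: describes how this build evaluates floating-point
// expressions, based on the standard FLT_EVAL_METHOD macro.
inline std::string DescribeFloatEvalMethod() {
    switch (FLT_EVAL_METHOD) {
        case 0:  return "each type is evaluated in its own precision (typical for SSE2 builds)";
        case 1:  return "float and double are evaluated as double";
        case 2:  return "everything is evaluated as long double (typical for x87 builds)";
        default: return "indeterminate or implementation-defined";
    }
}
```

Printing this string from a test binary built with, for example, -mfpmath=387 and again with -mfpmath=sse is one way to confirm which code path a benchmark actually exercised.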

Still looking into the issue...

bahvalo commented 1 year ago

Thank you for the quick response. Your commit fixes the FPE for the data above, but I have another data set where an FPE occurs.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fenv.h>
#include <glob.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

    Path64 a, b;
    ADD(a,5873531643786437LL,-10907856004334190LL);
    ADD(a,-572063233247808512LL,-10907856357909624LL);
    ADD(a,-861031614229043072LL,-295680892812377088LL);
    ADD(a,-1149999997847394560LL,-580453927885307520LL);
    ADD(a,-861031615255601024LL,-865226963295435776LL);
    ADD(a,-572063233981697600LL,-1149999998685980416LL);
    ADD(a,5873531569489655LL,-1149999999013396096LL);
    ADD(a,583810297809434496LL,-1150000000000004736LL);
    ADD(a,872778682431101440LL,-865226965266477056LL);
    ADD(a,1149999999999995904LL,-556893216519906624LL);
    ADD(a,855158084855340160LL,-260339823793244128LL);
    ADD(a,572063232808458240LL,12652856321515572LL);
    ADD(b,5873531342419534LL,1128184286593806848LL);
    ADD(b,-572063234332976384LL,1128184286913500800LL);
    ADD(b,-861031618484528512LL,843411252051976192LL);
    ADD(b,-1150000000000004352LL,558638215679646656LL);
    ADD(b,-861031617158135040LL,273865179743011296LL);
    ADD(b,-572063233247800064LL,-10907856357909624LL);
    ADD(b,5873531643786437LL,-10907856004334190LL);
    ADD(b,572063232808458240LL,12652856321515572LL);
    ADD(b,855158081926256640LL,309206248762144256LL);
    ADD(b,1110168977794405120LL,604014641445566208LL);
    ADD(b,813032149122702592LL,876134821681722496LL);
    ADD(b,543979277405183232LL,1149999999999995136LL);

    const int m = 1000;
    for(size_t i=0; i<a.size(); i++) { a[i].x /= m; a[i].y /= m; b[i].x /= m, b[i].y /= m; }

    Paths64 AA; AA.push_back(a);
    Paths64 BB; BB.push_back(b);
    Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);
}
sergey-239 commented 1 year ago

@AngusJohnson, the code from the above message also passes through without exception when compiled with -O -mfpmath=387... options using commit a2036d2

AngusJohnson commented 1 year ago
    Point64 GetIntersectPoint(const Active& e1, const Active& e2)
    {
      if ((std::abs(e1.dx) > 1e-5 && std::abs(e2.dx) > 1e-5) ||  
        std::abs(q) < 1e-5) return GetEndE1ClosestToEndE2(e1, e2); // almost parallel
        ...
    }
bahvalo commented 1 year ago

I no longer get exceptions, but now the area of the polygon intersection can be substantially inaccurate.

For the following example, the old version of the library gives S = 3.234567e+32, which seems to be the correct result. The new version gives S = 8.100581e+34.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fenv.h>
#include <glob.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

    Path64 a,b;
    ADD(a,1149999999999999872LL,-229296316200567744LL);
    ADD(a,1149221101861566976LL,-227336697928099232LL);
    ADD(a,968861260885853056LL,226427991054104864LL);
    ADD(a,614267143917128320LL,221171937614650112LL);
    ADD(a,261697222618302048LL,221187051938477248LL);
    ADD(a,88475044460250816LL,-235198106601525184LL);
    ADD(a,-88830866724475696LL,-691613528989746176LL);
    ADD(a,88475009902961744LL,-1148028933777700480LL);
    ADD(a,89240719470182928LL,-1150000000000015488LL);
    ADD(b,89005309387580320LL,-234740471586376544LL);
    ADD(b,262225952131213568LL,221645974159703744LL);
    ADD(b,82876673529597968LL,678044914286996864LL);
    ADD(b,-98488636073593504LL,1139574121595159040LL);
    ADD(b,-452066572919640704LL,1142151749566806400LL);
    ADD(b,-803550496675935104LL,1149999999999984512LL);
    ADD(b,-974716117373909248LL,693683731894266112LL);
    ADD(b,-1149999999999999872LL,237307541261536192LL);
    ADD(b,-971740827905732608LL,-226886938540897120LL);
    ADD(b,-795463157809728640LL,-685921522829901184LL);
    ADD(b,-442910854088512768LL,-691159864242905472LL);
    ADD(b,-88299066280856784LL,-691157211526877184LL);

    Paths64 AA; AA.push_back(a);
    Paths64 BB; BB.push_back(b);
    Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);

    double S = 0.0;
    for(size_t j=0; j<solution.size(); j++) S += Area(solution[j]);
    printf("S = %e\n", S);
}
AngusJohnson commented 1 year ago

I'm getting 3.234567e+32, but I've also updated the code to avoid those time-expensive calls to std::fabs().

    union eight { int64_t ui64; double d; };
    inline bool IsLarge(double val)
    {
        //https://en.wikipedia.org/wiki/Double-precision_floating-point_format
        eight e; e.d = val;
        return (e.ui64 & 0x4000000000000000LL &&  //exponent is positive 
            ((e.ui64 & 0x3FFFFFFFFFFFFFFFLL) >> 56) > 0); // exponent > 5
    }

    inline bool IsSmall(double val)
    {
        eight e; e.d = val;
        return !(e.ui64 & 0x4000000000000000LL) &&  //exponent is negative 
            (e.ui64 & 0x3F00000000000000LL) != 0x3F00000000000000LL; //exponent < -5
    }

    Point64 GetIntersectPoint(const Active& e1, const Active& e2)
    {
        double b1, b2, q = (e1.dx - e2.dx);
        if ((IsLarge(e1.dx) && IsLarge(e2.dx)) || IsSmall(q))
            return GetEndE1ClosestToEndE2(e1, e2); // almost parallel

Edit: I've just figured out why the std::fabs() code above was so slow... because it should have been

 if ((std::fabs(e1.dx) > 1e+5 && std::fabs(e2.dx) > 1e+5) ||  std::abs(q) < 1e-5) 
  return GetEndE1ClosestToEndE2(e1, e2); // almost parallel

Somehow, my translation from the Delphi code got mucked up. Anyhow, the std::fabs() code is a lot cleaner and marginally faster than my IEEE754 double hack.
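Editor's sketch: the effect of the mistyped threshold is easy to reproduce in isolation. In the snippets of this thread, dx is the change in x per unit of y, so a huge |dx| means a nearly horizontal edge. The two function names below are made up for illustration; only the conditions are taken from the code quoted above.

```cpp
#include <cmath>

// Condition as first posted: with "> 1e-5", almost any pair of non-vertical
// edges is classified as "almost parallel".
inline bool TreatAsParallelBuggy(double dx1, double dx2) {
    const double q = dx1 - dx2;
    return (std::fabs(dx1) > 1e-5 && std::fabs(dx2) > 1e-5) || std::fabs(q) < 1e-5;
}

// Corrected condition: only two nearly horizontal edges (|dx| > 1e+5) or two
// edges with almost equal dx take the fallback path.
inline bool TreatAsParallelFixed(double dx1, double dx2) {
    const double q = dx1 - dx2;
    return (std::fabs(dx1) > 1e+5 && std::fabs(dx2) > 1e+5) || std::fabs(q) < 1e-5;
}
```

For two ordinary edges such as dx1 = 0.5 and dx2 = -0.3, the first version takes the fallback and the second does not, which would presumably account for the wrong area reported earlier.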

sergey-239 commented 1 year ago

Somehow, my translation from the Delphi code got mucked up. Anyhow, the std::fabs() code is a lot cleaner and marginally faster than my IEEE754 double hack.

Sometimes things are not obvious. My first thought when I looked at InsertScanLine was: "My God, insertion into and deletion from a priority queue are both O(log(n)), while we only need to check the topmost value once. Let's do it faster!" So I reimplemented it with a simple vector: binary search on insertion, skipping duplicates, then picking up the tail value by just resizing the vector to (size-1). The timing was the same: both approaches actually have an overall complexity of O(n log(n)) :)
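Editor's sketch: the vector-based variant described above could look roughly like this. It is reconstructed from the description only; the class and method names are invented and do not appear in Clipper2.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical replacement for a max-priority queue of scanline y values:
// a vector kept in ascending order, without duplicates.
class ScanlineStack {
public:
    // Binary search for the insertion point; skip values already present.
    void Insert(std::int64_t y) {
        auto it = std::lower_bound(ys_.begin(), ys_.end(), y);
        if (it == ys_.end() || *it != y) ys_.insert(it, y);
    }

    // Removes and returns the largest value, i.e. the tail of the vector.
    bool PopTop(std::int64_t& y) {
        if (ys_.empty()) return false;
        y = ys_.back();
        ys_.pop_back();
        return true;
    }

private:
    std::vector<std::int64_t> ys_;  // ascending order
};
```

The search is logarithmic, but inserting into the middle of a vector still shifts the elements behind the insertion point, which is one more reason the measured timing need not beat a priority queue.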

bahvalo commented 1 year ago

With (std::fabs(e1.dx) > 1e+5 && std::fabs(e2.dx) > 1e+5) this test successfully passes, thank you.

But I observe another strange behavior.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fenv.h>
#include <glob.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

    Path64 a,b;
    ADD(a,-862500000000000128LL,-559553118062537408LL);
    ADD(a,-575000000000000256LL,-1119106236125074304LL);
    ADD(a,-274LL,-1119106236125074048LL);
    ADD(a,574999999999999872LL,-1119106236125068544LL);
    ADD(a,862500000000000128LL,-559553118062528896LL);
    ADD(a,1150000000000000000LL,5370LL);
    ADD(a,862500000000000128LL,559553118062536832LL);
    ADD(a,575000000000000256LL,1119106236125074304LL);
    ADD(a,274LL,1119106236125074432LL);
    ADD(a,-574999999999999872LL,1119106236125068544LL);
    ADD(a,-862500000000000128LL,559553118062528320LL);
    ADD(a,-1150000000000000000LL,-5961LL);
    ADD(b,-836534242197799424LL,-607292355260451712LL);
    ADD(b,-524505684941387712LL,-1149999999999999872LL);
    ADD(b,49775714785717888LL,-1117707644739548288LL);
    ADD(b,624057114512823424LL,-1085415289479091328LL);
    ADD(b,886309956983517184LL,-510415289479088832LL);
    ADD(b,1148562799454211072LL,64584710520908288LL);
    ADD(b,836534242197799424LL,607292355260451200LL);
    ADD(b,524505684941387648LL,1149999999999999872LL);
    ADD(b,-49775714785717928LL,1117707644739548672LL);
    ADD(b,-624057114512823424LL,1085415289479091328LL);
    ADD(b,-886309956983517184LL,510415289479088256LL);
    ADD(b,-1148562799454211072LL,-64584710520908880LL);

    for(int m=0; m<=20; m++) {
        for(size_t i=0; i<a.size(); i++) { a[i].x /= 2; a[i].y /= 2; }
        for(size_t i=0; i<b.size(); i++) { b[i].x /= 2, b[i].y /= 2; }

        Paths64 AA; AA.push_back(a);
        Paths64 BB; BB.push_back(b);
        Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);

        double S = 0.0;
        for(size_t j=0; j<solution.size(); j++) S += Area(solution[j]);
        printf("S = %e\n", S*(1<<m)*(1<<m));
    }
}

The result should be approximately the same for each m, because I just scale the polygons. However, I get S=9.541817e+35 for m=0,...,5 and S=9.520876e+35 for m=6,...,19. The last result seems to be correct.

Probably, this is another issue. I'm not sure.
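Editor's sketch: the scaling argument relies on area growing with the square of the scale factor, which is why the test multiplies by (1<<m) twice. A minimal stand-alone check of that property is shown below; Pt, ShoelaceArea and Halved are illustrative names, not Clipper2 API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pt { std::int64_t x, y; };  // stand-in for Point64

// Signed polygon area by the shoelace formula, accumulated in double.
inline double ShoelaceArea(const std::vector<Pt>& p) {
    if (p.size() < 3) return 0.0;
    double twice = 0.0;
    for (std::size_t i = 0, j = p.size() - 1; i < p.size(); j = i++) {
        twice += static_cast<double>(p[j].x) * static_cast<double>(p[i].y)
               - static_cast<double>(p[i].x) * static_cast<double>(p[j].y);
    }
    return 0.5 * twice;
}

// Halves every coordinate with integer division, as the test loop does.
inline std::vector<Pt> Halved(std::vector<Pt> p) {
    for (Pt& q : p) { q.x /= 2; q.y /= 2; }
    return p;
}
```

For a 4 x 4 square the halved polygon has a quarter of the area, so multiplying by 4 recovers the original value exactly. With odd coordinates the integer division truncates, so the rescaled areas agree only approximately.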

AngusJohnson commented 1 year ago

I think we're almost there now ...

    Point64 GetIntersectPoint(const Active& e1, const Active& e2)
    {
        double b1, b2, q = (e1.dx - e2.dx);

        if (std::fabs(q) < 1e-5) 
            return GetEndE1ClosestToEndE2(e1, e2); //parallel ?? error
        else if (std::fabs(e1.dx) > 1e5)
        {
            Point64 result;
            result.y = (e1.bot.y + e1.top.y) / 2;
            b2 = e2.top.y * e2.dx - e2.top.x;
            result.x = static_cast<int64_t>(result.y * e2.dx - b2);
            return result;
        }
        else if (std::fabs(e2.dx) > 1e5)
        {
            Point64 result;
            result.y = (e2.bot.y + e2.top.y) / 2;
            b1 = e1.top.y * e1.dx - e1.top.x;
            result.x = static_cast<int64_t>(result.y * e1.dx - b1);
            return result;
        }
        else if (e1.dx == 0)
        {
            b2 = e2.bot.y - (e2.bot.x / e2.dx);
            return Point64(e1.bot.x,
                static_cast<int64_t>(std::round(e1.bot.x / e2.dx + b2)));
        }
        else if (e2.dx == 0)
        {
            b1 = e1.bot.y - (e1.bot.x / e1.dx);
            return Point64(e2.bot.x,
                static_cast<int64_t>(std::round(e2.bot.x / e1.dx + b1)));
        }
        else
        {
            b1 = e1.bot.x - e1.bot.y * e1.dx;
            b2 = e2.bot.x - e2.bot.y * e2.dx;

            q = (b2 - b1) / q;
            if (std::fabs(e1.dx) < std::fabs(e2.dx)) // std::fabs avoids the integer abs() overload
            {
                return Point64(static_cast<int64_t>(e1.dx * q + b1),
                    static_cast<int64_t>((q)));
            }
            else
            {
                return Point64(static_cast<int64_t>(e2.dx * q + b2),
                    static_cast<int64_t>((q)));
            }
        }
    }
bahvalo commented 1 year ago

Next.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fenv.h>
#include <glob.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

    Path64 a,b;
    ADD(a,862513556575282304LL,862497692244565504LL);
    ADD(a,575015864402403840LL,1149991727225358080LL);
    ADD(a,2162666575833LL,1149997833368111232LL);
    ADD(a,-575000958826813696LL,1149999999999998720LL);
    ADD(a,-862511678268535552LL,862510101406525312LL);
    ADD(a,-1150000000000000000LL,575005683626717760LL);
    ADD(a,-862496399093191040LL,287506358808189728LL);
    ADD(a,-574991769015936960LL,1092469489161LL);
    ADD(a,668207554827LL,3214670683646LL);
    ADD(a,574983678858186304LL,8822033569993LL);
    ADD(a,862466772717976960LL,287517953154307328LL);
    ADD(a,1149992271103332352LL,575005140416155520LL);
    ADD(b,862483595649010176LL,-287494701724283424LL);
    ADD(b,574983678858169408LL,8822033569993LL);
    ADD(b,668207554827LL,3214670683646LL);
    ADD(b,-574991769015928576LL,1092469491591LL);
    ADD(b,-862479576162166272LL,-287506296070403456LL);
    ADD(b,-1149987266264005248LL,-575000772389782592LL);
    ADD(b,-862492577664547712LL,-862499582618224000LL);
    ADD(b,-574998567007965248LL,-1149993459671747584LL);
    ADD(b,3472721893746LL,-1149995326993346048LL);
    ADD(b,575013251381925184LL,-1150000000000001152LL);
    ADD(b,862525149920304896LL,-862509393110609280LL);
    ADD(b,1150000000000000000LL,-574999583153962688LL);

    for(int m=0; m<=20; m++) {
        for(size_t i=0; i<a.size(); i++) { a[i].x /= 2; a[i].y /= 2; }
        for(size_t i=0; i<b.size(); i++) { b[i].x /= 2, b[i].y /= 2; }

        Paths64 AA; AA.push_back(a);
        Paths64 BB; BB.push_back(b);
        Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);

        double S = Area(solution);
        printf("S = %e\n", S*(1<<m)*(1<<m));
    }
}

For m=0,...,13, I get S=2.015106e+29; for m=14,...,20 I get S=0. The latter seems correct.

AngusJohnson commented 1 year ago

That one is OK.

When there's enough scaling the three almost horizontal vertices that were very slightly overlapping at full scale become identical at smaller scales (so they no longer overlap).

bahvalo commented 1 year ago

It is very possible that the full-scale polygons do overlap and the small-scale polygons do not. But when we move a single vertex, the area changes continuously. So in this case it should not be bigger than 1e24 (my algorithm gives the value 2.8e21, but it is not reliable). The value 2e29 is definitely in error.

Update: without the recent patches, Clipper2 gives values from zero to 2.35e21. I consider all of these results correct.

AngusJohnson commented 1 year ago

It is very possible that the full-scale polygons do overlap and the small-scale polygons do not. But when we move a single vertex, the area changes continuously.

That's to be expected given the geometry of your polygons. The overlap region is enormously wide but has almost no height so any rounding that alters the height could almost double (or halve) the overlap area. And the effects of rounding (and variations in area measurement) will be most apparent just before your scaling loop returns an empty solution.

bahvalo commented 1 year ago

Let me explain what I expect from your library.

If an X or Y coordinate of a vertex is changed by 1, the area will change by at most 4.6e18, which is the maximal admissible coordinate value. In my example, we have 24 input vertices and some new vertices that appear as a result of the intersection. The coordinates of all these vertices are subject to rounding, so it is hardly possible to get an accuracy better than 1e20. I don't expect Clipper to compute the area of intersection with an accuracy of 1e20, and an error like 1e23 seems acceptable to me. But with the latest code the error is 10^6 times bigger.

You are right that the effects of rounding are most apparent when the area of intersection is small. For instance, instead of 1e20 we can easily get 1e21 or zero. But this is only if we are looking at the relative values. Looking at the absolute error of the intersection area, we will say that both 1e21 and zero are OK, but 2e29 is not OK.
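Editor's sketch: the bound used in this argument can be checked on a small triangle. Moving one vertex by 1 unit changes the area by at most half the distance between its two neighbours, so with coordinates limited to about ±4.6e18 a single unit step changes the area by at most about 4.6e18. TriangleArea is an illustrative helper, not a Clipper2 function.

```cpp
#include <cmath>
#include <cstdint>

// Unsigned area of the triangle (ax,ay), (bx,by), (cx,cy).
inline double TriangleArea(std::int64_t ax, std::int64_t ay,
                           std::int64_t bx, std::int64_t by,
                           std::int64_t cx, std::int64_t cy) {
    const double cross =
        static_cast<double>(bx - ax) * static_cast<double>(cy - ay) -
        static_cast<double>(cx - ax) * static_cast<double>(by - ay);
    return 0.5 * std::fabs(cross);
}
```

With a base from (0,0) to (1000,0), raising the apex from (500,1) to (500,2) doubles the area from 500 to 1000. The relative change is large, but the absolute change is exactly half the base length, which is the distinction between relative and absolute error drawn above.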

AngusJohnson commented 1 year ago

If an X or Y coordinate of a vertex is changed by 1, the area will change at most by 4.6e18, which is the maximal admissible coordinate value.

I agree with you.

But with the latest code the error is 10^6 times bigger.

OK, I'll have another look 😁.

AngusJohnson commented 1 year ago

I've had another look and can't find a problem.

And here are the areas I get when running your very slightly modified test ...

2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
2.01511e+29
0
0
0
0
0
0
0

Perhaps you may have missed something in the numerous code iterations above. Or perhaps I missed documenting something. Anyhow, I've attached the amended code (together with your slightly modified test). Clipper2Lib_Test.zip

bahvalo commented 1 year ago

Yes, this is the result I get. Do you consider it to be correct?

If I remove the code if (std::fabs(e1.dx) > 1e5) ... else if (std::fabs(e2.dx) > 1e5) ... else, then I get

1.74797e+20
1.74797e+20
1.74797e+20
1.75947e+20
1.77097e+20
1.79397e+20
1.83997e+20
1.83997e+20
1.83996e+20
2.20795e+20
2.94393e+20
5.88786e+20
1.17757e+21
2.35515e+21
0
0
0
0
0
0
0

which seems perfectly fine to me. I don't know the exact area of intersection, but I guess that all the errors of the area evaluation do not exceed 1e22.

AngusJohnson commented 1 year ago

solution (without scaling): 287493241252219040,3009176063409, 287491839429084704,4411016784996, 334103777413,1607335341823 area= 2.01511e+29 (and area calculation verified at https://www.omnicalculator.com/math/irregular-polygon-area )
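Editor's sketch: the quoted area can also be cross-checked locally with the shoelace formula applied to the three vertices listed above. The function name is hypothetical; subtracting the third vertex first keeps the intermediate values small before converting to double.

```cpp
#include <cmath>
#include <cstdint>

// Area of the solution triangle quoted above, via a single cross product.
inline double QuotedSolutionArea() {
    const std::int64_t x1 = 287493241252219040LL, y1 = 3009176063409LL;
    const std::int64_t x2 = 287491839429084704LL, y2 = 4411016784996LL;
    const std::int64_t x3 = 334103777413LL,       y3 = 1607335341823LL;
    const double cross =
        static_cast<double>(x1 - x3) * static_cast<double>(y2 - y3) -
        static_cast<double>(x2 - x3) * static_cast<double>(y1 - y3);
    return 0.5 * std::fabs(cross);
}
```

This agrees with the reported 2.01511e+29 to the printed precision. It confirms only that the area of the returned triangle was measured correctly, not that the triangle is the right intersection.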

bahvalo commented 1 year ago

You are right, my considerations do not prove that the result 2.01e29 is wrong. I'm sure that it is wrong and will try to explain this later.

By the way, I constructed another example that generates FPE.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fenv.h>
#include <glob.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

    const long int M = 10000000;

    Path64 a,b;
    ADD(a,   0,    0);
    ADD(a, M*M,    2);
    ADD(a, M*M,  M*M);
    ADD(a,   0,  M*M);
    ADD(b,   0,   -1);
    ADD(b, M*M,    M);
    ADD(b, M*M, -M*M);
    ADD(b,   0, -M*M);

    Paths64 AA; AA.push_back(a);
    Paths64 BB; BB.push_back(b);
    Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);
    printf("S = %e\n", Area(solution));
}
AngusJohnson commented 1 year ago

Yeah, I need to test dx == 0 before almost everything else in GetIntersectPoint, but it's bedtime, so more testing and the fix upload won't happen until tomorrow.

AngusJohnson commented 1 year ago

Hopefully this is all fixed now 🤞.

bahvalo commented 1 year ago

My last case still generates an FPE, or throws an exception if CHECK_OVERFLOW is defined.

bahvalo commented 1 year ago

And I slightly modified the previous case. Now I scale only X.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fenv.h>
#include <glob.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

    Path64 a,b;
    ADD(a,862513556575282304LL,862497692244565504LL);
    ADD(a,575015864402403840LL,1149991727225358080LL);
    ADD(a,2162666575833LL,1149997833368111232LL);
    ADD(a,-575000958826813696LL,1149999999999998720LL);
    ADD(a,-862511678268535552LL,862510101406525312LL);
    ADD(a,-1150000000000000000LL,575005683626717760LL);
    ADD(a,-862496399093191040LL,287506358808189728LL);
    ADD(a,-574991769015936960LL,1092469489161LL);
    ADD(a,668207554827LL,3214670683646LL);
    ADD(a,574983678858186304LL,8822033569993LL);
    ADD(a,862466772717976960LL,287517953154307328LL);
    ADD(a,1149992271103332352LL,575005140416155520LL);
    ADD(b,862483595649010176LL,-287494701724283424LL);
    ADD(b,574983678858169408LL,8822033569993LL);
    ADD(b,668207554827LL,3214670683646LL);
    ADD(b,-574991769015928576LL,1092469491591LL);
    ADD(b,-862479576162166272LL,-287506296070403456LL);
    ADD(b,-1149987266264005248LL,-575000772389782592LL);
    ADD(b,-862492577664547712LL,-862499582618224000LL);
    ADD(b,-574998567007965248LL,-1149993459671747584LL);
    ADD(b,3472721893746LL,-1149995326993346048LL);
    ADD(b,575013251381925184LL,-1150000000000001152LL);
    ADD(b,862525149920304896LL,-862509393110609280LL);
    ADD(b,1150000000000000000LL,-574999583153962688LL);

    for(size_t i=0; i<a.size(); i++) { a[i].x /= 2; a[i].y /= 2; }
    for(size_t i=0; i<b.size(); i++) { b[i].x /= 2, b[i].y /= 2; }

    for(int m=0; m<=1; m++) {
        Paths64 AA; AA.push_back(a);
        Paths64 BB; BB.push_back(b);
        Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);

        double S = Area(solution);
        printf("S = %e\n", double(S)*(1<<m));

        for(size_t i=0; i<a.size(); i++) a[i].x /= 2;
        for(size_t i=0; i<b.size(); i++) b[i].x /= 2;
    }

}

Now it returns

S = 2.015106e+29
S = 0.000000e+00

This indicates that at least one of these results is in error.

AngusJohnson commented 1 year ago

Or throws an exception if CHECK_OVERFLOW is defined.

Strange. I wasn't getting overflow errors before but now I am? Work ... more work 😉.

Edit: I only started getting errors when I changed M's type to int64_t :
const int64_t M = 10000000; Evidently MSVC's long int is only 4 bytes.
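Editor's sketch: on 64-bit Windows, long is 32 bits wide, so M*M with a long M overflows before the value ever reaches Clipper2. Using a fixed-width type makes the test behave the same on every platform. SquareOfM is an illustrative name only.

```cpp
#include <cstdint>

static_assert(sizeof(std::int64_t) == 8, "int64_t is 64 bits on every platform");

// M * M = 1e14 needs more than 32 bits, so M must be a 64-bit type.
inline std::int64_t SquareOfM() {
    const std::int64_t M = 10000000;  // 1e7
    return M * M;                     // 1e14, computed in 64-bit arithmetic
}
```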

AngusJohnson commented 1 year ago

Clipper2Lib_Test2.zip 🤞

bahvalo commented 1 year ago

Ok. Next. The following code generates an FPE.

#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fenv.h>
#include <glob.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    feenableexcept( FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW );

    Path64 a,b;
    ADD(a,5873531643786437LL,-10907856004334190LL);
    ADD(a,-572063233247808512LL,-10907856357909624LL);
    ADD(a,-861031614229043072LL,-295680892812377088LL);
    ADD(a,-1149999997847394560LL,-580453927885307520LL);
    ADD(a,-861031615255601024LL,-865226963295435776LL);
    ADD(a,-572063233981697600LL,-1149999998685980416LL);
    ADD(a,5873531569489655LL,-1149999999013396096LL);
    ADD(a,583810297809434496LL,-1150000000000004736LL);
    ADD(a,872778682431101440LL,-865226965266477056LL);
    ADD(a,1149999999999995904LL,-556893216519906624LL);
    ADD(a,855158084855340160LL,-260339823793244128LL);
    ADD(a,572063232808458240LL,12652856321515572LL);
    ADD(b,5873531342419534LL,1128184286593806848LL);
    ADD(b,-572063234332976384LL,1128184286913500800LL);
    ADD(b,-861031618484528512LL,843411252051976192LL);
    ADD(b,-1150000000000004352LL,558638215679646656LL);
    ADD(b,-861031617158135040LL,273865179743011296LL);
    ADD(b,-572063233247800064LL,-10907856357909624LL);
    ADD(b,5873531643786437LL,-10907856004334190LL);
    ADD(b,572063232808458240LL,12652856321515572LL);
    ADD(b,855158081926256640LL,309206248762144256LL);
    ADD(b,1110168977794405120LL,604014641445566208LL);
    ADD(b,813032149122702592LL,876134821681722496LL);
    ADD(b,543979277405183232LL,1149999999999995136LL);

    Paths64 AA; AA.push_back(a);
    Paths64 BB; BB.push_back(b);
    Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);
}
AngusJohnson commented 1 year ago

OK, bed time now, so tomorrow. Thanks for your patience and very helpful feedback.

AngusJohnson commented 1 year ago
#define CHECK_OVERFLOW  //define only when debugging

#ifdef CHECK_OVERFLOW

    static const char* overflow_error = "overflow error.";

    inline void CheckAdd(double val1, double val2)
    {
        // LLONG_MAX rounds up to 2^63 as a double, so 2^63 itself must be rejected
        if (val1 + val2 >= LLONG_MAX) throw overflow_error;
        if (val1 + val2 < LLONG_MIN) throw overflow_error;
    }

    inline void CheckAdd(int64_t val1, double val2) 
    {
        CheckAdd(static_cast<double>(val1), val2);
    }

    inline void CheckMul(double val1, double val2)
    {
        if (val1 == 0 || val2 == 0) return;
        const double v1 = std::fabs(val1);
        const double v2 = std::fabs(val2);
        // v1 and v2 are magnitudes, so a single upper-bound test suffices
        if (v1 > LLONG_MAX / v2) throw overflow_error;
    }

    inline void CheckMul(int64_t val1, double val2)
    {
        CheckMul(static_cast<double>(val1), val2);
    }
#endif

    bool GetIntersectPoint(const Active& e1, const Active& e2, Point64& ip)
    {
        // precondition: neither edge is horizontal
        //assert(!IsHorizontal(e1) && !IsHorizontal(e2));

        double abs_dx1 = std::fabs(e1.dx);
        double abs_dx2 = std::fabs(e2.dx);
        if (abs_dx1 < 1e-5)
        {
            if (abs_dx2 < 1e-5) return false; // parallel
            double b2 = e2.bot.y * e2.dx - e2.bot.x;
            ip.x = e1.bot.x;
#ifdef CHECK_OVERFLOW
            CheckMul(e1.bot.x + b2, 1 / e2.dx);
#endif
            ip.y = static_cast<int64_t>(std::round((e1.bot.x + b2) / e2.dx));
            return true;
        }
        else if (abs_dx2 < 1e-5)
        {
            double b1 = e1.bot.y * e1.dx - e1.bot.x;
            ip.x = e2.bot.x;
#ifdef CHECK_OVERFLOW
            CheckMul(e2.bot.x + b1, 1/e1.dx);
#endif
            ip.y = static_cast<int64_t>(std::round((e2.bot.x + b1) / e1.dx));
            return true;
        }       
        else if (abs_dx1 > 1e12)
        {
            ip.y = (e1.bot.y + e1.top.y) / 2;
            double b2 = e2.top.y * e2.dx - e2.top.x;
#ifdef CHECK_OVERFLOW
            CheckAdd(ip.y * e2.dx, -b2);
#endif
            ip.x = static_cast<int64_t>(ip.y * e2.dx - b2);
            return true;
        }
        else if (abs_dx2 > 1e12)
        {
            ip.y = (e2.bot.y + e2.top.y) / 2;
            double b1 = e1.top.y * e1.dx - e1.top.x;
#ifdef CHECK_OVERFLOW
            CheckAdd(ip.y * e1.dx, -b1);
#endif
            ip.x = static_cast<int64_t>(ip.y * e1.dx - b1);
            return true;
        }

        double q = (e1.dx - e2.dx);
        double  abs_q = std::fabs(q);
        if (abs_q < 1e-5) return false; //parallel

        double b1 = e1.bot.x - e1.bot.y * e1.dx;
        double b2 = e2.bot.x - e2.bot.y * e2.dx;

        if (abs_q < std::min(abs_dx1, abs_dx2))
        {
            // edges are closer to horizontal so 
            // greater accuracy to calc ip.x first
#ifdef CHECK_OVERFLOW
            CheckMul(b2 * e1.dx - b1 * e2.dx, 1/q);
#endif
            double x = (b2 * e1.dx - b1 * e2.dx) / q;
            ip.x = static_cast<int64_t>(x);
#ifdef CHECK_OVERFLOW
            CheckMul(x - b1, 1 / e1.dx);
#endif
            ip.y = static_cast<int64_t>((x - b1) / e1.dx);
        }
        else
        {
            // edges are closer to vertical so 
            // greater accuracy to calc ip.y first
#ifdef CHECK_OVERFLOW
            CheckMul(b2 - b1, 1/q);
#endif
            double y = (b2 - b1) / q;
            ip.y = static_cast<int64_t>(y);
            if (abs_dx1 < abs_dx2)
            {
#ifdef CHECK_OVERFLOW
                CheckAdd(e1.dx * y, b1);
#endif
                ip.x = static_cast<int64_t>(e1.dx * y + b1);
            }
            else
            {
#ifdef CHECK_OVERFLOW
                CheckAdd(e2.dx * y, b2);
#endif
                ip.x = static_cast<int64_t>(e2.dx * y + b2);
            }
        }
        return true;
    }

Note: this function can probably be further optimised and tidied (apart from removing all the overflow checks), but I'm hopeful it now covers all contingencies.

alexisnaveros commented 1 year ago

Alternatively, I think you could check the double value just prior to the conversion to int64_t (which is when the SIGFPE is raised). if( fabs( my_double ) >= (double)INT64_MAX ) then_we_have_no_intersection;

Or perhaps just clamp with my_double = fmax( fmin( my_double, (double)INT64_MAX ), (double)INT64_MIN ); prior to int64_t conversion (fmax/fmin are converted to a single instruction, it's branch-free).
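
One caveat on the clamp: INT64_MAX (2^63 - 1) is not exactly representable as a double, so (double)INT64_MAX rounds up to 2^63, which is one past the largest int64_t. A value clamped to that bound can still overflow on conversion. A minimal sketch of a saturating conversion that avoids this, assuming a hypothetical helper named SaturatingCastInt64 (not part of Clipper2) and an arbitrary choice of 0 for NaN:

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper for illustration; not part of Clipper2.
// 2^63 as a double. (double)INT64_MAX rounds up to exactly this value,
// which is already outside the int64_t range, so it must be excluded.
constexpr double kTwoPow63 = 9223372036854775808.0;

inline int64_t SaturatingCastInt64(double val)
{
    if (std::isnan(val)) return 0; // arbitrary choice for NaN (assumption)
    if (val >= kTwoPow63) return std::numeric_limits<int64_t>::max();
    if (val < -kTwoPow63) return std::numeric_limits<int64_t>::min();
    return static_cast<int64_t>(val); // in range here, truncates toward zero
}
```

Whether saturating is the right behaviour for an intersection point is a separate question; reporting "no intersection" instead, as in the first suggestion, only needs the same two bounds.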

AngusJohnson commented 1 year ago

> Alternatively, I think you could check the double value just prior to the conversion to int64_t (

I'm hoping that won't be necessary (except while debugging).

AngusJohnson commented 1 year ago
#define CHECK_OVERFLOW  //define only when debugging

#ifdef CHECK_OVERFLOW

    static const char* overflow_error = "overflow error.";

    inline int64_t CheckCastInt64(double val)
    {
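        // LLONG_MAX rounds up to 2^63 when converted to double, so use >= to reject 2^63 itself
        if (val >= static_cast<double>(LLONG_MAX)) throw overflow_error;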
        else if (val < LLONG_MIN) throw overflow_error;
        else return static_cast<int64_t>(val);
    }
#else
    inline int64_t CheckCastInt64(double val)
    {
        return static_cast<int64_t>(val);
    }
#endif

    bool GetIntersectPoint(const Active& e1, const Active& e2, Point64& ip)
    {
        // precondition: neither edge is horizontal
        //assert(!IsHorizontal(e1) && !IsHorizontal(e2));
        double abs_dx1 = std::fabs(e1.dx);
        double abs_dx2 = std::fabs(e2.dx);

        if (abs_dx1 < 1e-5)
        {
            if (abs_dx2 < 1e-5) return false; // parallel edges
            double b2 = e2.bot.y * e2.dx - e2.bot.x;
            ip.x = e1.curr_x;
            ip.y = CheckCastInt64(std::round((e1.curr_x + b2) / e2.dx));
            return true;
        }
        else if (abs_dx2 < 1e-5)
        {
            double b1 = e1.bot.y * e1.dx - e1.bot.x;
            ip.x = e2.curr_x;
            ip.y = CheckCastInt64(std::round((e2.curr_x + b1) / e1.dx));
            return true;
        }       

        double q = (e1.dx - e2.dx);
        if (std::fabs(q) < 1e-5) return false; //parallel

        double b1 = e1.bot.x - e1.bot.y * e1.dx;
        double b2 = e2.bot.x - e2.bot.y * e2.dx;

        if (std::min(abs_dx1, abs_dx2) > 1)
        {
            // both edges are closer to horizontal so 
            // it's better to calc ip.x first
            double x = (b2 * e1.dx - b1 * e2.dx) / q;
            ip.x = CheckCastInt64(x);
            if (abs_dx1 > abs_dx2)
                ip.y = CheckCastInt64((x - b1) / e1.dx);
            else
                ip.y = CheckCastInt64((x - b2) / e2.dx);
        }
        else
        {
            double y = (b2 - b1) / q;
            ip.y = CheckCastInt64(y);
            if (abs_dx1 < abs_dx2) //one or other dx <= 1
                ip.x = CheckCastInt64(e1.dx * y + b1);
            else
                ip.x = CheckCastInt64(e2.dx * y + b2);
        }
        return true;
    }
rs0xFFFF commented 1 year ago

Time for _UI128_MAX processors ;-)

AngusJohnson commented 1 year ago

> Time for _UI128_MAX processors ;-)

Somewhat surprisingly, even with overflow checking enabled, the performance cost of this extra code is negligible.

bahvalo commented 1 year ago

Now this throws an exception.

#include <stdio.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    Path64 a,b;
    ADD(a,809023171470172800LL,-874758197839316864LL);
    ADD(a,1114348780979194752LL,-579903279137305344LL);
    ADD(a,862499999997369344LL,-275241802164393664LL);
    ADD(a,574999999999326528LL,9806558268645062LL);
    ADD(a,366186LL,9806558269589536LL);
    ADD(a,-574999999999010880LL,9806558269994310LL);
    ADD(a,-862500000000128384LL,-275241802162350528LL);
    ADD(a,-1150000000000000000LL,-560290162594830272LL);
    ADD(a,-862500000001121664LL,-845338523027656960LL);
    ADD(a,-574999999999318144LL,-1130386883458883840LL);
    ADD(a,896526LL,-1130386883458421248LL);
    ADD(a,539348780981543424LL,-1149999999999990400LL);
    ADD(b,862500000000239872LL,294854918700700736LL);
    ADD(b,1150000000000000000LL,579903279134337024LL);
    ADD(b,862500000001376384LL,864951639568146688LL);
    ADD(b,575000000000395648LL,1150000000000009728LL);
    ADD(b,833390LL,1149999999999219456LL);
    ADD(b,-574999999998817280LL,1149999999998891776LL);
    ADD(b,-862499999997445120LL,864951639566450560LL);
    ADD(b,-1149999999997548288LL,579903279134568320LL);
    ADD(b,-862499999997262080LL,294854918702743936LL);
    ADD(b,-574999999999010880LL,9806558269994310LL);
    ADD(b,366186LL,9806558269589536LL);
    ADD(b,574999999999322368LL,9806558268645062LL);

    Paths64 AA; AA.push_back(a);
    Paths64 BB; BB.push_back(b);
    Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);
    double S = Area(solution);
    printf("S = %e\n", double(S));
}
rs0xFFFF commented 1 year ago

> Somewhat surprisingly, even with overflow checking enabled, the performance cost of this extra code is negligible.

That would be the disadvantage, because 128-bit arithmetic is emulated on 64-bit processors. And in Delphi there is no 128-bit type (yet).
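
To make the 128-bit idea concrete, here is a minimal sketch of an exact orientation test, assuming the GCC/Clang `__int128` extension (not standard C++, not available in MSVC) and coordinates within roughly +/-4.6e18. The names Pt and CrossProductSign are hypothetical and are not Clipper2 code.

```cpp
#include <cstdint>

// Hypothetical illustration; not part of Clipper2.
struct Pt { int64_t x, y; };

// Sign of the cross product (b - a) x (c - a), computed without rounding.
// With |coordinate| < 2^62 each difference is below 2^63 in magnitude and
// each product below 2^126, so the result fits in a signed 128-bit integer.
inline int CrossProductSign(const Pt& a, const Pt& b, const Pt& c)
{
    using Wide = __int128; // compiler extension (assumption: GCC or Clang)
    const Wide abx = static_cast<Wide>(b.x) - a.x;
    const Wide aby = static_cast<Wide>(b.y) - a.y;
    const Wide acx = static_cast<Wide>(c.x) - a.x;
    const Wide acy = static_cast<Wide>(c.y) - a.y;
    const Wide cross = abx * acy - aby * acx;
    return (cross > 0) - (cross < 0); // -1, 0 or +1
}
```

On a 64-bit CPU each wide multiplication is lowered to several native instructions, which is the emulation cost mentioned above.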

AngusJohnson commented 1 year ago
    bool GetIntersectPoint(const Active& e1, const Active& e2, Point64& ip)
    {
        // precondition: neither edge is horizontal
        //assert(!IsHorizontal(e1) && !IsHorizontal(e2));
        static const double parallel_tolerance = 1.0e-5;
        static const double vertical_tolerance = 1.0e-5;

        double abs_dx1 = std::fabs(e1.dx);
        double abs_dx2 = std::fabs(e2.dx);

        if (abs_dx1 < vertical_tolerance)
        {
            if (abs_dx2 < parallel_tolerance) return false;
            double b2 = e2.bot.y * e2.dx - e2.bot.x;
            ip.x = e1.curr_x;
            ip.y = CheckCastInt64(std::round((e1.curr_x + b2) / e2.dx));
            return true;
        }
        else if (abs_dx2 < vertical_tolerance)
        {
            double b1 = e1.bot.y * e1.dx - e1.bot.x;
            ip.x = e2.curr_x;
            ip.y = CheckCastInt64(std::round((e2.curr_x + b1) / e1.dx));
            return true;
        }       

        double q = (e1.dx - e2.dx);
        double abs_q = std::fabs(q);

        double b1 = e1.bot.x - e1.bot.y * e1.dx;
        double b2 = e2.bot.x - e2.bot.y * e2.dx;
        double min_dx = std::min(abs_dx1, abs_dx2);

        if (min_dx > 1) // better to calc ip.x before ip.y
        {
            if (abs_q < parallel_tolerance * min_dx) return false;

            double x = (b2 * e1.dx - b1 * e2.dx) / q;
            ip.x = CheckCastInt64(x);
            if (abs_dx1 > abs_dx2)
                ip.y = CheckCastInt64((x - b1) / e1.dx);
            else
                ip.y = CheckCastInt64((x - b2) / e2.dx);
        }
        else
        {
            if (abs_q < parallel_tolerance) return false;

            double y = (b2 - b1) / q;
            ip.y = CheckCastInt64(y);
            if (abs_dx1 < abs_dx2) //one or other dx <= 1
                ip.x = CheckCastInt64(e1.dx * y + b1);
            else
                ip.x = CheckCastInt64(e2.dx * y + b2);
        }
        return true;
    }
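
For readers following the algebra in the function above, the general (non-vertical) case can be worked through as below. This reads e.dx as the change in x per unit y, which is what the expressions for b1 and b2 imply; the subscripted symbols are shorthand for e1.dx, e2.dx, b1, b2 and q in the code.

```latex
% Each non-horizontal edge i is the line x = b_i + dx_i y,
% with b_i = bot_i.x - bot_i.y * dx_i and q = dx_1 - dx_2.
\[
\begin{aligned}
b_1 + dx_1\, y &= b_2 + dx_2\, y \\
y &= \frac{b_2 - b_1}{dx_1 - dx_2} = \frac{b_2 - b_1}{q} \\
x &= b_1 + dx_1\, y = \frac{b_2\, dx_1 - b_1\, dx_2}{q} \\
y &= \frac{x - b_1}{dx_1} = \frac{x - b_2}{dx_2}
\end{aligned}
\]
```

The second line matches the `y = (b2 - b1) / q` branch, and the third and fourth lines match the branch that computes x first and then recovers y by dividing by whichever dx is larger in magnitude.
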
bahvalo commented 1 year ago

I can't compile the code, CheckCastInt64 undefined

AngusJohnson commented 1 year ago
#define CHECK_OVERFLOW  //define only when debugging

inline int64_t CheckCastInt64(double val)
{
#ifdef CHECK_OVERFLOW
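    // (double)LLONG_MAX rounds up to 2^63, which is already out of range, hence >=
    if (val >= static_cast<double>(LLONG_MAX) || val < static_cast<double>(LLONG_MIN)) throw "overflow error.";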
#endif
    return static_cast<int64_t>(val);
}
bahvalo commented 1 year ago

Ok. Now the inverse situation with almost overlapping polygons.

#include <stdio.h>
#include "clipper2/clipper.h"
using namespace Clipper2Lib;

#define ADD(A,X,Y) A.push_back(Point64(int64_t(X), int64_t(Y)))

int main(int, char**) {
    Path64 a,b;

    ADD(a,890236108478911616LL,-600678594654510336LL);
    ADD(a,1149040936202350208LL,-665484099150913LL);
    ADD(a,841249898247169152LL,586996692971884928LL);
    ADD(a,566116333779815360LL,1149957003074273664LL);
    ADD(a,15849204845108742LL,1125255136105704320LL);
    ADD(a,-567075397577446400LL,1125255136105665408LL);
    ADD(a,-858537698788722944LL,549943892518953088LL);
    ADD(a,-1150000000000000512LL,-25367351067798084LL);
    ADD(a,-858537698788722944LL,-600678594654510336LL);
    ADD(a,-646239222076756480LL,-1149999999999980544LL);
    ADD(a,-102896531903854784LL,-1137005080879301120LL);
    ADD(a,519609982768333504LL,-1149999999999980544LL);
    ADD(b,891197804795068672LL,-600633750840939648LL);
    ADD(b,1149999999999999360LL,-619165597971545LL);
    ADD(b,842206383722243840LL,587041257642480640LL);
    ADD(b,567070349309609024LL,1150000000000019456LL);
    ADD(b,16803328759483116LL,1125294997556221952LL);
    ADD(b,-566121273655786048LL,1125291675995182336LL);
    ADD(b,-857581050725848192LL,549978771635127808LL);
    ADD(b,-1149040827795910528LL,-25334132724926740LL);
    ADD(b,-857576002450706816LL,-600643715523980544LL);
    ADD(b,-645275115632407808LL,-1149963911165188608LL);
    ADD(b,-101932482480590576LL,-1136965896025245440LL);
    ADD(b,520574089198109120LL,-1149957268043187200LL);

    for(int m=0; m<=20; m++) {
        Paths64 AA; AA.push_back(a);
        Paths64 BB; BB.push_back(b);
        Paths64 solution = Intersect(Paths64(AA), Paths64(BB), FillRule::NonZero);

        const double M = pow(4.,m);
        printf("S = %e, SA-S = %e, SB-S = %e\n", Area(solution)*M, (Area(a)-Area(solution))*M, (Area(b)-Area(solution))*M);

        for(size_t i=0; i<a.size(); i++) a[i].x /= 2;
        for(size_t i=0; i<b.size(); i++) b[i].x /= 2;
        for(size_t i=0; i<a.size(); i++) a[i].y /= 2;
        for(size_t i=0; i<b.size(); i++) b[i].y /= 2;
    }
}

I get S = 3.942184e36 for some m and S = 3.941915e36 for other m. The latter result is correct.

AngusJohnson commented 1 year ago

That one looks OK to me. (The relatively small variations I would attribute to rounding.)

bahvalo commented 1 year ago

You may replace Intersect by Difference. Then you'll get 1.95e33 and 2.22e33 (10% error).

AngusJohnson commented 1 year ago

Yes, but given that these non-overlapping regions are extremely long and thin, that's not unexpected.

bahvalo commented 1 year ago

Do I understand correctly that Clipper is not intended to give an accurate value of the intersection area (say, to an accuracy of 1e24)?

For my purpose, it would be better to get more accurate results at the expense of speed (say, by using 128-bit arithmetic). But I understand that different applications may have different criteria.

In any case, thank you for your library. Even if you consider an accuracy of 2e33 to be tolerable, your library is helpful to me.

alexisnaveros commented 1 year ago

Note that computing the Area() with the current code is extremely unreliable, as you may be accumulating a ton of values with wildly different magnitudes, and signs. Though that error should be small in the scenarios above just because your lists of points are short.

If you want an accurate Area(), you are going to need a different approach. The same algorithm could be kept by switching to Shewchuk summation, for example.
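
As a rough illustration of that kind of summation, here is a minimal sketch that adds a list of doubles by keeping a set of non-overlapping partial sums, so that no low-order bits are lost during accumulation. ExactSum is a hypothetical name, not a Clipper2 function, and the final pass simply adds the partials in order rather than applying a correctly rounded last step. It removes accumulation error only; each shoelace product still rounds when it is formed in double.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of Clipper2.
// Do not compile with -ffast-math: it invalidates the error-free steps.
inline double ExactSum(const std::vector<double>& values)
{
    std::vector<double> partials; // non-overlapping partial sums
    for (double x : values)
    {
        std::size_t kept = 0;
        for (std::size_t i = 0; i < partials.size(); ++i)
        {
            const double y = partials[i];
            // Error-free TwoSum: hi + lo equals x + y exactly
            const double hi = x + y;
            const double bv = hi - x;
            const double lo = (x - (hi - bv)) + (y - bv);
            if (lo != 0.0) partials[kept++] = lo; // keep the rounding error
            x = hi;
        }
        partials.resize(kept);
        partials.push_back(x);
    }
    double total = 0.0;
    for (double p : partials) total += p; // smallest magnitudes first
    return total;
}
```

For example, summing 1e100, 1.0 and -1e100 from left to right in plain double arithmetic loses the 1.0, whereas the partial-sum approach retains it.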

alexisnaveros commented 1 year ago

Little follow-up, I quickly passed your input through my version which does Shewchuk summation and the Area() difference is negligible. The whole difference comes from the list of solution vertices (I guess we already knew that, it's now confirmed).

alexisnaveros commented 1 year ago

In case it helps, I traced the exact origin of the discrepancy (I get ridiculously verbose debugging by turning on a #define).

When the result is "correct" (area S = 3.94191491330415002259e+36), then there's an intersection point -1105741767893338 2197763937706377 that flies unchanged through AddNewIntersectNode(), because top_y is also 2197763937706377, matching pt.y

When the result is "bad" (area S = 3.94218403115878237993e+36), then there's an intersection point -2211483535786677 4395527875412754 that's being modified in AddNewIntersectNode(), more precisely because top_y is 4395527875412755 there (entering else if (pt.y < top_y)), and execution goes through the else if (e2.top.y == top_y) branch, which makes that intersection point snap to the vertex #5 of input contour AA.

Adding some missing nearbyint()/round() calls in GetIntersectPoint() helps a little bit; there are more "correct" areas in the list of 21 tests, though that's not the complete answer.
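
The difference those calls make can be seen in isolation. A plain static_cast to int64_t truncates toward zero, so a computed coordinate can land one unit away from the nearest integer, which may be enough to change which branch a comparison such as pt.y < top_y takes. A minimal sketch with hypothetical helper names:

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helpers for illustration; not part of Clipper2.
// Both assume the value is already known to be within the int64_t range.
inline int64_t TruncateToInt64(double v)
{
    return static_cast<int64_t>(v); // drops the fractional part
}

inline int64_t RoundToInt64(double v)
{
    return static_cast<int64_t>(std::round(v)); // nearest, halves away from zero
}
```

For 2.7 the first returns 2 and the second 3; for -2.7 they return -2 and -3.
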