elki-project / elki

ELKI Data Mining Toolkit
https://elki-project.github.io/
GNU Affero General Public License v3.0
780 stars 321 forks source link

Geodetic mindist #29

Closed ghost closed 7 years ago

ghost commented 7 years ago

Hello,

i was reading your paper "Geodetic Distance Queries on R-Trees for Indexing Geographic Data" and i wasn´t sure what did you mean by all those subscript 360 functions, so i checked your code for that defined here https://github.com/elki-project/elki/blob/e8f3c6fdf54e1e0aa8444b94ad5374ad518dfc0c/elki-core-math/src/main/java/de/lmu/ifi/dbs/elki/math/geodesy/SphereUtil.java, it doesn´t follow the pseudocode. For instance lines 806-813

// Determine whether going east or west is shorter.
    double lngE = rminlng - plng;
    if(lngE < 0) {
      lngE += MathUtil.TWOPI;
    }
    double lngW = rmaxlng - plng; // we keep this negative!
    if(lngW > 0) {
      lngW -= MathUtil.TWOPI;
}

where i think you meant to do

//mod360(rminlng-plng) <= mod360(plng-rmaxlng)
// Determine whether going east or west is shorter.
        double lngW = fmod(rminlng - plng, 2 * M_PI);
        if (lngW < 0)
        {
            lngW += 2 * M_PI;
        }
        double lngE = fmod(plng - rmaxlng, 2 * M_PI);
        if (lngE < 0)
        {
            lngE += 2 * M_PI;
}

Also my theory is that 360 subscript means we are working in radians, not degrees. In fact when i followed your pseudocode in paper exactly it is working for me in C++, but i am still not sure what you meant by sentence "In order to distinguish the other cases, we first need to test whether we are on the left or on the right side by rotating the mean longitude of the rectangle by 180◦ – the meridian opposite of the rectangle." . Do i understand correctly, that you wanted to do modulo 360(2 PI) to normalize the angle to range 0-360 [0..2 PI]?

Also i have found out,that the haversine formula should use absolute difference of latitudes and longitudes. Correct me if i am wrong,but if you dont use absolute values there,you may get negative distances. I attach my version of the algorithm including the Haversine formula

static const double wgs84_radius_m = 6378137;
static const double wgs84_flattening = 1.0 / 298.257223563;
static const double earth_radius_m = wgs84_radius_m * (1 - wgs84_flattening);
static const double earth_radius_km = earth_radius_m / 1000.0;
 struct WGS84Mindist
{
    //input is in radians
    //returns distance in kilometers
    static double haversineFormulaRad(double lat1, double lon1, double lat2, double lon2)
    {
        double d_lat = abs(lat1 - lat2);
        double d_lon = abs(lon1 - lon2);

        double a = pow(sin(d_lat / 2), 2) + cos(lat1) * cos(lat2) * pow(sin(d_lon / 2), 2);

        //double d_sigma = 2 * atan2(sqrt(a), sqrt(1 - a));
        double d_sigma = 2 * asin(sqrt(a));

        return earth_radius_km * d_sigma;
    }

    //input is in radians
    //returns angular distance on unit sphere
    static double haversineFormulaRadAngular(double lat1, double lon1, double lat2, double lon2)
    {
        double d_lat = abs(lat1 - lat2);
        double d_lon = abs(lon1 - lon2);

        double a = pow(sin(d_lat / 2), 2) + cos(lat1) * cos(lat2) * pow(sin(d_lon / 2), 2);

        //double d_sigma = 2 * atan2(sqrt(a), sqrt(1 - a));
        double d_sigma = 2 * asin(sqrt(a));

        return d_sigma;
    }

    //input is in radians
    //returns angle in radians
    static double getBearingRad(double lat1, double lon1, double lat2, double lon2)
    {
        double dLon = lon2 - lon1;
        double y = sin(dLon) * cos(lat2);
        double x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dLon);
        double radiansBearing = atan2(y, x);

        return radiansBearing;
    }

    //start, end, point - coordinates in radians
    static double getCrossTrackDistanceRad(double lat1, double lon1, double lat2, double lon2, double lat3, double lon3)
    {
        double angDist1Q = haversineFormulaRadAngular(lat1, lon1, lat3, lon3);

        double cos_lat1 = cos(lat1);
        double sin_lat1 = sin(lat1);
        double cos_lat3 = cos(lat3);
        double cos_lat2 = cos(lat2);

        //double theta1Q = getBearing(lat1, lon1, lat3, lon3);
        // double dLon = lon3 - lon1;
        // double y = sin(dLon) * cos(lat3);
        // double x = cos(lat1) * sin(lat3) - sin(lat1) * cos(lat3) * cos(dLon);
        // double radiansBearing = atan2(y, x);
        double dLon1 = lon3 - lon1;
        double y1 = sin(dLon1) * cos_lat3;
        double x1 = cos_lat1 * sin(lat3) - sin_lat1 * cos_lat3 * cos(dLon1);
        double theta1Q = atan2(y1, x1);

        //double theta12 = getBearing(lat1, lon1, lat2, lon2);
        // double dLon = lon2 - lon1;
        // double y = sin(dLon) * cos(lat2);
        // double x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dLon);
        // double radiansBearing = atan2(y, x);
        double dLon2 = lon2 - lon1;
        double y2 = sin(dLon2) * cos_lat2;
        double x2 = cos_lat1 * sin(lat2) - sin_lat1 * cos_lat2 * cos(dLon2);
        double theta12 = atan2(y2, x2);

        return asin(sin(angDist1Q) * sin(theta1Q - theta12)) * earth_radius_km;
    }

    static double latlngMinDistDeg(double &plat, double &plng, double &rminlat, double &rminlng, double &rmaxlat, double &rmaxlng)
    {
        return latlngMinDistRad(GeoDistance::deg2rad(plat), GeoDistance::deg2rad(plng),
                                GeoDistance::deg2rad(rminlat), GeoDistance::deg2rad(rminlng),
                                GeoDistance::deg2rad(rmaxlat), GeoDistance::deg2rad(rmaxlng));
    }

    //returns distance in kilometers
    static double latlngMinDistRad(double plat, double plng, double rminlat, double rminlng, double rmaxlat, double rmaxlng)
    {
        // The simplest case is when the query point is in the same "slice":
        if (rminlng <= plng && plng <= rmaxlng)
        {
            if (plat < rminlat)
            {
                return (rminlat - plat) * earth_radius_km; //Sout of MBR
            }
            else if (plat > rmaxlat)
            {
                return (plat - rmaxlat) * earth_radius_km; //North of MBR
            }
            return 0.0; // INSIDE MBR
        }

        // Determine whether going east or west is shorter.
        double lngW = fmod(rminlng - plng, 2 * M_PI);
        if (lngW < 0)
        {
            lngW += 2 * M_PI;
        }
        double lngE = fmod(plng - rmaxlng, 2 * M_PI);
        if (lngE < 0)
        {
            lngE += 2 * M_PI;
        }

        if (lngW <= lngE)
        {
            // West of MBR
            double tau = tan(plat);

            if (lngW >= 0.5 * M_PI) // Large delta of longitude
            {
                if (tau <= tan((rmaxlat + rminlat) * 0.5) * cos(rminlng - plng))
                {
                    return haversineFormulaRad(plat, plng, rmaxlat, rminlng); //North-West
                }
                else
                {
                    return haversineFormulaRad(plat, plng, rminlat, rminlng); //South-West
                }
            }

            if (tau >= tan(rmaxlat) * cos(rminlng - plng))
            {
                return haversineFormulaRad(plat, plng, rmaxlat, rminlng); //North-West
            }

            if (tau <= tan(rminlat) * cos(rminlng - plng))
            {
                return haversineFormulaRad(plat, plng, rminlat, rminlng); //South-West
            }

            return abs(getCrossTrackDistanceRad(rminlat, rminlng, rmaxlat, rminlng, plat, plng)); // West
        }     
        else
        {
             // East of MBR
             double tau = tan(plat);

             if (lngE >= 0.5 * M_PI) // Large delta of longitude
             {
                 if (tau <= tan((rmaxlat + rminlat) * 0.5) * cos(rmaxlng - plng))
                 {
                    return haversineFormulaRad(plat, plng, rmaxlat, rmaxlng); //North-East
                 }
                 else
                 {
                    return haversineFormulaRad(plat, plng, rminlat, rmaxlng); //Sout-East
                 }
             }

             if (tau >= tan(rmaxlat) * cos(rmaxlng - plng))
             {
                 return haversineFormulaRad(plat, plng, rmaxlat, rmaxlng); //North-East
             }

             if (tau <= tan(rminlat) * cos(rmaxlng - plng))
             {
                 return haversineFormulaRad(plat, plng, rminlat, rmaxlng); //Sout-East
             }

             return abs(getCrossTrackDistanceRad(rmaxlat, rmaxlng, rminlat, rmaxlng, plat, plng)); // East
        }
    }
};

The last thing i dont understand why in pseudocode you are returning c (rminlat-plat)/360 for the N/S case when i think it should be c (rminlat-plat) , because you say in text we are calculating length of meridian arc, where c is radius of earth.

Thank you for your answer.

kno10 commented 7 years ago

Can you provide a test case? Much of the code could use better unit tests. But I have not noticed any negative distances. I have seen near-zero negative results in other distances, because of numerical instability. Given that we used the code to produce the figures (in particular the textured overlays), I would not expect there is an as obvious bug in there, it would cause larger artifacts.

From a quick look,

double lngW = rmaxlng - plng; // we keep this negative!
...
if(lngE <= -lngW) {

the comment indicates that this was a deliberate choice, not an accident, to use the negative difference here. The following if statement then compares to the negative value, so that looks consistent to me. But I do not remember why this seemed beneficial.

When the pseudocode says "mod360", this of course will usually be implemented as a modulo 2pi in radians etc. - pseudocode is about the idea, and mod360 = "handle the wrap-around here if necessary". Whether you use radians or degree is up to you; insert conversions as necessary. Rather than just trying to translate the pseudocode into code, I recommend to try to understand the logic. It's not very difficult; there are nine cases, and the paper illustrates this with figures. We could be inside the rectangle, north, south, closest to the four corners, or it might be best to choose the optimal bearing to "fly" to the east or west side of the rectangle. Much of the code which you are referring to are simply case distinctions whether it is faster to fly east, or to fly west.

Java does benefit from some careful optimization of functions that are in hot loops such as distance computations. Unfortunately, Java for example does not have a sincos function, and in Java, -1.5 % Math.PI == -1.5. This behavior makes sense too (and in fact a popular pitfall when using % in Java with negative numbers - see MathUtil.normAngle), but is likely different from what you were expecting. Your code indicates you did not understand how we implemented the modulo in Java (knowing that we won't see an angle like 12345, for which we would need to call normAngle, and the code does not support rectangles crossing the date line - some simple translation could be used to handle this.)

Haversine: sin(x)^2 = sin(-x)^2, isn't it? Where do you think it needs an absolute value?

So: do you know any input where it does not work right?

If I have time, I will try to revisit the negative-lngW decision, and there may be some low-hanging optimization potential with MathUtil.normAngle, too, to make ELKI faster.

ghost commented 7 years ago

Haversine: sin(x)^2 = sin(-x)^2, isn't it? Where do you think it needs an absolute value?

I got the formulas for Vincenty and Haversine from https://en.wikipedia.org/wiki/Great-circle_distance, but you are right, in the case of Haversine that equality holds so it doesn´t matter if i make absolute difference or normal difference, but in the case of Vincenty you need the absolute difference, maybe that´s why they put it in general to use absolute difference for both formulas.

Can you provide a test case? Much of the code could use better unit tests. But I have not noticed any negative distances. I have seen near-zero negative results in other distances, because of numerical instability. Given that we used the code to produce the figures (in particular the textured overlays), I would not expect there is an as obvious bug in there, it would cause larger artifacts.

I will try to write some simple tests for all the 9 cases of positions between point and MBR, but since you mentioned i should be trying to understand the point,that´s what i am trying to do and i would like to ask you if you be so kind and tell me if i understand it correctly, because i am still a little bit confused even if it seems so simple to you and tell me if the code i provided is correct. I have used the plate carree projection you mentioned in text and draw myself a rectangle which maps longitudes as X coordinates and latitudes as Y and i marked the points in the picture. untitled 1) I understand that here you can just compare the coordinates, because the north and south edges of rectangle are parallels which are perpendicular to meridians and the shortest distance is equal to the length of the meridian arc starting at point Q and ending at point lying on the northern or southern edge. The length of the meridian of the arc can be calculated as R (deltaLatitude), where R is the radius of Earth. This is what i have done in the first condition in my code. In pseudocode you use circumference of Earth which is defined if i am not wrong as c=2 PI R and in pseudocode the resulting distance is computed as c (deltaLatitude) / 360, but that is the same formula i used,because formula in text is (2 PI R deltaLatitude ) / 360 conversion to rad: 1 degree = PI / 180 means we multiply above by 180/PI [(2 PI R deltaLatitude) / 360 ] (180/PI) = (2 R deltalatitude) / 2 = R * deltaLatitude

// The simplest case is when the query point is in the same "slice":
        if (rminlng <= plng && plng <= rmaxlng)
        {
            if (plat < rminlat)
            {
                return (rminlat - plat) * earth_radius_km; //Sout of MBR
            }
            else if (plat > rmaxlat)
            {
                return (plat - rmaxlat) * earth_radius_km; //North of MBR
            }
            return 0.0; // INSIDE MBR
        }
  1. So first we need to determine if the angle difference to the west is lower than angle difference to the east,because all of the computations use Haversine formula which value is also dependent on the value of longitude difference

    // Determine whether going east or west is shorter.
        double lngW = fmod(rminlng - plng, 2 * M_PI);
        if (lngW < 0)
        {
            lngW += 2 * M_PI;
        }
        double lngE = fmod(plng - rmaxlng, 2 * M_PI);
        if (lngE < 0)
        {
            lngE += 2 * M_PI;
    }

    We determine the other cases based on the value of azimuth. Since we are using the north-based azimuth which has value 0 at prime meridian and the values are measured clockwise, for north-east corner and south-east corner of rectangle we need to compare the azimuth(point,reactangle) with 90 degrees and for the north-west and south-west corners we need to compare the azimuth(point,rectangle) with 270 degrees as depicted on the picture in radians, where the values are actually values of azimuths at rectangle corners. So if it lies in NE or SE, then the azimuth must be <=90 or >= 90 degrees respectively. If that is not that case, then we calculate the distance of point from great circle path(because path from NE corner to SE corner lies on great circle), so we use the formula for cross-track error. Similar situation for the opposite areas.

  2. For the optimised pseudocode you derived a formula tan(latQ) <= tan(latR) cos(lonR-lonQ) to determine if the point lies south from the great circle path going through one of the corners of rectangle and otherwise if tan(latQ) >= tan(latR) cos(lonR-lonQ) then the point lies north from the this path. This check is valid for cases where difference of longitude is less than 90 degrees.

There are 2 cases -> point can be in the middle of two meridians each crossing upper and lower point of rectangle. This is the case when longitude difference is larger than 90 degrees and we know that at this difference they intersect. So you compare to the line going through middle of MBR and determine which rectangle point is closest to this line,hence the checks. tan(latQ) <= tan(latminR+latmaxR/2) * cos(lonR-lonQ) Here i dont know why there is < = instead of >= as i think it should be. Is there a mistake in pseudocode here or am i somewhere wrong?

-> point is closer to the rectangle than 90 degrees of longitude,then we compare to the actuall lines

// West of MBR
            double tau = tan(plat);

            if (lngW >= 0.5 * M_PI) // Large delta of longitude
            {
                if (tau <= tan((rmaxlat + rminlat) * 0.5) * cos(rminlng - plng))
                {
                    return haversineFormulaRad(plat, plng, rmaxlat, rminlng); //North-West
                }
                else
                {
                    return haversineFormulaRad(plat, plng, rminlat, rminlng); //South-West
                }
            }

            if (tau >= tan(rmaxlat) * cos(rminlng - plng))
            {
                return haversineFormulaRad(plat, plng, rmaxlat, rminlng); //North-West
            }

            if (tau <= tan(rminlat) * cos(rminlng - plng))
            {
                return haversineFormulaRad(plat, plng, rminlat, rminlng); //South-West
            }

Otherwise it can be computed as cross-track distance,because in the projected rectangle point should lie at line crossing the middle of MBR. return abs(getCrossTrackDistanceRad(rminlat, rminlng, rmaxlat, rminlng, plat, plng)); // West

Do i understand correctly?

If I have time, I will try to revisit the negative-lngW decision, and there may be some low-hanging optimization potential with MathUtil.normAngle, too, to make ELKI faster.

I have checked your code again,by doing that strange comparison when you end up here

// East, to min edge:
    if(lngE <= -lngW) {
      final double clngD = FastMath.cos(lngE);
      final double tlatQ = FastMath.tan(plat);
      if(lngE > MathUtil.HALFPI) {
        final double tlatm = FastMath.tan((rmaxlat + rminlat) * .5);
        if(tlatQ >= tlatm * clngD) {
          return haversineFormulaRad(plat, plng, rmaxlat, rminlng);
        }
        else {
          return haversineFormulaRad(plat, plng, rminlat, rminlng);
        }
}

you are actually west to the MBR,not east as in comment, because in formulas you compare to correct corners, ie. (rmaxlat,rminlong) is upper left corner and (rminlat,rminlng) is lower left corner,you can also check that with picture i provided. So that is OK,because you compare to correct corners here too

else { // West, to max edge:
      final double clngD = FastMath.cos(lngW);
      final double tlatQ = FastMath.tan(plat);
      if(-lngW > MathUtil.HALFPI) {
        final double tlatm = FastMath.tan((rmaxlat + rminlat) * .5);
        if(tlatQ >= tlatm * clngD) {
          return haversineFormulaRad(plat, plng, rmaxlat, rmaxlng);
        }
        else {
          return haversineFormulaRad(plat, plng, rminlat, rmaxlng);
        }
}

Last thing i don´t understand that you have in your code a nice formula to compute cross-track distance called crossTrackDistanceRad,yet you dont use it and use some custom formula instead

 // Cross-track-distance to longitude line.
      final double slngD = MathUtil.cosToSin(lngW, clngD);
return Math.asin(-FastMath.cos(plat) * slngD);
kno10 commented 7 years ago

Sorry, I don't have time to review your code. :-( Use tests and visualization for validation.

Based on your figure, it seems your longitude is 0 to 2pi and increases westwards? I'd use -pi to +pi, and eastwards increasing, which seems to be the usual coordinate system in WGS98 etc. - or what do you mean by the red 3/2 pi in your plot? So maybe your east-west coordinate system is reversed? Also a reason to not use fmod(x, 2*PI), because it "breaks" the usual convention of using -180:180~-pi:pi for coordinates!

Avoid using the plate carree view; it is misleading. The arrows in your figure are not proper: the shortest path from A to B with lat(A)=lat(B) usually is not to fly along the circle of latitude. As a rule of thumb, the only circle of latitude that you should be using is the equator, avoid any other "horizonal" line. Figures 4b and 6 is how the regions look like in plate caree; and Fig 4c is what it looks like on a sphere. And don't forget the distinction at the opposite side of the earth, too (where the orange line is maximally far from the equator)! Also try to reproduce Figure 7a; and note that it is not symmetric at +-180° if your rectangle isn't centered, to test your code.

Avoid recomputing sin or cos of the same values for performance. if possible, use sincos to compute both at once (with C, this is "buy one, get one free"). If you manually inline the crossTrack function and optimize away duplicate computations (the Java compiler can't do this, and I'm not sure if your C compiler would be able to recognize that cos(x) can be reused if x has not changed!). Trigonometric computations are quite expensive (200-250 clock ticks each), so you want to use them only as necessary. Otherwise, you could just use min(distance-to-corners, distance-to-sides) if you don't care about speed anyway.

ghost commented 7 years ago

Based on your figure, it seems your longitude is 0 to 2pi and increases westwards? I'd use -pi to +pi, and eastwards increasing, which seems to be the usual coordinate system in WGS98 etc. - or what do you mean by the red 3/2 pi in your plot? So maybe your east-west coordinate system is reversed?

The lines depict the values of north-based azimuth from corners of rectangle to points with same latitude.

untitled You will help me,if you tell me if your pseudocode in text is correct. Shouldn´t the signs when determining the north/south corner from the middle of MBR be the same as when comparing to the line?

ghost commented 7 years ago

So okay, you say the angles after mod360 should be in range -pi,+pi, what do you think about the solution presented here http://stackoverflow.com/questions/11498169/dealing-with-angle-wrap-in-

Normalize to [-180,180):

double constrainAngle(double x){
    x = fmod(x + 180,360);
    if (x < 0)
        x += 360;
    return x - 180;
}
kno10 commented 7 years ago

Sorry, I am really too busy to double-check right now. Use tests, these cases can be verify by such tests easily (distance to corner, that is an easy case).

I am not a fan of copy & paste... and the ELKI code already does this similar (but with radians, without modulo).

ghost commented 7 years ago

Hello, i have some trouble at start with how to imagine the cases on Earth,but then i found a great picture where i could see where the points actually are and i made a few test cases. mindist_test Based on the test cases i was able to find mistake in the pseudocode from the text and i provide my code which works now. The sign was inconsistent with the formula you derived which says that point lies to north if

tan(latQ) >= tan(latR) * cos(lonR-lonQ)

Only thing that i have actually modified from the previous code i had was changing both the conditions for North-West and North-East case from

 if (tau <= tan((rmaxlat + rminlat) * 0.5) * cos(rminlng - plng)) //North-West
...
 if (tau <= tan((rmaxlat + rminlat) * 0.5) * cos(rmaxlng - plng)) //North-East
...

to

 if (tau >= tan((rmaxlat + rminlat) * 0.5) * cos(rminlng - plng)) //North-West
...
if (tau >= tan((rmaxlat + rminlat) * 0.5) * cos(rmaxlng - plng)) //North-East
...

You said that you increment eastward, but you wrote yourself bad comments in the code.Based on the image i was able to find out why you do crazy things on these lines

double lngE = rminlng - plng;
    if(lngE < 0) {
      lngE += MathUtil.TWOPI;
}
 double lngW = rmaxlng - plng; // we keep this negative!
   if(lngW > 0) {
      lngW -= MathUtil.TWOPI;
}

where you should be actually doing

  double lngW=rminlng - plng ; //because rminlng is at west corner of rectangle and rminlng is further away from plng
    if(lngW < 0) {
      lngW += MathUtil.TWOPI;
}
  double lngE = plng - rmaxlng; //becase rmaxlng is east corner and plng is further away from rmaxlng
 if(lngE < 0) {
      lngE += MathUtil.TWOPI;
}

Since you have those directions exactly the opposite as they are,when you do check

 // East, to min edge:
if(lngE <= -lngW) {

you actually check this

 // West, to min edge:
if(lngW <= lngE) {
...
//because later you calculate distance from point to upper left corner
//so you are lucky you made two mistakes that made the code work
return haversineFormulaRad(plat, plng, rmaxlat, rminlng); 

I don´t know how it works with research papers,but you should fix the sign from <= to >= in those two conditions and also fix some typos in text and resubmit to Springer, so it is consistent with the text. I provide working code with the test cases for reference.

static const double wgs84_radius_m = 6378137;
static const double wgs84_flattening = 1.0 / 298.257223563;
static const double earth_radius_m = wgs84_radius_m * (1 - wgs84_flattening);
static const double earth_radius_km = earth_radius_m / 1000.0;
//these computational formulas assume spherical model fo earth
//we use semi-minor axis of WGS84 spheroid model as earth radius to obtain a slightly less tight minimum distance. This will underestimate the lower bound by at most 0.3%.
struct WGS84Mindist
{
    //input is in radians
    //returns distance in kilometers
    static double haversineFormulaRad(double lat1, double lon1, double lat2, double lon2)
    {
        double d_lat = abs(lat1 - lat2);
        double d_lon = abs(lon1 - lon2);

        double a = pow(sin(d_lat / 2), 2) + cos(lat1) * cos(lat2) * pow(sin(d_lon / 2), 2);

        //double d_sigma = 2 * atan2(sqrt(a), sqrt(1 - a));
        double d_sigma = 2 * asin(sqrt(a));

        return earth_radius_km * d_sigma;
    }

    //input is in radians
    //returns angular distance on unit sphere
    static double haversineFormulaRadAngular(double lat1, double lon1, double lat2, double lon2)
    {
        double d_lat = abs(lat1 - lat2);
        double d_lon = abs(lon1 - lon2);

        double a = pow(sin(d_lat / 2), 2) + cos(lat1) * cos(lat2) * pow(sin(d_lon / 2), 2);

        //double d_sigma = 2 * atan2(sqrt(a), sqrt(1 - a));
        double d_sigma = 2 * asin(sqrt(a));

        return d_sigma;
    }

    //input is in radians
    //returns angle in radians
    static double getBearingRad(double lat1, double lon1, double lat2, double lon2)
    {
        double dLon = lon2 - lon1;
        double y = sin(dLon) * cos(lat2);
        double x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dLon);
        double radiansBearing = atan2(y, x);

        return radiansBearing;
    }

    //start, end, point - coordinates in radians
    static double getCrossTrackDistanceRad(double lat1, double lon1, double lat2, double lon2, double lat3, double lon3)
    {
        double angDist1Q = haversineFormulaRadAngular(lat1, lon1, lat3, lon3);

        double cos_lat1 = cos(lat1);
        double sin_lat1 = sin(lat1);
        double cos_lat3 = cos(lat3);
        double cos_lat2 = cos(lat2);

        //double theta1Q = getBearing(lat1, lon1, lat3, lon3);
        // double dLon = lon3 - lon1;
        // double y = sin(dLon) * cos(lat3);
        // double x = cos(lat1) * sin(lat3) - sin(lat1) * cos(lat3) * cos(dLon);
        // double radiansBearing = atan2(y, x);
        double dLon1 = lon3 - lon1;
        double y1 = sin(dLon1) * cos_lat3;
        double x1 = cos_lat1 * sin(lat3) - sin_lat1 * cos_lat3 * cos(dLon1);
        double theta1Q = atan2(y1, x1);

        //double theta12 = getBearing(lat1, lon1, lat2, lon2);
        // double dLon = lon2 - lon1;
        // double y = sin(dLon) * cos(lat2);
        // double x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dLon);
        // double radiansBearing = atan2(y, x);
        double dLon2 = lon2 - lon1;
        double y2 = sin(dLon2) * cos_lat2;
        double x2 = cos_lat1 * sin(lat2) - sin_lat1 * cos_lat2 * cos(dLon2);
        double theta12 = atan2(y2, x2);

        return asin(sin(angDist1Q) * sin(theta1Q - theta12)) * earth_radius_km;
    }

    static double latlngMinDistDeg(double &plat, double &plng, double &rminlat, double &rminlng, double &rmaxlat, double &rmaxlng)
    {
        return latlngMinDistRad(GeoDistance::deg2rad(plat), GeoDistance::deg2rad(plng),
                                GeoDistance::deg2rad(rminlat), GeoDistance::deg2rad(rminlng),
                                GeoDistance::deg2rad(rmaxlat), GeoDistance::deg2rad(rmaxlng));
    }

    //returns distance in kilometers
    static double latlngMinDistRad(double plat, double plng, double rminlat, double rminlng, double rmaxlat, double rmaxlng)
    {
        // The simplest case is when the query point is in the same "slice":
        if (rminlng <= plng && plng <= rmaxlng)
        {
            if (plat < rminlat)
            {
                return (rminlat - plat) * earth_radius_km; //Sout of MBR
            }
            else if (plat > rmaxlat)
            {
                return (plat - rmaxlat) * earth_radius_km; //North of MBR
            }
            return 0.0; // INSIDE MBR
        }

        // Determine whether going east or west is shorter.
        double lngW = fmod(rminlng - plng, 2.0 * M_PI);
        if (lngW < 0.0)
        {
            lngW += 2.0 * M_PI;
        }
        double lngE = fmod(plng - rmaxlng, 2.0 * M_PI);
        if (lngE < 0.0)
        {
            lngE += 2.0 * M_PI;
        }

        if (lngW <= lngE)
        {
            // West of MBR
            double tau = tan(plat);

            if (lngW >= 0.5 * M_PI) // Large delta of longitude
            {
                if (tau >= tan((rmaxlat + rminlat) * 0.5) * cos(rminlng - plng))
                {
                    return haversineFormulaRad(plat, plng, rmaxlat, rminlng); //North-West
                }
                else
                {
                    return haversineFormulaRad(plat, plng, rminlat, rminlng); //South-West
                }
            }

            if (tau >= tan(rmaxlat) * cos(rminlng - plng))
            {
                return haversineFormulaRad(plat, plng, rmaxlat, rminlng); //North-West
            }

            if (tau <= tan(rminlat) * cos(rminlng - plng))
            {
                return haversineFormulaRad(plat, plng, rminlat, rminlng); //South-West
            }

            return abs(getCrossTrackDistanceRad(rminlat, rminlng, rmaxlat, rminlng, plat, plng)); // West
        }
        else
        {
            // East of MBR
            double tau = tan(plat);

            if (lngE >= 0.5 * M_PI) // Large delta of longitude
            {
                if (tau >= tan((rmaxlat + rminlat) * 0.5) * cos(rmaxlng - plng))
                {
                    return haversineFormulaRad(plat, plng, rmaxlat, rmaxlng); //North-East
                }
                else
                {
                    return haversineFormulaRad(plat, plng, rminlat, rmaxlng); //Sout-East
                }
            }

            if (tau >= tan(rmaxlat) * cos(rmaxlng - plng))
            {
                return haversineFormulaRad(plat, plng, rmaxlat, rmaxlng); //North-East
            }

            if (tau <= tan(rminlat) * cos(rmaxlng - plng))
            {
                return haversineFormulaRad(plat, plng, rminlat, rmaxlng); //Sout-East
            }

            return abs(getCrossTrackDistanceRad(rmaxlat, rmaxlng, rminlat, rmaxlng, plat, plng)); // East
        }
}
 //different position of points from rectangle - see minidist_test.jpg for cases
    vector<double> MBR = {0.0, 0.0, 30.0, 60.0}; //latitude, longitude
    double rminlat = GeoDistance::deg2rad(MBR[0]);
    double rminlon = GeoDistance::deg2rad(MBR[1]);
    double rmaxlat = GeoDistance::deg2rad(MBR[2]);
    double rmaxlon = GeoDistance::deg2rad(MBR[3]);
    double plat, plon;

    //inside - p1
    vector<double> point = {15.0, 30.0};
    double distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, 0.0, "Inside Test");

    //north - p2
    point = {40.0, 25.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, (plat - rmaxlat) * earth_radius_km, "North Test");

    //south - p3
    point = {-15.0, 15.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, (rminlat - plat) * earth_radius_km, "South Test");

    //east - p4
    point = {15.0, 90.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, abs(WGS84Mindist::getCrossTrackDistanceRad(rmaxlat, rmaxlon, rminlat, rmaxlon, plat, plon)), "East Test");
    testTrue(abs(
                 abs(WGS84Mindist::getCrossTrackDistanceRad(rmaxlat, rmaxlon, rminlat, rmaxlon, plat, plon)) -
                 abs(WGS84Mindist::getCrossTrackDistanceRad(rminlat, rmaxlon, rmaxlat, rmaxlon, plat, plon))) < 0.001,
             "East Cross track path point order should not matter");

    //west - p5
    point = {20.0, -45.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, abs(WGS84Mindist::getCrossTrackDistanceRad(rminlat, rminlon, rmaxlat, rminlon, plat, plon)), "West Test");
    testTrue(abs(
                 abs(WGS84Mindist::getCrossTrackDistanceRad(rminlat, rminlon, rmaxlat, rminlon, plat, plon)) -
                 abs(WGS84Mindist::getCrossTrackDistanceRad(rmaxlat, rminlon, rminlat, rminlon, plat, plon))) < 0.001,
             "West Cross track path point order should not matter");

    //north-east with deltaLon < 90 - p6
    point = {45.0, 105.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rmaxlat, rmaxlon), "North-East close Test");

    //north-west with deltaLon < 90 - p7
    point = {60.0, -60.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rmaxlat, rminlon), "North-West close Test");

    //south-east with deltaLon < 90 - p8
    point = {-15.0, 90.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rminlat, rmaxlon), "South-East close Test");

    //south-west with deltaLon < 90 - p9
    point = {-15.0, -45.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rminlat, rminlon), "South-West close Test");

    //north-east with deltaLon >= 90 - p10
    point = {37.5, 165.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rmaxlat, rmaxlon), "North-East far Test");

    //north-west with deltaLon >= 90 - p11
    point = {37.5, -120.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rmaxlat, rminlon), "North-West far Test");

    //south-east with deltaLon >= 90 - p12
    point = {-30.0, 150.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rminlat, rmaxlon), "South-East far Test");

    //south-west with deltaLon >= 90 - p13
    point = {-30.0, -90.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rminlat, rminlon), "South-West far Test");
kno10 commented 7 years ago

You are missing some crucial test cases, because you placed your rectangle too close to the center.

  1. By placing it at (0,0), one of the difficult "use cross-track" borders becomes the equator, and you won't notice any error there. Instead, use e.g. the bounding boxes of Japan, and of Argentina.
  2. You have no test cases on the opposite side, i.e. in [0°;60°] - 180°. Place some test cases e.g. at -151° and -149°: at -151° the shortest path is westbound, at -149° it is eastbound.
  3. You need to test the far away case distinctions. I.e. points close to the equator, 89° from the rectangle sides, but where it is better to fly cross-track to somewhere along the side of the rectangle rather than to a corner. You want to test around the tip of the special XTD case.
  4. Also try adding a test case where flying across the north pole is best. This is actually very interesting, because there can be more than one best path, because the north and south boundaries of these rectangles are on circles of latitude; so the entire edge is equidistant from the pole. That is the reason we do not need to distinguish this case on the opposite side.
  5. Have a look at https://github.com/elki-project/elki/blob/e8f3c6fdf54e1e0aa8444b94ad5374ad518dfc0c/elki/src/main/java/de/lmu/ifi/dbs/elki/application/geo/VisualizeGeodesicDistances.java - such a bitmap approach has the nice property of using ("testing") thousands of points.

As I already mentioned above, these likely are not "two mistakes that fortunately cancel out", but the use of a negative lngW is consistent and mentioned in a comment ("we keep this negative"); this is a design decision not an accident. Likely to keep equations of both cases more similar. But I may reconsider this to make it less confusing.

Did the ELKI code get any of your test cases wrong?

Publishers will not update a printed book because of cosmetic changes like these. I could do them in an technical report, if I ever have reason to write one (e.g., adding the rectangle-to-rectangle case, which I haven't needed yet). The actual logic of the paper (which cases to distinguish, using cross-track-distance), as well as the experimental results, are fine; so the paper is up to all standards. It's fairly common to have to 'interpret' pseudo-code "as likely intended by the authors" when reading a paper, because nobody can afford the amount of proofreading needed to make a paper flawless to the character - and this is why I publish source code, too (but of course the real code is more complicated, as you noticed - it has to consider performance and precision much more). Papers, and even the most famous books, may have some minor things to improve that often won't get changed until a new edition is published (and there are no new editions of conference proceedings - if there is something important to add, you would publish a follow-up paper).

ghost commented 7 years ago

I will definitely test the other cases when i have time, but i think they should be fine, because i did everything with the accordance to what you write in your paper. Btw you did a great job by figuring out this method and i admire it. So don´t take it as harassment,but i just wanted to help other people too. I don´t call cosmetic change when you use inconsistent notation across the text. You say Phi denotes longitude and lambda denotes latitude, thats ok with pseudocode of algorithms. But in equations (1) and (2) phi should denote latitude. That´s all i am saying and i even send you PDF with simple annotations that take like 5 minutes to fix and you still refuse to just upload it. I get it it is also a part of some book, but when i viewed your text from my school,it didnt show me the whole book,but just your text. So i am asking you to fix that separate paper, not conference book.

//south-east with deltaLon >= 90 - p12
    point = {-30.0, 150.0};
    plat = GeoDistance::deg2rad(point[0]);
    plon = GeoDistance::deg2rad(point[1]);
    distance = WGS84Mindist::latlngMinDistDeg(point[0], point[1], MBR[0], MBR[1], MBR[2], MBR[3]);
    testEqual<double>(distance, WGS84Mindist::haversineFormulaRad(plat, plon, rminlat, rmaxlon), "South-East far Test");

You evaluate my test case for point p12=(-30,150) with MBR=(0,0,30,60) differently than you should. This is the edge case when the delta of longitudes is exactly 90 degrees(150-60). At this distance you proved that the great circle paths(depicted blue on your image in paper) intersect,so we should compare to the great circle path going through middle of MBR. But thats not what you doing, you go into another condition which tries to compare if it lies north or south from the separate great circle paths and not the middle one. So i think you should even include the case when its exactly 90 degrees,you even say that in text. if(lngE > MathUtil.HALFPI) { //should be >= , lines intersect at delta of 90

kno10 commented 7 years ago

What "separate paper"? There is only the conference proceedings book, although the publisher allows downloading individual contributions separately; you are still looking at a part of a book. There is no way to upload a new version there!

Also I realized one more thing that confused you: "East" in the comments can refer to "we have to travel east to the rectangle", or to "we are located east of the rectangle". These are of course the exact opposite.

I have added some unit tests to ELKI: https://github.com/elki-project/elki/commit/30a4573d90094bab871f1ee5014840ee2419eaef the reference distances are computed with a web tool, and only have around 4 digits of precision, but they can be used to test the different cases. Still, you may find these useful for testing your implementation.

At exactly 90°, this must yield the same distance. In fact, if it would make a difference, this would be a clear indication that there is an error there. The resulting distance must be smooth, there must not be sudden jumps (and that is why it is a good idea to also unit test approaching this point from different sides).