firebase / geofire-java

GeoFire for Java - Realtime location queries with Firebase
MIT License
670 stars 271 forks source link

Should GeoHashQuery include endValue or not? #155

Open algrid opened 5 years ago

algrid commented 5 years ago

Hello! Thanks for nice library.

I have a question about the following methods from GeoHashQuery.java:

    public static GeoHashQuery queryForGeoHash(GeoHash geohash, int bits) {
        String hash = geohash.getGeoHashString();
        int precision = (int)Math.ceil((double)bits/Base32Utils.BITS_PER_BASE32_CHAR);
        if (hash.length() < precision) {
            return new GeoHashQuery(hash, hash+"~");
        }
        hash = hash.substring(0, precision);
        String base = hash.substring(0, hash.length() - 1);
        int lastValue = Base32Utils.base32CharToValue(hash.charAt(hash.length() - 1));
        int significantBits = bits - (base.length() * Base32Utils.BITS_PER_BASE32_CHAR);
        int unusedBits = Base32Utils.BITS_PER_BASE32_CHAR - significantBits;
        // delete unused bits
        int startValue = (lastValue >> unusedBits) << unusedBits;
        int endValue = startValue + (1 << unusedBits);
        String startHash = base + Base32Utils.valueToBase32Char(startValue);
        String endHash;
        if (endValue > 31) {
            endHash = base + "~";
        } else {
            endHash = base + Base32Utils.valueToBase32Char(endValue);
        }
        return new GeoHashQuery(startHash, endHash);
    }

    public static Set<GeoHashQuery> queriesAtLocation(GeoLocation location, double radius) {
    ...
    }

I don't fully understand this code, but I have an impression that endValue from queryForGeoHash is not supposed to be included into the query. I mean that the query should be interpreted as startValue <= geohash < endValue but in fact when the query is executed it's inclusive: startValue <= geohash <= endValue.

To fix that we could change the line:

        int endValue = startValue + (1 << unusedBits);

into:

        int endValue = startValue + (1 << unusedBits) - 1;

To illustrate this, look at my results with center in Caracas, Venezuela and radius=350km (the selected area is what's included into the query):

1) Original version:

Screen Shot 2019-07-05 at 11 40 44 PM

2) My fixed version:

Screen Shot 2019-07-05 at 11 37 59 PM

Note: actually I'm using a kotlin version of the code from here: https://github.com/imperiumlabs/GeoFirestore-Android/blob/34b935c534c583520a706c3388c57962122846bf/geofirestore/src/main/java/org/imperiumlabs/geofirestore/core/GeoHashQuery.kt

but the logic is exactly the same, so I think that the code in this repo is (or is closer to) the original.

smohankrishna commented 5 years ago

Hii... Can you please tell me how to integrate geofire with spring boot project

tried using maven, but its not working

samtstern commented 5 years ago

@smohankrishna that sort of question is not relevant to this issue, please ask on StackOverflow.

samtstern commented 5 years ago

@algrid thanks for filing this with such detail! This looks like a sensible improvement to me but I don't know enough about geohashes to say for sure ... I am hoping that @puf can comment.

puf commented 5 years ago

I noticed the same outliers in my own testing with geohashes/geoqueries based on GeoFire, although never as bad as the db hash in your first screenshot. But I also had some queries where I didn't see outliers, so I assumed that it rounds up somewhere for edge cases. I wish I'd kept those queries around. :-/

algrid commented 5 years ago

@puf thanks for the info.

If we hit the endHash = base + "~"; case, then there should be no outliers, but otherwise they will exist. That's of course if my theory in the original post is valid.

But if there are cases with no outliers where the resulting endHash doesn't end with a ~, then the reality is more complex :)

joselpomares commented 1 year ago

Hello. I thinks that the increment in the endValue must be calculate in other form. imo: is correct the bitwise operations for startValue, leaving 0s in unusedBits, but for the endValue we need filter values until (inclusive) that start with significantBits, and no more (the current bitwise addition change the values of significantBits). For that we need put 1s in the unusedBits places, without change the significantBits values (endValue = startValue + (31 >> significantBits);). Finally, to get a "suffix inclusive" query, add "~" to end of endValue-geohash. I hope that the attached image help to explain my idea. Sorry by my Inglish.

Sin título