apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

Add GeoHash String Utilities to core GeoUtils [LUCENE-6647] #7705

Closed asfimport closed 9 years ago

asfimport commented 9 years ago

GeoPointField uses Morton encoding to efficiently pack lat/lon values into a single long. Geohashing effectively does the same thing but uses base 32 encoding to represent this long value as a "human readable" string. Many user applications already use the string representation of the hash. This issue simply adds the base 32 string representation of the already computed Morton code.
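To illustrate the relationship described above, here is a minimal sketch of the standard geohash algorithm: it interleaves longitude and latitude bisection bits into a long (a Morton-style code) and then emits that value five bits at a time using the geohash base 32 alphabet. The class and method names (`GeoHashSketch`, `interleave`, `encode`) are hypothetical and chosen for illustration; this is not the code from the attached patch, and the patch may lay out or quantize bits differently.

```java
final class GeoHashSketch {
    // Standard geohash alphabet: digits plus lowercase letters, omitting a, i, l, o.
    private static final char[] BASE_32 = "0123456789bcdefghjkmnpqrstuvwxyz".toCharArray();

    /**
     * Builds an interleaved (Morton-style) code of the requested bit length.
     * Even bit positions (counting from the most significant) refine longitude,
     * odd positions refine latitude, as in the usual geohash definition.
     */
    static long interleave(double lat, double lon, int bits) {
        double minLat = -90.0, maxLat = 90.0;
        double minLon = -180.0, maxLon = 180.0;
        long hash = 0L;
        for (int i = 0; i < bits; i++) {
            hash <<= 1;
            if ((i & 1) == 0) {
                double mid = (minLon + maxLon) / 2;
                if (lon >= mid) { hash |= 1L; minLon = mid; } else { maxLon = mid; }
            } else {
                double mid = (minLat + maxLat) / 2;
                if (lat >= mid) { hash |= 1L; minLat = mid; } else { maxLat = mid; }
            }
        }
        return hash;
    }

    /** Encodes the interleaved code as a base 32 string of the given length (1 to 12 characters). */
    static String encode(double lat, double lon, int length) {
        if (length < 1 || length > 12) {
            throw new IllegalArgumentException("length must be in [1, 12]");
        }
        long hash = interleave(lat, lon, 5 * length); // 5 bits per character, at most 60 bits
        char[] out = new char[length];
        for (int i = length - 1; i >= 0; i--) {
            out[i] = BASE_32[(int) (hash & 31L)]; // lowest 5 bits become the last character
            hash >>>= 5;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 5)); // prints "u4pru"
    }
}
```

Because the string is just a re-encoding of the interleaved bits, a shorter geohash is always a prefix of a longer one for the same point, which is what makes the string form convenient for prefix-based lookups.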


Migrated from LUCENE-6647 by Nick Knize (@nknize), resolved Aug 01 2015
Attachments: LUCENE-6647.patch (versions: 4)

asfimport commented 9 years ago

Nick Knize (@nknize) (migrated from JIRA)

Initial patch that adds GeoHash string utilities to GeoUtils.java

Currently only tested and validated against Elasticsearch. Will add unit tests in the next patch.

asfimport commented 9 years ago

Nick Knize (@nknize) (migrated from JIRA)

Updated GeoHash patch with unit tests.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Thanks @nknize, the geohash utilities and tests look good.

But I hit this test failure:

   [junit4] Suite: org.apache.lucene.search.TestGeoPointQuery
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeoPointQuery -Dtests.method=testWholeMap -Dtests.seed=4949D67148502A2 -Dtests.locale=it -Dtests.timezone=Australia/Canberra -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 1.79s J3 | TestGeoPointQuery.testWholeMap <<<
   [junit4]    > Throwable #1: java.lang.AssertionError: testWholeMap failed expected:<15> but was:<16>
   [junit4]    >    at __randomizedtesting.SeedInfo.seed([4949D67148502A2:825F170DAFB39C04]:0)
   [junit4]    >    at org.apache.lucene.search.TestGeoPointQuery.testWholeMap(TestGeoPointQuery.java:181)
   [junit4]    >    at java.lang.Thread.run(Thread.java:745)
   [junit4] IGNOR/A 0.02s J3 | TestGeoPointQuery.testRandomBig
   [junit4]    > Assumption #1: 'nightly' test group is disabled (@Nightly())
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene53): {id=BlockTreeOrds(blocksize=128), geoField=Lucene50(blocksize=128)}, docValues:{id=DocValuesFormat(name=Lucene50)}, sim=RandomSimilarityProvider(queryNorm=false,coord=crazy): {}, locale=it, timezone=Australia/Canberra
   [junit4]   2> NOTE: Linux 3.13.0-46-generic amd64/Oracle Corporation 1.8.0_40 (64-bit)/cpus=8,threads=1,free=310567944,total=451936256
   [junit4]   2> NOTE: All tests run in this JVM: [TestSlowFuzzyQuery, TestDocValuesNumbersQuery, TestJakartaRegexpCapabilities, TestDocValuesTermsQuery, TestGeoPointQuery]
   [junit4] Completed [14/15] on J3 in 3.40s, 12 tests, 1 failure, 1 skipped <<< FAILURES!

asfimport commented 9 years ago

Nick Knize (@nknize) (migrated from JIRA)

The latest patch for #7762 changes mortonEncoding to use full 32-bit precision for lat/lon values. This fixes the issue where the max lat/lon was not decoding to the correct precision, which led to the failure posted above. A patch compatible with the changes from #7762 will be posted here.

asfimport commented 9 years ago

Nick Knize (@nknize) (migrated from JIRA)

Updated patch that depends on #7762 - changes morton encoding to use full 64 bits, 32 bits per lat/lon.
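For readers unfamiliar with the layout mentioned here (a 64-bit Morton code with 32 bits per coordinate), the following is a minimal sketch of one way to do it. The class name `MortonSketch`, the scaling constants, the rounding choice, and the bit order (longitude in even bits, latitude in odd bits) are all assumptions made for illustration; none of them are taken from the patch or from #7762.

```java
final class MortonSketch {
    // Assumed scaling: map the full coordinate range onto [0, 2^32 - 1].
    private static final double LAT_SCALE = 0xFFFFFFFFL / 180.0;
    private static final double LON_SCALE = 0xFFFFFFFFL / 360.0;

    /** Quantizes lat/lon to 32 bits each and interleaves them into one long. No input validation. */
    static long encode(double lat, double lon) {
        long latBits = Math.round((lat + 90.0) * LAT_SCALE);
        long lonBits = Math.round((lon + 180.0) * LON_SCALE);
        long morton = 0L;
        for (int i = 0; i < 32; i++) {
            morton |= ((lonBits >>> i) & 1L) << (2 * i);     // even bits: longitude (assumed order)
            morton |= ((latBits >>> i) & 1L) << (2 * i + 1); // odd bits: latitude
        }
        return morton;
    }

    static double decodeLat(long morton) {
        return deinterleave(morton >>> 1) / LAT_SCALE - 90.0;
    }

    static double decodeLon(long morton) {
        return deinterleave(morton) / LON_SCALE - 180.0;
    }

    /** Collects every second bit, starting at bit 0, into a 32-bit value. */
    private static long deinterleave(long bits) {
        long value = 0L;
        for (int i = 0; i < 32; i++) {
            value |= ((bits >>> (2 * i)) & 1L) << i;
        }
        return value;
    }
}
```

With this scaling the extreme coordinates map to the extreme codes, so the maximum lat/lon round-trips to within floating-point error, and any other point round-trips to within about 4e-8 degrees. A loop is used for clarity; a production implementation would more likely use bit-spreading masks.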

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

> This fixes the issue where the max lat/lon was not decoding to the correct precision leading to the failure posted above.

Hmm, can you open a new issue whose sole purpose is to cut over to full 32 bit precision for lat/lon? #7762 is about avoiding OOME (or is the full 32 bit precision necessary to avoid OOME?) ... then we can decouple these issues? It's hard enough keeping track of all the in-flight patches without some depending on others...

asfimport commented 9 years ago

Nick Knize (@nknize) (migrated from JIRA)

> can you open a new issue whose sole purpose is to cutover to full 32 bit precision for lat/lon?

#7768 adds full 32 bit precision, decoupling this issue from #7762.

Patch attached to make GeoHashUtils bit precision independent. Unit test provided.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Thanks @nknize, I'll commit shortly...

asfimport commented 9 years ago

ASF subversion and git services (migrated from JIRA)

Commit 1693700 from @mikemccand in branch 'dev/trunk' https://svn.apache.org/r1693700

LUCENE-6647: add GeoHash string utility APIs

asfimport commented 9 years ago

ASF subversion and git services (migrated from JIRA)

Commit 1693702 from @mikemccand in branch 'dev/branches/branch_5x' https://svn.apache.org/r1693702

LUCENE-6647: add GeoHash string utility APIs

asfimport commented 9 years ago

Shalin Shekhar Mangar (@shalinmangar) (migrated from JIRA)

Bulk close for 5.3.0 release