apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

Integrate lat/long BKD and spatial 3d, part 2 [LUCENE-6759] #7817

Closed asfimport closed 9 years ago

asfimport commented 9 years ago

This is just a continuation of #7757, which became too big.


Migrated from LUCENE-6759 by Michael McCandless (@mikemccand), resolved Sep 02 2015 Attachments: LUCENE-6699.patch (versions: 17)

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Maybe we should take a step back and accept that precision issues mean that points near a shape's boundary may or may not be accepted?

I.e. we can just relax the test so that any point within X of the boundary (can we compute this easily?) is not tested.
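A minimal sketch of what such a relaxed check could look like for a circle, assuming a unit sphere and lat/lon in radians. The class name, method names, and the tolerance are hypothetical and are not taken from the test or the patch. The spherical law of cosines used here loses precision for very small angles, so this is only an illustration of the idea.

```java
// Hypothetical sketch: class, method names, and the tolerance are illustrative only.
final class BoundaryTolerantCheck {

  /** Angular distance in radians between two lat/lon points on a unit sphere. */
  static double arcDistance(double lat1, double lon1, double lat2, double lon2) {
    double cosAngle = Math.sin(lat1) * Math.sin(lat2)
        + Math.cos(lat1) * Math.cos(lat2) * Math.cos(lon1 - lon2);
    // Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return Math.acos(Math.max(-1.0, Math.min(1.0, cosAngle)));
  }

  /** True when the point is so close to the circle's edge that the test should skip it. */
  static boolean nearBoundary(double centerLat, double centerLon, double radius,
                              double lat, double lon, double tolerance) {
    double distance = arcDistance(centerLat, centerLon, lat, lon);
    return Math.abs(distance - radius) <= tolerance;
  }
}
```

A randomized test could call `nearBoundary(...)` first and only assert membership for points where it returns false.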

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

I think you'll need to do that, yes. But there's more to it than that. See next comment.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@mikemccand Here's the analysis, so far.

I first enabled evaluation of all four points where the XYZSolid intersected the planet surface. As you can see, only one of them comes back as being inside the GeoCircle:

   [junit4]   2>  Point 1.0010913867774043 0.007079167343247293 -0.0021855011220022575: shape.isWithin? true; minx=9.128715394490783E-6, maxx=-3.191195882434883E-6, miny=0.0, maxy=-4.618394311805439E-4, minz=0.0, maxz=-2.893784038207395E-4
   [junit4]   2>  Point 1.0010919806760743 0.007079167343247293 -0.001896122718181518: shape.isWithin? false; minx=9.722614064511248E-6, maxx=-2.597297212414418E-6, miny=0.0, maxy=-4.618394311805439E-4, minz=2.893784038207395E-4, maxz=0.0
   [junit4]   2>  Point 1.001088014365874 0.007541006774427837 -0.0021855011220022575: shape.isWithin? false; minx=5.7563038642349795E-6, maxx=-6.563607412690686E-6, miny=4.618394311805439E-4, maxy=0.0, minz=0.0, maxz=-2.893784038207395E-4
   [junit4]   2>  Point 1.0010886082665449 0.007541006774427837 -0.001896122718181518: shape.isWithin? false; minx=6.35020453509938E-6, maxx=-5.969706741826286E-6, miny=4.618394311805439E-4, maxy=0.0, minz=2.893784038207395E-4, maxz=0.0

If the above is an accurate picture, then there should be intersections between the GeoCircle and two of the edge planes.

miny should intersect:

   [junit4]   2>  Point 1.0010913867774043 0.007079167343247293 -0.0021855011220022575: shape.isWithin? true; minx=9.128715394490783E-6, maxx=-3.191195882434883E-6, miny=0.0, maxy=-4.618394311805439E-4, minz=0.0, maxz=-2.893784038207395E-4
   [junit4]   2>  Point 1.0010919806760743 0.007079167343247293 -0.001896122718181518: shape.isWithin? false; minx=9.722614064511248E-6, maxx=-2.597297212414418E-6, miny=0.0, maxy=-4.618394311805439E-4, minz=2.893784038207395E-4, maxz=0.0

And, minz should intersect:

   [junit4]   2>  Point 1.0010913867774043 0.007079167343247293 -0.0021855011220022575: shape.isWithin? true; minx=9.128715394490783E-6, maxx=-3.191195882434883E-6, miny=0.0, maxy=-4.618394311805439E-4, minz=0.0, maxz=-2.893784038207395E-4
   [junit4]   2>  Point 1.001088014365874 0.007541006774427837 -0.0021855011220022575: shape.isWithin? false; minx=5.7563038642349795E-6, maxx=-6.563607412690686E-6, miny=4.618394311805439E-4, maxy=0.0, minz=0.0, maxz=-2.893784038207395E-4

These two intersections are not being detected, and after much careful analysis I concluded that they are not detected because no intersection actually happens. Looking at the miny plane:

   [junit4]   2> Checking for intersections that should be found...
   [junit4]   2>  Not identical plane
   [junit4]   2> Looking for intersection between plane [A=-0.9999680546313309, B=-0.0046605790633783275, C=0.006493744653569968, D=1.0011065916522879, side=-1.0] and plane [A=0.0, B=1.0, C=0.0, D=-0.007079167343247293, side=1.0] within bounds
   [junit4]   2>  Two points of intersection
   [junit4]   2>   [X=1.0010359045488204, Y=0.0070791673432472925, Z=-0.010729178478687706] this=(0.0) q=(-8.673617379884035E-19), and [X=1.0010913867758835, Y=0.0070791673432472925, Z=-0.0021855018140558226] this=(0.0) q=(-8.673617379884035E-19)

Two points of intersection are detected, but both are outside the X or Z bounds of the XYZSolid, so they do not represent intersection.

So, how can this be? The discrepancy arises because the first of the four points listed at the top is, in fact, not really inside the GeoCircle. It only comes up as inside the GeoCircle because we've increased MINIMUM_RESOLUTION from its original value of 1e-12:

   [junit4]   2> circlePlane eval = 2.9731772599461692E-12

So the problem is that ONE measure of error (point within GeoCircle) disagrees with another measure of error (intersection points in or out of XYZSolid), leading to an incorrect assessment.

This is obviously going to be challenging to address. I may need to introduce two distinct error bounds in order for this logic to be robust. But I have to think it through carefully.
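To make the disagreement concrete, here is a small sketch that feeds the logged numbers into two tolerance checks. It assumes membership is decided by comparing a plane evaluation against a tolerance, and it takes the solid's Z range from the corner points listed above. The loosened tolerance of 1e-11 is a hypothetical stand-in, since the raised MINIMUM_RESOLUTION value is not quoted here, and the helper names are invented for illustration.

```java
// Hypothetical sketch; helper names and the 1e-11 tolerance are assumptions.
final class ErrorMeasureMismatch {

  /** Tolerant sided-plane membership: near-zero evaluations count as inside. */
  static boolean withinPlane(double eval, double side, double tolerance) {
    return Math.abs(eval) < tolerance || Math.signum(eval) == side;
  }

  /** Tolerant range membership for one axis of the solid. */
  static boolean withinRange(double value, double min, double max, double tolerance) {
    return value >= min - tolerance && value <= max + tolerance;
  }

  public static void main(String[] args) {
    double tolerance = 1e-11;                          // hypothetical loosened value
    double circlePlaneEval = 2.9731772599461692E-12;   // "circlePlane eval" from the log
    double minZ = -0.0021855011220022575;              // Z of the first corner point
    double maxZ = -0.001896122718181518;               // Z of the second corner point
    double intersectionZ = -0.0021855018140558226;     // Z of the second intersection point

    // Measure 1: the corner is accepted as inside the circle (side = -1.0 in the log).
    System.out.println(withinPlane(circlePlaneEval, -1.0, tolerance));
    // Measure 2: the plane/plane intersection is rejected as outside the solid's Z range.
    System.out.println(withinRange(intersectionZ, minZ, maxZ, tolerance));
  }
}
```

With these inputs the first check accepts the corner while the second rejects the intersection point, which lies roughly 6.9e-10 below the Z range, far more than either tolerance.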

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Thinking through why one error measure just doesn't work in this case:

(1) The point in question is on the wrong side of the GeoCircle plane, and is pretty nearly the closest point on the XYZSolid to the GeoCircle plane.

(2) The slight tilt of the GeoCircle plane is enough to put the upper intersection point beyond the MINIMUM_RESOLUTION distance (at a distance of roughly 1e-8). That removes the first candidate intersection point from consideration.

(3) Because the GeoCircle is almost at the edge of the world, the slope between the GeoCircle plane and the planet surface is quite high, so a small error distance in X translates to a large distance in Y or Z. That allows the second candidate intersection point to be removed from consideration.

The obvious conclusion is that we can tolerate no error at all in determining if a point is within a shape or not, for the purposes of evaluating relationships. It's not clear to me yet whether we need to simply tighten the existing definition of "isWithin()", or we need to have multiple variants of "isWithin()". Further analysis is needed to figure that out.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Changing isWithin() to be stricter globally seems like a reasonable way to go. However, this requires SidedPlane to have two different isWithin() methods anyhow. So I think I'll just bite the bullet and introduce isWithinStrict() as a new part of the Membership interface.
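A rough sketch of how a tolerant and a strict membership check might sit side by side. The interface and class below are hypothetical stand-ins, not the actual Membership or SidedPlane code from the patch, and treating a point exactly on the plane as strictly within is my assumption.

```java
// Hypothetical sketch; the real Membership and SidedPlane in the patch may differ.
interface MembershipSketch {
  boolean isWithin(double x, double y, double z);
  boolean isWithinStrict(double x, double y, double z);
}

final class SidedPlaneSketch implements MembershipSketch {
  static final double MINIMUM_RESOLUTION = 1e-12;

  private final double a, b, c, d, side;

  SidedPlaneSketch(double a, double b, double c, double d, double side) {
    this.a = a;
    this.b = b;
    this.c = c;
    this.d = d;
    this.side = side;
  }

  /** Signed plane evaluation Ax + By + Cz + D. */
  double evaluate(double x, double y, double z) {
    return a * x + b * y + c * z + d;
  }

  /** Tolerant check: points within MINIMUM_RESOLUTION of the plane count as inside. */
  @Override
  public boolean isWithin(double x, double y, double z) {
    double eval = evaluate(x, y, z);
    return Math.abs(eval) < MINIMUM_RESOLUTION || Math.signum(eval) == side;
  }

  /** Strict check: no tolerance; only the correct side (or exactly on the plane) passes. */
  @Override
  public boolean isWithinStrict(double x, double y, double z) {
    return evaluate(x, y, z) * side >= 0.0;
  }
}
```

A point that sits 5e-13 on the wrong side of the plane passes `isWithin` but fails `isWithinStrict`, which is the distinction relationship evaluation would rely on.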

This will take some time to propagate and test.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@mikemccand This patch reverts MINIMUM_RESOLUTION to 1e-12, and fixes the requirement for a high MINIMUM_RESOLUTION value another way.

All tests pass when this is done. Since I've been adding tests every time your beasting finds something, that's meaningful.

Please bear in mind that this is probably not the final patch. A final patch will have an even lower value for MINIMUM_RESOLUTION and probably more fixes designed to lower error values further. But it does address the current issues, and the only way to know what breaks next is to pound on it.

I'd also like to know what exactly you do to "beast" this patch, so that I may do the same here.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

@DaddyWri thank you, I committed that last patch, but noticed GeoCircleTest.testCircleBounds is angry:

   [junit4] Suite: org.apache.lucene.geo3d.GeoCircleTest
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=GeoCircleTest -Dtests.method=testCircleBounds -Dtests.seed=8068B689836F03CE -Dtests.locale=es_PR -Dtests.timezone=Europe/Zagreb -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] FAILURE 0.05s J1 | GeoCircleTest.testCircleBounds <<<
   [junit4]    > Throwable #1: java.lang.AssertionError
   [junit4]    >    at __randomizedtesting.SeedInfo.seed([8068B689836F03CE:75EBD17B83B5147A]:0)
   [junit4]    >    at org.apache.lucene.geo3d.GeoCircleTest.testCircleBounds(GeoCircleTest.java:111)
   [junit4]    >    at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene53): {}, docValues:{}, sim=RandomSimilarityProvider(queryNorm=true,coord=crazy): {}, locale=es_PR, timezone=Europe/Zagreb
   [junit4]   2> NOTE: Linux 3.13.0-46-generic amd64/Oracle Corporation 1.8.0_40 (64-bit)/cpus=8,threads=1,free=425655240,total=504889344
   [junit4]   2> NOTE: All tests run in this JVM: [GeoCircleTest]
   [junit4] Completed [6/9] on J1 in 0.27s, 4 tests, 1 failure <<< FAILURES!

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

> I'd also like to know what exactly you do to "beast" this patch, so that I may do the same here.

I use the repeatLuceneTest.py from luceneutil, but ant beast should work well too, something like:

ant beast -Dbeast.iters=100 -Dtestcase=TestGeo3DPointField -Dtestmethod=testRandomMedium -Dtests.dups=6 -Dtests.iters=10

will run 6 JVMs concurrently (I think?), each JVM repeating this one test method 10 times w/ the same master seed, and those 6 JVMs will stop and start 100 times.
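For readers unfamiliar with the term, the idea behind beasting can be sketched as a loop that reruns a randomized check under many seeds until one fails. This is a simplified, hypothetical illustration of the concept only. It is not how `ant beast` or repeatLuceneTest.py is implemented, and the class and method names are made up.

```java
import java.util.OptionalLong;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical sketch of the "beasting" idea; not the real ant or luceneutil logic.
final class BeastSketch {

  /**
   * Runs the check once per seed, starting at startSeed, and returns the first
   * seed for which the check fails, or empty if every iteration passes.
   */
  static OptionalLong firstFailingSeed(long startSeed, int iters, Predicate<Random> check) {
    for (int i = 0; i < iters; i++) {
      long seed = startSeed + i;
      // A fresh Random per seed keeps each iteration reproducible on its own.
      if (!check.test(new Random(seed))) {
        return OptionalLong.of(seed);
      }
    }
    return OptionalLong.empty();
  }
}
```

The returned seed plays the same role as the `-Dtests.seed=...` value in the "reproduce with" lines: it lets a single failing run be replayed.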

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Hmm, I don't see that in my workarea (before synching anyway). Let me dig.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@mikemccand ant clean test in my workarea yields:

-test:
    [mkdir] Created dir: C:\wip\lucene\lucene6699\lucene\build\spatial3d\test
    [mkdir] Created dir: C:\wip\lucene\lucene6699\lucene\build\spatial3d\test\temp
   [junit4] <JUnit4> says ??! Master seed: C9F7F99B0030A6AB
   [junit4] Your default console's encoding may not display certain unicode glyphs: windows-1252
   [junit4] Executing 9 suites with 3 JVMs.
   [junit4]
   [junit4] Started J1 PID(8768@localhost).
   [junit4] Started J2 PID(1576@localhost).
   [junit4] Started J0 PID(9308@localhost).
   [junit4] Suite: org.apache.lucene.geo3d.GeoCircleTest
   [junit4] Completed [1/9] on J1 in 0.19s, 4 tests
   [junit4]
   [junit4] Suite: org.apache.lucene.geo3d.GeoBBoxTest
   [junit4] Completed [2/9] on J2 in 0.23s, 4 tests
   [junit4]
   [junit4] Suite: org.apache.lucene.geo3d.GeoModelTest
   [junit4] Completed [3/9] on J1 in 0.01s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.lucene.geo3d.GeoPolygonTest
   [junit4] Completed [4/9] on J2 in 0.01s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.lucene.geo3d.PlaneTest
   [junit4] Completed [5/9] on J1 in 0.01s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.lucene.geo3d.XYZSolidTest
   [junit4] Completed [6/9] on J2 in 0.05s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.lucene.geo3d.GeoPathTest
   [junit4] Completed [7/9] on J1 in 0.03s, 5 tests
   [junit4]
   [junit4] Suite: org.apache.lucene.geo3d.GeoConvexPolygonTest
   [junit4] Completed [8/9] on J2 in 0.00s, 2 tests
   [junit4]
   [junit4] Suite: org.apache.lucene.bkdtree3d.TestGeo3DPointField
   [junit4] IGNOR/A 0.09s J0 | TestGeo3DPointField.testRandomBig
   [junit4]    > Assumption #1: 'nightly' test group is disabled (@Nightly())
   [junit4] Completed [9/9] on J0 in 5.18s, 6 tests, 1 skipped
   [junit4]
   [junit4] JVM J0:     1.86 ..     8.29 =     6.43s
   [junit4] JVM J1:     1.86 ..     3.37 =     1.51s
   [junit4] JVM J2:     1.86 ..     3.36 =     1.50s
   [junit4] Execution time total: 8.30 sec.
   [junit4] Tests summary: 9 suites, 29 tests, 1 ignored (1 assumption)
     [echo] 5 slowest tests:
[junit4:tophints]   9.50s | org.apache.lucene.bkdtree3d.TestGeo3DPointField
[junit4:tophints]   1.73s | org.apache.lucene.bkdtree3d.TestBKD3DTree
[junit4:tophints]   0.20s | org.apache.lucene.geo3d.XYZSolidTest
[junit4:tophints]   0.19s | org.apache.lucene.geo3d.GeoCircleTest
[junit4:tophints]   0.16s | org.apache.lucene.geo3d.GeoBBoxTest

-check-totals:

common.test:

BUILD SUCCESSFUL
Total time: 16 seconds

After sync:

C:\wip\lucene\lucene6699\lucene\spatial3d>svn status
?       capture

C:\wip\lucene\lucene6699\lucene\spatial3d>

A repeat "ant clean test" also succeeds at that point. So I'm puzzled. Did you run "ant clean" first? Changing MINIMUM_RESOLUTION does require that, seemingly...

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Running the beaster made it through 6 rounds, but then failed with this:

  [beaster]   2> sie 24, 2015 3:48:21 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
  [beaster]   2> WARNING: Uncaught exception in thread: Thread[T0,5,TGRP-TestGeo3DPointField]
  [beaster]   2> java.lang.AssertionError: expected WITHIN (1) or OVERLAPS (2) but got 3; shape=GeoCircle: {planetmodel=PlanetModel.SPHERE, center=[lat=-0.0021627146783861745, lon=-0.0017298167021592304], radius=2.0818312293195752E-4(0.011928014309854351)}; XYZSolid=XYZSolid: {planetmodel=PlanetModel.SPHERE, isWholeWorld=false, minXplane=[A=1.0, B=0.0, C=0.0, D=-0.9999955669921241, side=1.0], maxXplane=[A=1.0, B=0.0, C=0.0, D=-0.9999967200767939, side=-1.0], minYplane=[A=0.0, B=1.0, C=0.0, D=0.0019379945667919352, side=1.0], maxYplane=[A=0.0, B=1.0, C=0.0, D=0.0015216289462746052, side=-1.0], minZplane=[A=0.0, B=0.0, C=1.0, D=0.0023708955797907497, side=1.0], maxZplane=[A=0.0, B=0.0, C=1.0, D=0.0019545303111802707, side=-1.0]}
  [beaster]   2>        at __randomizedtesting.SeedInfo.seed([485BDCE0789B5CDC]:0)
  [beaster]   2>        at org.apache.lucene.bkdtree3d.PointInGeo3DShapeQuery$1.scorer(PointInGeo3DShapeQuery.java:105)
  [beaster]   2>        at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:589)
  [beaster]   2>        at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
  [beaster]   2>        at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
  [beaster]   2>        at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
  [beaster]   2>        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
  [beaster]   2>        at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:92)
  [beaster]   2>        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:425)
  [beaster]   2>        at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:587)
  [beaster]   2>        at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)
  [beaster]   2>
  [beaster]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeo3DPointField -Dtests.method=testRandomMedium -Dtests.seed=485BDCE0789B5CDC -Dtests.slow=true -Dtests.locale=pl_PL -Dtests.timezone=Africa/Tripoli -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
  [beaster] [09:48:13.669] ERROR   11.3s J0 | TestGeo3DPointField.testRandomMedium {#0 seed=[485BDCE0789B5CDC:F585EB4839FE3FBA]} <<<
  [beaster]    > Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=17, name=T0, state=RUNNABLE, group=TGRP-TestGeo3DPointField]
  [beaster]    >        at __randomizedtesting.SeedInfo.seed([485BDCE0789B5CDC:F585EB4839FE3FBA]:0)
  [beaster]    > Caused by: java.lang.AssertionError: expected WITHIN (1) or OVERLAPS (2) but got 3; shape=GeoCircle: {planetmodel=PlanetModel.SPHERE, center=[lat=-0.0021627146783861745, lon=-0.0017298167021592304], radius=2.0818312293195752E-4(0.011928014309854351)}; XYZSolid=XYZSolid: {planetmodel=PlanetModel.SPHERE, isWholeWorld=false, minXplane=[A=1.0, B=0.0, C=0.0, D=-0.9999955669921241, side=1.0], maxXplane=[A=1.0, B=0.0, C=0.0, D=-0.9999967200767939, side=-1.0], minYplane=[A=0.0, B=1.0, C=0.0, D=0.0019379945667919352, side=1.0], maxYplane=[A=0.0, B=1.0, C=0.0, D=0.0015216289462746052, side=-1.0], minZplane=[A=0.0, B=0.0, C=1.0, D=0.0023708955797907497, side=1.0], maxZplane=[A=0.0, B=0.0, C=1.0, D=0.0019545303111802707, side=-1.0]}
  [beaster]    >        at __randomizedtesting.SeedInfo.seed([485BDCE0789B5CDC]:0)
  [beaster]    >        at org.apache.lucene.bkdtree3d.PointInGeo3DShapeQuery$1.scorer(PointInGeo3DShapeQuery.java:105)
  [beaster]    >        at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:589)
  [beaster]    >        at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
  [beaster]    >        at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
  [beaster]    >        at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
  [beaster]    >        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
  [beaster]    >        at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:92)
  [beaster]    >        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:425)
  [beaster]    >        at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:587)
  [beaster]    >        at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)
  [beaster]   2> NOTE: test params are: codec=Asserting(Lucene53): {}, docValues:{}, sim=RandomSimilarityProvider(queryNorm=false,coord=no): {}, locale=pl_PL, timezone=Africa/Tripoli
  [beaster]   2> NOTE: Windows 7 6.1 amd64/Oracle Corporation 1.8.0_05 (64-bit)/cpus=4,threads=1,free=154135784,total=308805632
  [beaster]   2> NOTE: All tests run in this JVM: [TestGeo3DPointField]
  [beaster]
  [beaster] Tests with failures:
  [beaster]   - org.apache.lucene.bkdtree3d.TestGeo3DPointField.testRandomMedium {#0 seed=[485BDCE0789B5CDC:F585EB4839FE3FBA]}
  [beaster]
  [beaster]

I will have a look at this as soon as possible.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Coding this up as its own explicit test yields the following:

   [junit4]   2> Shape edge points:
   [junit4]   2>  Point 0.9999965937741284 -0.0017298125353661473 -0.001954530310113696: isWithin? false; minx=1.0267820043097231E-6, maxx=-1.2630266543744995E-7, miny=2.0818203142578787E-4, maxy=-2.0818358909154206E-4, minz=4.1636526967705383E-4, maxz=1.0665747798843661E-12

So the bounds object that's computed fails to contain the edgepoint for the circle, by a very small amount (1.0665747798843661E-12). That's 7% larger than the MINIMUM_RESOLUTION value.

I'm going to look first at whether there are any ways I can think of to reduce the error. First I'll see how far the edgepoint is from the circle plane. Presuming that's not the source of most of the error, then the next step would be to look at the bounds computation itself for error reduction. If all else fails, then MINIMUM_RESOLUTION will have to grow by 7%.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

   [junit4]   2> Distance to circle plane = 2.220446049250313E-16

... which is 4 orders of magnitude less than MINIMUM_RESOLUTION. Clearly not the problem. Looking at getBounds() next...

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

getBounds() also seems to have low error when computing the Z bound:

   [junit4]   2> this.evaluate(point)=1.1102230246251565E-16; normalizedZPlane.evaluate(point)=0.0
   [junit4]   2> this.evaluate(point)=0.0; normalizedZPlane.evaluate(point)=2.1684043449710089E-19

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

I wound up discovering that the delta between a shape and the XYZBounds that might be returned for a shape was more than 10x 1.5e-12. That's not a number I can increase MINIMUM_RESOLUTION to, unfortunately.

The cause of the delta is that very same case of a very small circle on or about z=0. The delta between the geocircle's plane and the maxz value may only be 1e-16, but if the circle is small enough that delta translates to a delta in Z of 5e-11 or so, which is way outside the MINIMUM_RESOLUTION in z.

The solution I'm exploring now is to simply add a "fudge factor" to all bounds values. This fudge factor is designed to cover any deltas due to error values being magnified in this way. So far (beasting round 20) it seems to be working. I may also reduce MINIMUM_RESOLUTION to a lower value if this seems to be effective for all of our test cases.
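The fudge-factor idea can be sketched roughly as follows, assuming bounds are accumulated one coordinate at a time. The class is a hypothetical single-axis stand-in for the bounds object, and the FUDGE_FACTOR value is invented for illustration, not the value used in the patch.

```java
// Hypothetical sketch; the real bounds class and FUDGE_FACTOR value may differ.
final class FudgedZBounds {
  static final double FUDGE_FACTOR = 1e-10; // illustrative value only

  private double minZ = Double.POSITIVE_INFINITY;
  private double maxZ = Double.NEGATIVE_INFINITY;

  /** Widens the recorded range by the fudge factor on both sides of z. */
  void addZ(double z) {
    minZ = Math.min(minZ, z - FUDGE_FACTOR);
    maxZ = Math.max(maxZ, z + FUDGE_FACTOR);
  }

  /** True when z falls inside the widened range. */
  boolean containsZ(double z) {
    return z >= minZ && z <= maxZ;
  }
}
```

With this padding, a shape edge point that overshoots the computed bound by around 5e-11 is still contained, at the cost of slightly looser bounds.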

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@mikemccand: Here's a patch that allows beasting to succeed.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Thanks @DaddyWri I committed that last patch...

> Hmm, I don't see that in my workarea (before synching anyway). Let me dig.

Ugh sorry, when I did an "ant clean" then this test stopped failing ...

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

206 beast iters then I hit:

[junit4:pickseed] Seed property 'tests.seed' already defined: DDC21670DAEA1F6B
   [junit4] <JUnit4> says ciao! Master seed: DDC21670DAEA1F6B
   [junit4] Executing 1 suite with 1 JVM.
   [junit4] 
   [junit4] Started J0 PID(16168@localhost).
   [junit4] Suite: org.apache.lucene.bkdtree3d.TestGeo3DPointField
   [junit4]   2> ago 24, 2015 8:15:51 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
   [junit4]   2> WARNING: Uncaught exception in thread: Thread[T1,5,TGRP-TestGeo3DPointField]
   [junit4]   2> java.lang.AssertionError: expected WITHIN (1) or OVERLAPS (2) but got 0; shape=GeoCircle: {planetmodel=PlanetModel.SPHERE, center=[lat=-0.004431288600558495, lon=-0.003687846671278374], radius=1.704543429364245E-8(9.7663144499327E-7)}; XYZSolid=XYZSolid: {planetmodel=PlanetModel.SPHERE, isWholeWorld=false, minXplane=[A=1.0, B=0.0, C=0.0, D=-0.9999833816746712, side=1.0], maxXplane=[A=1.0, B=0.0, C=0.0, D=-0.9999833819746712, side=-1.0], minYplane=[A=0.0, B=1.0, C=0.0, D=0.00368780225430909, side=1.0], maxYplane=[A=0.0, B=1.0, C=0.0, D=0.00368780195430909, side=-1.0], minZplane=[A=0.0, B=0.0, C=1.0, D=0.004431274248206893, side=1.0], maxZplane=[A=0.0, B=0.0, C=1.0, D=0.004431273948206893, side=-1.0]}
   [junit4]   2>    at __randomizedtesting.SeedInfo.seed([DDC21670DAEA1F6B]:0)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.PointInGeo3DShapeQuery$1.scorer(PointInGeo3DShapeQuery.java:105)
   [junit4]   2>    at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:581)
   [junit4]   2>    at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
   [junit4]   2>    at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
   [junit4]   2>    at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
   [junit4]   2>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
   [junit4]   2>    at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:92)
   [junit4]   2>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:425)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:587)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)
   [junit4]   2> 
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeo3DPointField -Dtests.method=testRandomMedium -Dtests.seed=DDC21670DAEA1F6B -Dtests.slow=true -Dtests.linedocsfile=/lucenedata/hudson.enwiki.random.lines.txt.fixed -Dtests.locale=it_CH -Dtests.timezone=Africa/Blantyre -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   1.08s | TestGeo3DPointField.testRandomMedium <<<
   [junit4]    > Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=16, name=T1, state=RUNNABLE, group=TGRP-TestGeo3DPointField]

Looks like a minuscule radius?

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Yup, when the radius gets that small, the error gets magnified enormously. Essentially it becomes infinite when the radius becomes zero. ;-) But practically speaking, anything less than MINIMUM_RESOLUTION will be rejected out of hand. So probably the right thing to do is just to multiply the fudge factor by an additional factor of 10. I'll also add another test.
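One way to see why the error blows up is a first-order model on a unit sphere. This is my own rough model, not something taken from the patch. A circle of angular radius r lies in a plane at distance cos(r) from the origin and has in-plane radius sin(r). Shifting that plane by a small delta changes the in-plane radius by roughly delta * cos(r) / sin(r), and this factor grows without bound as r approaches zero. The class and method names are hypothetical.

```java
// Hypothetical first-order model of error magnification for small circles on a unit sphere.
final class SmallCircleAmplification {

  /**
   * Approximate factor by which an error in the circle plane's offset is
   * magnified into an error in the circle's in-plane radius: cos(r) / sin(r).
   */
  static double amplification(double angularRadius) {
    return Math.cos(angularRadius) / Math.sin(angularRadius);
  }

  public static void main(String[] args) {
    // Radii in radians, taken from the two failures reported in this thread.
    System.out.println(amplification(2.0818312293195752E-4));
    System.out.println(amplification(1.704543429364245E-8));
  }
}
```

Under this model the second radius magnifies plane error several thousand times more than the first, which is consistent with the need for a much larger fudge factor.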

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@mikemccand: This patch increases the fudge factor still more, to account for an even more ridiculously small radius.

I'm hoping this does it.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

OK after a long time beasting, I hit this:

   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeo3DPointField -Dtests.method=testRandomMedium -Dtests.seed=E5F94C1E10DF27A2 -Dtests.multiplier=5 -Dtests.locale=fr_FR -Dtests.timezone=Africa/Djibouti -Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   8.48s | TestGeo3DPointField.testRandomMedium <<<
   [junit4]    > Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=15, name=T1, state=RUNNABLE, group=TGRP-TestGeo3DPointField]
   [junit4]    >    at __randomizedtesting.SeedInfo.seed([E5F94C1E10DF27A2:58277BB651BA44C4]:0)
   [junit4]    > Caused by: java.lang.RuntimeException: java.lang.RuntimeException: FAILED
   [junit4]    >    at __randomizedtesting.SeedInfo.seed([E5F94C1E10DF27A2]:0)
   [junit4]    >    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:524)
   [junit4]    > Caused by: java.lang.RuntimeException: FAILED
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.addAll(BKD3DTreeReader.java:159)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:205)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:307)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:321)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:321)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:307)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:297)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:331)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:282)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:297)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:331)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:115)
   [junit4]    >    at org.apache.lucene.bkdtree3d.PointInGeo3DShapeQuery$1.scorer(PointInGeo3DShapeQuery.java:114)
   [junit4]    >    at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:581)
   [junit4]    >    at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
   [junit4]    >    at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
   [junit4]    >    at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
   [junit4]    >    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
   [junit4]    >    at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:92)
   [junit4]    >    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:425)
   [junit4]    >    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:587)
   [junit4]    >    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene53): {}, docValues:{}, sim=RandomSimilarityProvider(queryNorm=false,coord=crazy): {}, locale=fr_FR, timezone=Africa/Djibouti
   [junit4]   2> NOTE: Linux 3.13.0-46-generic amd64/Oracle Corporation 1.8.0_40 (64-bit)/cpus=8,threads=1,free=408338576,total=455081984
   [junit4]   2> NOTE: All tests run in this JVM: [TestGeo3DPointField]

It's a case where we checked up above that a BKD cell was fully contained in the shape, but then we assert every point we see inside that cell is also within the shape, and that failed...
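The invariant being asserted can be sketched like this, assuming the points of a cell are available as a list and the shape exposes a membership predicate. The names are hypothetical, and the real BKD3DTreeReader.addAll works on encoded doc values rather than a list of arrays.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of the addAll invariant; not the real BKD3DTreeReader code.
final class AddAllInvariant {

  /**
   * Accepts every point of a cell that was reported as fully inside the shape,
   * and fails loudly if any point contradicts that report.
   */
  static int addAll(List<double[]> cellPoints, Predicate<double[]> shapeContains) {
    int accepted = 0;
    for (double[] point : cellPoints) {
      if (!shapeContains.test(point)) {
        throw new IllegalStateException(
            "cell reported inside shape, but point is not: " + Arrays.toString(point));
      }
      accepted++;
    }
    return accepted;
  }
}
```

The failure above corresponds to the exception branch: the cell-level relationship and the per-point membership test disagree for at least one point.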

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

The solution to this problem is to reduce MINIMUM_RESOLUTION. But I'd like to code up the specific case in order to be sure I catch it. Would you be able to find details, as you did last time?

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@mikemccand: Here's a patch that addresses the latest failure, by halving MINIMUM_RESOLUTION and doubling FUDGE_FACTOR.

I don't have a standalone test for this case because I'm not quite sure of the best way to extract it from the failure.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Another failure, where the cell is within the shape, so BKD tree recurses into addAll, yet a doc within the cell is not within the shape:

   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeo3DPointField -Dtests.method=testRandomMedium -Dtests.seed=D75138C6C25D1BCF -Dtests.multiplier=10 -Dtests.slow=true -Dtests.locale=de_GR -Dtests.timezone=America/Managua -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
   [junit4] ERROR   9.88s | TestGeo3DPointField.testRandomMedium <<<
   [junit4]    > Throwable #1: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=28, name=T0, state=RUNNABLE, group=TGRP-TestGeo3DPointField]
   [junit4]    >    at __randomizedtesting.SeedInfo.seed([D75138C6C25D1BCF:6A8F0F6E833878A9]:0)
   [junit4]    > Caused by: java.lang.RuntimeException: java.lang.RuntimeException: FAILED
   [junit4]    >    at __randomizedtesting.SeedInfo.seed([D75138C6C25D1BCF]:0)
   [junit4]    >    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:524)
   [junit4]    > Caused by: java.lang.RuntimeException: FAILED
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.addAll(BKD3DTreeReader.java:159)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:203)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:329)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:295)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:305)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:329)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:319)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:295)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:280)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:270)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:270)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:295)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:319)
   [junit4]    >    at org.apache.lucene.bkdtree3d.BKD3DTreeReader.intersect(BKD3DTreeReader.java:115)
   [junit4]    >    at org.apache.lucene.bkdtree3d.PointInGeo3DShapeQuery$1.scorer(PointInGeo3DShapeQuery.java:114)
   [junit4]    >    at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:589)
   [junit4]    >    at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
   [junit4]    >    at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
   [junit4]    >    at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
   [junit4]    >    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
   [junit4]    >    at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:92)
   [junit4]    >    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:425)
   [junit4]    >    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:587)
   [junit4]    >    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene53): {}, docValues:{}, sim=RandomSimilarityProvider(queryNorm=true,coord=crazy): {}, locale=de_GR, timezone=America/Managua
   [junit4]   2> NOTE: Linux 3.19.0-21-generic amd64/Oracle Corporation 1.8.0_51 (64-bit)/cpus=72,threads=1,free=426307304,total=504889344
   [junit4]   2> NOTE: All tests run in this JVM: [TestGeo3DPointField]

I added verbosity and extracted the details:

Here's the query shape:

   [junit4]   2> Thread[T0,5,TGRP-TestGeo3DPointField]: TEST: iter=64 shape=GeoCircle: {planetmodel=PlanetModel.WGS84, center=[lat=-7.573175600018171E-4, lon=-0.001184769535031697], radius=0.007585721238160122(0.4346298115093282)}

BKD switched to addAll when this cell was contained inside the shape:

   [junit4]   1> Thread[T0,5,TGRP-TestGeo3DPointField]: switch to addAll at cell x=1.0010740213026637 to 1.0010824106377934 y=-0.007656353133570567 to -0.007315722331086044 z=-0.0047688666958216885 to -0.0042476080955227875

But then this doc (which is within the cell) is supposedly not within the shape:

   [junit4]   1> T0:  accept docID=71226 point: x=1.0010781049211872 y=-0.007656353133570567 z=-0.0047688666958216885
   [junit4]   1> 
   [junit4]   1> T0: FAILED: docID=71226

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

This one is different. Basically, this time the XYZSolid is definitely within the GeoCircle. So the membership operation between the point and the circle is failing but should not be. Evaluating the circle plane at the point yields: 1.1158007851008733E-10, which is indeed outside the shape value. But, the point is clearly not quite on the surface, so that can happen.

So here we have a case where the packing resolution is definitely causing the assertion failure.

I think the better behavior might be to simply disable the assert. This will mean that points that are technically outside the shape will still get returned once in a while, but I imagine that this would basically just wind up making the boundary a bit fuzzy. I could add a method that would check for membership at a lower resolution, but that's a fair bit of work just to support the one assertion.

Thoughts?
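For illustration only, a lower-resolution membership check could look roughly like the sketch below. It assumes a sign convention where a plane evaluation at or below zero means "inside", and the class name, method names, and the tolerance passed in are all hypothetical; none of this is taken from geo3d itself.

```java
/** Hypothetical sketch of a tolerance-based membership check; not geo3d code. */
final class FuzzyPlaneMembership {

  /** Evaluates the plane equation a*x + b*y + c*z + d at a point. */
  static double evaluate(double a, double b, double c, double d,
                         double x, double y, double z) {
    return a * x + b * y + c * z + d;
  }

  /**
   * Assumed convention: a value <= 0 is inside the shape. The tolerance
   * widens the boundary so that points pushed slightly off the surface by
   * quantization are still accepted.
   */
  static boolean isWithin(double planeValue, double tolerance) {
    return planeValue <= tolerance;
  }
}
```

With the evaluation reported above (1.1158007851008733E-10), a strict check with tolerance 0.0 rejects the point, while any tolerance above that value accepts it. Picking a defensible tolerance is the hard part and is not addressed here.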

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

OK I'll disable this assert: it is too anal.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Oh wait, we need to do more than simply disable the assert, because the test will still fail, just a bit later when it verifies all hits (the assert was just "early detection"):

   [junit4] Suite: org.apache.lucene.bkdtree3d.TestGeo3DPointField
   [junit4]   2> Aug 26, 2015 8:48:39 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
   [junit4]   2> WARNING: Uncaught exception in thread: Thread[T0,5,TGRP-TestGeo3DPointField]
   [junit4]   2> java.lang.AssertionError: T0: iter=63 id=71226 docID=71226 lat=-0.004763555725376775 lon=-0.0076479587074575126 expected false but got: true deleted?=false
   [junit4]   2>   point1=[lat=-0.004763555725376775, lon=-0.0076479587074575126], iswithin=true
   [junit4]   2>   point2=[X=1.0010781049211872, Y=-0.007656353133570567, Z=-0.0047688666958216885], iswithin=false
   [junit4]   2>   query=PointInGeo3DShapeQuery: field=point:PlanetModel: PlanetModel.WGS84 Shape: GeoCircle: {planetmodel=PlanetModel.WGS84, center=[lat=-7.573175600018171E-4, lon=-0.001184769535031697], radius=0.007585721238160122(0.4346298115093282)}
   [junit4]   2>    at __randomizedtesting.SeedInfo.seed([D75138C6C25D1BCF]:0)
   [junit4]   2>    at org.junit.Assert.fail(Assert.java:93)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:625)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)
   [junit4]   2> 
   [junit4]   2> Aug 26, 2015 8:48:40 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
   [junit4]   2> WARNING: Uncaught exception in thread: Thread[T2,5,TGRP-TestGeo3DPointField]
   [junit4]   2> java.lang.AssertionError: T2: iter=62 id=71226 docID=71226 lat=-0.004763555725376775 lon=-0.0076479587074575126 expected false but got: true deleted?=false
   [junit4]   2>   point1=[lat=-0.004763555725376775, lon=-0.0076479587074575126], iswithin=true
   [junit4]   2>   point2=[X=1.0010781049211872, Y=-0.007656353133570567, Z=-0.0047688666958216885], iswithin=false
   [junit4]   2>   query=PointInGeo3DShapeQuery: field=point:PlanetModel: PlanetModel.WGS84 Shape: GeoCircle: {planetmodel=PlanetModel.WGS84, center=[lat=-7.573175600018171E-4, lon=-0.001184769535031697], radius=0.007585721238160122(0.4346298115093282)}
   [junit4]   2>    at __randomizedtesting.SeedInfo.seed([D75138C6C25D1BCF]:0)
   [junit4]   2>    at org.junit.Assert.fail(Assert.java:93)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:625)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)
   [junit4]   2> 
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeo3DPointField -Dtests.method=testRandomMedium -Dtests.seed=D75138C6C25D1BCF -Dtests.multiplier=10 -Dtests.slow=true -Dtests.locale=de_GR -Dtests.timezone=America/Managua -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
   [junit4] ERROR   62.4s | TestGeo3DPointField.testRandomMedium <<<

So how to correspondingly fix the test? Right now, it (intentionally) quantizes the double x,y,z of the point to match what the doc values pack/unpack did ...

Maybe, we could just fix the test so that if isWithin differs between the quantized and unquantized x,y,z, we skip checking that hit?
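A minimal sketch of that rule, assuming a hypothetical `Shape` interface and a caller-supplied `quantize` function standing in for the doc values pack/unpack round trip. The names here are illustrative and are not the ones used in TestGeo3DPointField.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch: decide whether a hit near a shape boundary can be verified. */
final class QuantizedHitCheck {

  /** Stand-in for a geo3d shape; only the membership test matters here. */
  interface Shape {
    boolean isWithin(double x, double y, double z);
  }

  /** What the test should expect for one document. */
  enum Verdict { EXPECT_HIT, EXPECT_MISS, SKIP }

  static Verdict expected(Shape shape, DoubleUnaryOperator quantize,
                          double x, double y, double z) {
    boolean exact = shape.isWithin(x, y, z);
    boolean quantized = shape.isWithin(
        quantize.applyAsDouble(x),
        quantize.applyAsDouble(y),
        quantize.applyAsDouble(z));
    // If quantization flips the answer, the point sits too close to the
    // boundary for the test to have a reliable expectation.
    if (exact != quantized) {
      return Verdict.SKIP;
    }
    return exact ? Verdict.EXPECT_HIT : Verdict.EXPECT_MISS;
  }
}
```

The test loop would then assert only on `EXPECT_HIT` and `EXPECT_MISS` documents and ignore the `SKIP` ones.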

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@mikemccand Yes, that should do it. If you have both, anyways. ;-)

asfimport commented 9 years ago

David Smiley (@dsmiley) (migrated from JIRA)

Maybe, we could just fix the test so that if isWithin differs between the quantized and unquantized x,y,z, we skip checking that hit?

Yes; this is also the approach taken in RandomSpatialOpFuzzyPrefixTreeTest. The "fuzzy" in the name is there because of the issue being discussed.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

ok, I'm finally done travelling for the moment. @mikemccand, where do things stand? I notice that my latest patch didn't get committed, FWIW. Also, do you want me to implement your idea of having both isWithin's need to pass before the assert triggers?

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Ugh sorry I thought I had committed the latest patch ... I'll do that shortly.

And I'll also fix the test to skip checking a hit when the quantization changed the expected result...

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

OK I committed, beasted, hit this failure:

ant test  -Dtestcase=TestGeo3DPointField -Dtests.method=testRandomMedium* -Dtests.seed=E1D51F3E8B12E79D -Dtests.multiplier=10 -Dtests.iters=5 -Dtests.slow=true -Dtests.linedocsfile=/lucenedata/hudson.enwiki.random.lines.txt.fixed -Dtests.locale=pl -Dtests.timezone=America/Inuvik -Dtests.asserts=true -Dtests.file.encoding=UTF-8

[junit4:pickseed] Seed property 'tests.seed' already defined: E1D51F3E8B12E79D
   [junit4] <JUnit4> says 今日は! Master seed: E1D51F3E8B12E79D
   [junit4] Executing 1 suite with 1 JVM.
   [junit4] 
   [junit4] Started J0 PID(62227@localhost).
   [junit4] Suite: org.apache.lucene.bkdtree3d.TestGeo3DPointField
   [junit4] OK      70.2s | TestGeo3DPointField.testRandomMedium {#0 seed=[E1D51F3E8B12E79D:5C0B2896CA7784FB]}
   [junit4]   2> sie 26, 2015 4:11:15 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
   [junit4]   2> WARNING: Uncaught exception in thread: Thread[T0,5,TGRP-TestGeo3DPointField]
   [junit4]   2> java.lang.AssertionError: T0: iter=425 id=1237 docID=1237 lat=0.005231514023315527 lon=0.0034278119211296914 expected false but got: true deleted?=false
   [junit4]   2>   point1=[lat=0.005231514023315527, lon=0.0034278119211296914], iswithin=false
   [junit4]   2>   point2=[X=1.0010991445151618, Y=0.003431592678386528, Z=0.00523734247369568], iswithin=false
   [junit4]   2>   query=PointInGeo3DShapeQuery: field=point:PlanetModel: PlanetModel.WGS84 Shape: GeoCircle: {planetmodel=PlanetModel.WGS84, center=[lat=0.006204988457123483, lon=0.003379977917811208], radius=7.780831828380698E-4(0.04458088248672737)}
   [junit4]   2>    at __randomizedtesting.SeedInfo.seed([E1D51F3E8B12E79D]:0)
   [junit4]   2>    at org.junit.Assert.fail(Assert.java:93)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:632)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

So I tried to reproduce this in geo3d-land exclusively, and coded this:

    c = new GeoCircle(PlanetModel.WGS84,0.006204988457123483,0.003379977917811208,7.780831828380698E-4);
    p1 = new GeoPoint(PlanetModel.WGS84,0.005231514023315527,0.0034278119211296914);
    assertTrue(!c.isWithin(p1));
    xyzb = new XYZBounds();
    c.getBounds(xyzb);
    area = GeoAreaFactory.makeGeoArea(PlanetModel.WGS84, 
      xyzb.getMinimumX(), xyzb.getMaximumX(), xyzb.getMinimumY(), xyzb.getMaximumY(), xyzb.getMinimumZ(), xyzb.getMaximumZ());
    // Doesn't have to be true, but is...
    assertTrue(!area.isWithin(p1));

The exact point in question shows up as outside of even the bounds computed for the circle. So honestly I don't know how it wound up getting included? Unless, perhaps, the descent decisions were made based on the approximation?

Looking at the code to see how to delve deeper...

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Huh. Even odder, tried the "repro" line and got this:

ant test  -Dtestcase=TestGeo3DPointField -Dtests.method=testRandomMedium* -Dtests.seed=E1D51F3E8B12E79D -Dtests.multiplier=10 -Dtests.iters=5 -Dtests.slow=true -Dtests.linedocsfile=/lucenedata/hudson.enwiki.random.lines.txt.fixed -Dtests.locale=pl -Dtests.timezone=America/Inuvik -Dtests.asserts=true -Dtests.file.encoding=UTF-8

...

BUILD SUCCESSFUL
Total time: 13 minutes 14 seconds

So now I'm very puzzled, @mikemccand.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

I did a repeat run, being sure to "ant clean" first, and it still passed.

Hmm.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Grrr, also not consistently reproducing for me ... it does repro on one box but not another; odd. One difference is Java 1.8.0_51 vs 1.8.0_40 ...

I'll turn on the debugging prints and try to get more details about the BKD descent.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Ugh, I take that back: after ant clean I cannot reproduce the failure anymore ... I'll re-beast.

But one thing did occur to me: I think we may have an ob1 when we compute the cell that BKD asks geo3d to compare to the shape.

The cell is bounded by x/y/zMin,Max 32 bit values, but these are all inclusive, and because of the quantization, each of those values represents a range of values in 64 bit space, and so I think for the max values we need to do +1 before converting back to doubles?

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Here's a patch showing how I think we should fix the ob1 issue....

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Hi @mikemccand: Patch has the right idea but should deal with negative numbers reasonably also? Unless, of course, your mapping to signed doubles starts with an int value that is always positive. ;-)

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

@DaddyWri Ahh you're right, it must also subtract one from the mins when they are negative! I'll fix ...
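The off-by-one reasoning can be sketched as follows. This assumes the quantization truncates toward zero (a plain `(int)` cast), which may not match what the patch actually does; `QuantizedCellBounds`, `scale`, and the method names are hypothetical. Under that assumption, a non-negative encoded max must be widened by +1 and a non-positive encoded min by -1, so that the decoded cell covers every double that could have been encoded into it.

```java
/** Hypothetical sketch: widen an inclusive cell of quantized ints to the doubles it may contain. */
final class QuantizedCellBounds {

  /** Quantizes by truncating toward zero (an assumption of this sketch). */
  static int encode(double value, double scale) {
    return (int) (value / scale);
  }

  /** Lower edge of the range of doubles that can encode to minEnc. */
  static double lowerBound(int minEnc, double scale) {
    // Non-positive buckets extend away from zero, so step one bucket down.
    // long arithmetic avoids overflow at Integer.MIN_VALUE.
    long widened = minEnc <= 0 ? (long) minEnc - 1 : minEnc;
    return widened * scale;
  }

  /** Upper edge of the range of doubles that can encode to maxEnc. */
  static double upperBound(int maxEnc, double scale) {
    // Non-negative buckets extend away from zero, so step one bucket up.
    // long arithmetic avoids overflow at Integer.MAX_VALUE.
    long widened = maxEnc >= 0 ? (long) maxEnc + 1 : maxEnc;
    return widened * scale;
  }
}
```

For example, with `scale = 0.25` the value -0.6 encodes to -2; decoding -2 naively gives -0.5, which lies above the original value, while `lowerBound(-2, 0.25)` gives -0.75 and does cover it.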

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Here's a newish looking failure (repros even after ant clean with the previous patch I attached):

   [junit4] Started J0 PID(48425@localhost).
   [junit4] Suite: org.apache.lucene.bkdtree3d.TestGeo3DPointField
   [junit4]   2> août 27, 2015 4:16:47 AM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
   [junit4]   2> WARNING: Uncaught exception in thread: Thread[T0,5,TGRP-TestGeo3DPointField]
   [junit4]   2> java.lang.RuntimeException: java.lang.NullPointerException
   [junit4]   2>    at __randomizedtesting.SeedInfo.seed([9B02953CBA892483]:0)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:524)
   [junit4]   2> Caused by: java.lang.NullPointerException
   [junit4]   2>    at org.apache.lucene.geo3d.BaseXYZSolid.isWithin(BaseXYZSolid.java:82)
   [junit4]   2>    at org.apache.lucene.geo3d.BaseXYZSolid.isShapeInsideArea(BaseXYZSolid.java:111)
   [junit4]   2>    at org.apache.lucene.geo3d.XYZSolid.getRelationship(XYZSolid.java:267)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.PointInGeo3DShapeQuery$1.scorer(PointInGeo3DShapeQuery.java:105)
   [junit4]   2>    at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:589)
   [junit4]   2>    at org.apache.lucene.search.Weight.bulkScorer(Weight.java:135)
   [junit4]   2>    at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
   [junit4]   2>    at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:69)
   [junit4]   2>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
   [junit4]   2>    at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:92)
   [junit4]   2>    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:425)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4._run(TestGeo3DPointField.java:587)
   [junit4]   2>    at org.apache.lucene.bkdtree3d.TestGeo3DPointField$4.run(TestGeo3DPointField.java:521)
   [junit4]   2> 
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeo3DPointField -Dtests.method=testRandomTiny -Dtests.seed=9B02953CBA892483 -Dtests.multiplier=5 -Dtests.slow=true -Dtests.linedocsfile=/lucenedata/hudson.enwiki.random.lines.txt.fixed -Dtests.locale=fr_LU -Dtests.timezone=America/Juneau -Dtests.asserts=true -Dtests.file.encoding=UTF-8

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

This is a circle that did not initialize properly, probably due to edge effects once again. I bet the radius is minuscule. I'll chase it down.

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Yup, tiny:

   [junit4]    > Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Null path point! Shape = GeoCircle: {planetmodel=PlanetModel.SPHERE, center=[lat=-0.006204510641448213, lon=0.004660366014742108], radius=1.2622491508618621E-8(7.232154903835664E-7)}

It's still three orders of magnitude larger than the limit, but when things get that small the math gets less stable. I am tempted to catch these things during construction of the GeoCircle and throw an IllegalArgumentException whenever the math falls apart. It's unlikely that anyone in the real world, except for missile designers, would be interested in centimeter resolution anyway. ;-) What do you think?
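The fail-fast idea might look something like the sketch below. `ValidatedCircle` is a hypothetical stand-in, not the real GeoCircle, and the "edge point cannot be resolved" condition is deliberately simplified to a floating-point absorption check; the real constructor's math is different and is not reproduced here.

```java
/** Hypothetical sketch of fail-fast constructor validation; not the actual GeoCircle. */
final class ValidatedCircle {
  final double centerLat;
  final double centerLon;
  final double radius;

  ValidatedCircle(double centerLat, double centerLon, double radius) {
    // Written as !(radius > 0.0) so that NaN is rejected as well.
    if (!(radius > 0.0) || radius > Math.PI) {
      throw new IllegalArgumentException("radius out of bounds: " + radius);
    }
    // Simplified stand-in for "the math falls apart": the radius is too small
    // to move the center latitude at double precision, so no distinct edge
    // point can be computed.
    double edgeLat = centerLat + radius;
    if (edgeLat == centerLat) {
      throw new IllegalArgumentException(
          "cannot resolve an edge point for radius=" + radius + " at lat=" + centerLat);
    }
    this.centerLat = centerLat;
    this.centerLon = centerLon;
    this.radius = radius;
  }
}
```

A randomized test could then catch the IllegalArgumentException and simply draw another random shape instead of failing.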

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

+1

Let's hope missile designers don't try to use this :)

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

@mikemccand Here's a patch that throws IllegalArgumentException when we try to construct a GeoCircle but it cannot resolve what it needs. Test still fails but perhaps you can detect IllegalArgumentException and act accordingly?

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Thanks @DaddyWri, I'll commit this and fix the test ...

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

OK here's another attempt to deal with the quantization (properly handling negative numbers I think?) during BKD descent ...

asfimport commented 9 years ago

Karl Wright (@DaddyWri) (migrated from JIRA)

Looks ok, except this doesn't seem like it should be there anymore:

+                                             if (cellXMaxEnc < Integer.MAX_VALUE) {
+                                               cellXMaxEnc++;
+                                             }
+                                             if (cellYMaxEnc < Integer.MAX_VALUE) {
+                                               cellYMaxEnc++;
+                                             }
+                                             if (cellZMaxEnc < Integer.MAX_VALUE) {
+                                               cellZMaxEnc++;
+                                             }
+

It seems like the cell*MaxEnc and cell*MinEnc values are all positive integers?

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

Duh, I'll remove that old code, you're right.

It seems like the cell*MaxEnc and cell*MinEnc values are all positive integers?

Argh, you're right. x,y,z are always >= 0.0 right (and <= PlanetModel.getMaximumMagnitude())? So we are only using 31 bits now ... I don't like that. I'll fix the encoding to use all 32 bits.

asfimport commented 9 years ago

Michael McCandless (@mikemccand) (migrated from JIRA)

x,y,z are always >= 0.0 right

Wait, they are not :)

The planet models have the earth's center at the origin, so they span +/- 1.0 for the simple sphere (and a bit bigger for the squashed ellipsoid)?

So I think we are in fact using the full 32 bit space now, and the int values are sometimes negative...
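A small sketch makes the sign behavior concrete. It assumes a symmetric round-to-nearest mapping from [-maxMagnitude, +maxMagnitude] onto the signed int range; the class name and the exact rounding are assumptions for illustration and may differ from the encoding in the patch.

```java
/** Hypothetical sketch: map a signed coordinate onto the signed 32 bit int range. */
final class SignedCoordinateEncoding {

  static int encode(double value, double maxMagnitude) {
    // Written as !(... <= ...) so that NaN is rejected as well.
    if (!(Math.abs(value) <= maxMagnitude)) {
      throw new IllegalArgumentException(
          "value=" + value + " is outside +/-" + maxMagnitude);
    }
    // value / maxMagnitude lies in [-1, 1], so the rounded result always
    // fits in an int and keeps the sign of the input.
    return (int) Math.round(value / maxMagnitude * Integer.MAX_VALUE);
  }
}
```

Because coordinates on either side of the origin keep their sign, roughly half of the encoded values are negative, which is why the min side of a cell needs the same care as the max side.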