sudeep87 / uimafit

Automatically exported from code.google.com/p/uimafit
0 stars 0 forks source link

Add indexCovering() to replace ContainmentIndex #60

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Having just struggled with ContainmentIndex for a while here, I think he 
current class is really confusing. The association between the Type enum and 
what methods will actually work is not obvious. Only by reading the source 
could I figure out that DIRECT means that containedIn works, and REVERSE means 
containing works. But you don't get an error if you call one of these when you 
used the wrong enum value. Couple that with the fact that the type parameters 
are named S and U instead of something more informative like COVERING and 
COVERED, it was really hard to decipher how to use this code.

I think we should get rid of both the enum and the containedIn methods (which 
are redundant with JCasUtil.selectCovered). Or if you really think it's 
necessary to have an index for the containedIn relation, can we make two 
separate classes, e.g. CoveredAnnotationsIndex vs. CoveringAnnotationsIndex? 
They could obviously inherit all the shared code from a common superclass.

Original issue reported on code.google.com by steven.b...@gmail.com on 16 Mar 2011 at 5:16

GoogleCodeExporter commented 8 years ago
I'll think about it a bit. Right now I would say that splitting the stuff into 
two classes doesn't seem to bad an idea. It incurs additional overhead if you 
actually do need both directions though as two iterations would be required 
while with a single class only one iteration is required. 

I think having two distinct classes probably isn't necessary, because they 
would internally basically contain a map - instead there could be two different 
factory methods in (J)CasUtil. Maybe indexCovered(coveredType, coveringType) 
and indexCovering(coveringType, coveredType) returning an index object having a 
method select(instance).

Original comment by richard.eckart on 16 Mar 2011 at 5:27

GoogleCodeExporter commented 8 years ago
I like the indexCovered and indexCovering idea. That would probably solve my 
ordering confusion because select would be unambiguous.

I wonder if these should just return the Map objects they create rather than a 
custom index interface? To match the current behavior, we could create a simple 
subtype of LinkedHashMap that returns an empty collection for missing keys.

Original comment by steven.b...@gmail.com on 16 Mar 2011 at 6:21

GoogleCodeExporter commented 8 years ago
Usually I prefer encapsulation to extension. I'd prefer not subclassing 
LinkedHashMap. 
Do you require to have the full functionality of the Map interface? 
Would the map be immutable? 
If not, should modifications to the map be propagated into the CAS?

Original comment by richard.eckart on 16 Mar 2011 at 6:31

GoogleCodeExporter commented 8 years ago
Sure, you can delegate instead of extending - whichever is easier.

I'm mainly thinking that people are already familiar with the Map interface, 
and we're basically creating a Map-like object for them to use, so using the 
Map interface is natural. I could certainly imagine wanting to call keySet() or 
values().

I would assume the map is immutable, or if it's mutable, that the modifications 
would not be propagated into the CAS - just like I wouldn't expect modification 
of any other Map<Annotation, List<Annotation>> to modify the CAS (or for that 
matter, any of the List<Annotation>s returned by the other JCasUtil methods).

Original comment by steven.b...@gmail.com on 16 Mar 2011 at 7:36

GoogleCodeExporter commented 8 years ago
This gets me to wonder if we shouldn't make all collections returned by those 
methods also immutable via Collections.immutableCollection().

Original comment by richard.eckart on 16 Mar 2011 at 7:42

GoogleCodeExporter commented 8 years ago
That sounds sensible to me.

Original comment by steven.b...@gmail.com on 16 Mar 2011 at 7:52

GoogleCodeExporter commented 8 years ago
I've started looking into this and noticed now that ContainmentIndex contains 
some predicates as well. I'm not sure yet how to best keep this - have to think 
about that.

Original comment by richard.eckart on 16 Mar 2011 at 11:46

GoogleCodeExporter commented 8 years ago

Original comment by richard.eckart on 16 Mar 2011 at 11:55

GoogleCodeExporter commented 8 years ago
By predicates, you mean the remaining methods? I think they have pretty simple 
translations in terms of a Map<Annotation, Set<Annotation>>. For the one 
direction:

* containedIn(x)=Map.get(x)
* isContainedIn(x, y)=Map.get(x).contains(y)

And for the other direction:

* containing(x)=Map.get(x)
* isContainedInAny=!Map.get(x).isEmpty()

Note that we're currently missing some methods to make both directions equal: 
there's no isContaining(x, y) or containsAny(x).

Original comment by steven.b...@gmail.com on 17 Mar 2011 at 6:44

GoogleCodeExporter commented 8 years ago
I have added an indexCovering() returning a Map and added a test case.

Original comment by richard.eckart on 17 Mar 2011 at 6:05

GoogleCodeExporter commented 8 years ago
This issue is not yet closed because there is no indexCovered() yet.
Also probably the two methods missing in ContainmentIndex (per comment 9) 
should be implemented.

Original comment by richard.eckart on 18 Mar 2011 at 3:04

GoogleCodeExporter commented 8 years ago
I probably wouldn't bother implementing the two methods in comment 9 since the 
code to write them via a Map is so simple and self explanatory.

Original comment by steven.b...@gmail.com on 18 Mar 2011 at 10:51

GoogleCodeExporter commented 8 years ago
Fixed in rev. 608.

Original comment by richard.eckart on 2 May 2011 at 7:22