sudeep87 / uimafit

Automatically exported from code.google.com/p/uimafit

Support creating custom indexes #54

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Having looked into this idea, which was just raised in the users' group, I think it 
makes sense and is pretty straightforward to implement. I would add two new 
features to support indexes.

1) Add a new annotation @IndexDescription with a parameter "location" that can 
be used to point to a descriptor in the classpath. Contrary to my suggestion on 
the mailing list, I'd suggest supporting only classpath lookup (no 
file-system lookup), so the "classpath:" prefix in the location would be 
unnecessary.

   @IndexDescription(location="desc.indexes.MyIndex")
   class MyAE extends JCasAnnotator_ImplBase

2) Extend the automatic type system detection with automatic index description 
detection. Since index descriptions, like type descriptions, are ingredients that 
go into setting up CASes, this seems a reasonable extension.

I would also suggest implementing the same two measures for type priorities. 
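
To make the classpath-only lookup in 1) concrete, here is a minimal sketch. It assumes that a dotted name such as "desc.indexes.MyIndex" maps to the resource "desc/indexes/MyIndex.xml", in the style of UIMA's import-by-name convention. The class DescriptorNames and its methods are hypothetical and are not part of uimaFIT.

```java
import java.net.URL;

/** Hypothetical helper for illustration only; not a uimaFIT class. */
class DescriptorNames {

    /** Assumption: dots become path separators and ".xml" is appended. */
    static String toResourcePath(String descriptorName) {
        return descriptorName.replace('.', '/') + ".xml";
    }

    /** Classpath-only lookup; returns null when no such resource exists. */
    static URL resolve(String descriptorName, ClassLoader loader) {
        return loader.getResource(toResourcePath(descriptorName));
    }
}
```

Under that assumption, DescriptorNames.toResourcePath("desc.indexes.MyIndex") yields "desc/indexes/MyIndex.xml", and no "classpath:" prefix is needed because the class loader is the only place searched.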

Original issue reported on code.google.com by richard.eckart on 27 Jan 2011 at 8:03

GoogleCodeExporter commented 8 years ago

Original comment by richard.eckart on 27 Jan 2011 at 8:03

GoogleCodeExporter commented 8 years ago

Original comment by richard.eckart on 27 Jan 2011 at 8:04

GoogleCodeExporter commented 8 years ago
This all sounds perfectly reasonable to me.  +1

Original comment by phi...@ogren.info on 27 Jan 2011 at 11:55

GoogleCodeExporter commented 8 years ago

Original comment by richard.eckart on 17 Mar 2011 at 8:31

GoogleCodeExporter commented 8 years ago
After thinking a while about it, I've made up my mind to first implement 
annotations by which an index can be defined. The syntax and names reflect the 
involved UIMA classes:

    @FsIndexCollection(fsIndexes = {
            @FsIndex(label="index1", type=Token.class, kind=FsIndex.KIND_SORTED, keys = {
                @FsIndexKey(featureName="begin", comparator=FsIndexKey.REVERSE_STANDARD_COMPARE),
                @FsIndexKey(featureName="end", comparator=FsIndexKey.STANDARD_COMPARE)
            }),
            @FsIndex(label="index2", type=Sentence.class, kind=FsIndex.KIND_SET, keys = {
                @FsIndexKey(featureName="begin", comparator=FsIndexKey.STANDARD_COMPARE)
            })
    })

While implementing the test case, I noticed that in order to support custom 
indexes, we must not only be able to create index definitions and process 
them during descriptor creation, but we should probably also have some way to 
select the index to use when calling the select() methods. I'll open a separate 
issue for this, though.
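
As a rough illustration of how such annotations could be read back at descriptor-creation time, here is a self-contained sketch. The annotation types declared below are local stand-ins that only mirror the names used in the snippet above; their element types and constant values are assumptions, and Token, Sentence and MyAE are placeholder classes.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

class IndexAnnotationSketch {

    // Hypothetical stand-ins for the annotations discussed in this issue.
    @Retention(RetentionPolicy.RUNTIME)
    @interface FsIndexKey {
        int STANDARD_COMPARE = 0;          // assumed constant values
        int REVERSE_STANDARD_COMPARE = 1;
        String featureName();
        int comparator() default STANDARD_COMPARE;
    }

    @Retention(RetentionPolicy.RUNTIME)
    @interface FsIndex {
        String KIND_SORTED = "sorted";
        String KIND_SET = "set";
        String label();
        Class<?> type();
        String kind() default KIND_SORTED;
        FsIndexKey[] keys() default {};
    }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface FsIndexCollection {
        FsIndex[] fsIndexes() default {};
    }

    // Placeholder types standing in for JCas annotation classes.
    static class Token {}
    static class Sentence {}

    @FsIndexCollection(fsIndexes = {
        @FsIndex(label = "index1", type = Token.class, kind = FsIndex.KIND_SORTED, keys = {
            @FsIndexKey(featureName = "begin", comparator = FsIndexKey.REVERSE_STANDARD_COMPARE),
            @FsIndexKey(featureName = "end", comparator = FsIndexKey.STANDARD_COMPARE)
        }),
        @FsIndex(label = "index2", type = Sentence.class, kind = FsIndex.KIND_SET, keys = {
            @FsIndexKey(featureName = "begin", comparator = FsIndexKey.STANDARD_COMPARE)
        })
    })
    static class MyAE {}

    /** Renders each declared index as "label:type:kind key/comparator ...". */
    static List<String> describe(Class<?> component) {
        List<String> result = new ArrayList<>();
        FsIndexCollection collection = component.getAnnotation(FsIndexCollection.class);
        if (collection == null) {
            return result; // component declares no indexes
        }
        for (FsIndex index : collection.fsIndexes()) {
            StringBuilder sb = new StringBuilder();
            sb.append(index.label()).append(':')
              .append(index.type().getSimpleName()).append(':')
              .append(index.kind());
            for (FsIndexKey key : index.keys()) {
                sb.append(' ').append(key.featureName()).append('/').append(key.comparator());
            }
            result.add(sb.toString());
        }
        return result;
    }
}
```

A real implementation would build UIMA index descriptions instead of strings; the string form is used here only to keep the sketch free of external dependencies.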

Original comment by richard.eckart on 18 Mar 2011 at 12:38

GoogleCodeExporter commented 8 years ago
Implemented:

1) Ability to define indexes as annotations on a component. This can be 
helpful if a component, or a family of components inheriting from a common 
ancestor class, expects a certain index to be present.

2) Ability to scan for index definitions automatically (cf. 
http://code.google.com/p/uimafit/wiki/TypeDescriptorDetection). This is helpful 
when UIMA is embedded into a framework or application that provides types, 
indexes or type priorities.

3) Ability to easily load an index description from the classpath (like 
createTypeSystemDescription(String... descriptorNames))

4) Ability to easily load an index description from a URL (like 
createTypeSystemDescriptionFromPath(String... descriptorURIs))

5) Added a parameter to specify custom indexes in the lowest-level methods of 
AnalysisEngineFactory and CollectionReaderFactory. If no indexes are 
specified, uimaFIT tries to create them from the component annotations.

I think that the @IndexDescription() annotation initially suggested is not 
needed as that should be covered by 3, 4 and 5.
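
The precedence described in 5), combined with the inheritance case from 1), could look roughly like the sketch below. Everything in it is hypothetical: @DeclaresIndexes is a simplified stand-in annotation that carries only index labels, and treating a null argument as "no indexes specified" is an assumption, not documented uimaFIT behavior.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class IndexFallbackSketch {

    // Hypothetical, simplified stand-in for an index annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface DeclaresIndexes {
        String[] value();
    }

    @DeclaresIndexes({"index1", "index2"})
    static class BaseAE {}

    static class ChildAE extends BaseAE {}  // declares nothing itself

    static class PlainAE {}                 // no annotation anywhere

    /**
     * Returns the explicitly passed indexes if present (assumption: null
     * means "not specified"); otherwise walks up the class hierarchy and
     * uses the first annotation found.
     */
    static List<String> effectiveIndexes(List<String> explicit, Class<?> component) {
        if (explicit != null) {
            return explicit;
        }
        for (Class<?> c = component; c != null; c = c.getSuperclass()) {
            DeclaresIndexes declared = c.getDeclaredAnnotation(DeclaresIndexes.class);
            if (declared != null) {
                return Arrays.asList(declared.value());
            }
        }
        return Collections.emptyList();
    }
}
```

With these definitions, effectiveIndexes(null, ChildAE.class) picks up the labels declared on BaseAE, while passing an explicit list bypasses the annotations entirely.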

Original comment by richard.eckart on 18 Mar 2011 at 2:57

GoogleCodeExporter commented 8 years ago
See issue 65.

Original comment by richard.eckart on 18 Mar 2011 at 3:33