Closed GoogleCodeExporter closed 8 years ago
+1 for moving them into a common package. How about moving them into the
component package. Thinking a bit that seems the most logical place for me and
it is where they are mainly used.
As for renaming - I like to think that both are conceptually quite similar to
Spring's BeanPostProcessors. For that reason I would suggest to rename them to
ConfigurationParameterProcessor and ExternalResourceProcessor.
Original comment by richard.eckart
on 10 Jun 2010 at 4:28
I like processor better since its a nominalization. One reason I like an
"injection" package is because its the I in FIT. But I could go either way on
this one.
Original comment by pvogren@gmail.com
on 10 Jun 2010 at 4:43
I think its fine to leave dependency injection as a concept on the
methodological level and have "processor" as a name for a concept in the
concrete implementation. But I suppose I mainly like it because it makes it
easier to draw parallels to the nomenclature world set up in the Spring
framework - makes it easier to mentally map between the two.
Original comment by richard.eckart
on 10 Jun 2010 at 4:48
I have no such connection with Spring and absent that I think "processor" is a
pretty meaningless word. Again, I'm not against it and the working suggestions
(ConfigurationParameterProcessor and ExternalResourceProcessor) are much
clearer than we have now.
One reason I created this ticket is because Chris couldn't find the
functionality provided by them. I am concerned that if they are called
processor and mixed in with the factories, then they may be harder to find.
So, I would vote to put them in a separate package if we can come up with a
name for it ("injection"?) Another reason another package might be appropriate
is because I actually had a different intuition about which existing package
they might go in - component.
I will cc Chris and see if he has an opinion.
Original comment by pvogren@gmail.com
on 10 Jun 2010 at 7:51
I wouldn't put them with the factories either, but rather with the components
in the components package or possible in a processor sub-package.
We could also call them "initializers", but I am not sure if that gives a
conceptual clash with the new initializable interface - maybe not. After all,
the main use case for those classes is to be called from within a components
initialize() method.
Original comment by richard.eckart
on 10 Jun 2010 at 8:03
Thanks Philip. I think a combination of a bit more patience on my part and a
roadmap to uimaFIT would help make it easier to find things.
Regarding the name, I don't know that I have much to add that doesn't slide
down the path of defining what OO is, what it means to naming and why I loathe
classes full of static functions with names that end in "util", "manager" or
"processor". It shows a functional, not OO view.
I like that InitializeUtil is used to extend the JCasAnnotator_ImplBase.
(though I would prefer a different name since I don't always scan imports)
These two classes are about setting up the metadata for an annotator, and in my
mind, in an OO setting, I look for functions, packages and their names in the
taxonomy of things...of the data, rather than the functions. Often this means
extending existing classes and in Java we don't have either multiple or private
inheritance, so you create mixins like InitializeUtil. They're mostly
functions, so what are you gonna do? If you lengthened the name to answer the
question "Initialize *what*?" you might name it InitializeParametersUtil.
ExternalResourceUtil is vague on the functional
end....InitializeExternalResourceUtil?
from the code:
* Configure a component from the given context.
public static <T> void configure(UimaContext aContext, T object)
Why isn't configure() a function on the component T class?
Original comment by croeder6000
on 10 Jun 2010 at 8:38
Regarding the code snipped: that is because Java does support neither mixins
nor multiple inheritance. We'd have to re-implement a bunch of things that
happen on the way from the UIMA component interfaces way down to when they
become e.g. JCasAnnotator_ImplBase in order to have a common base class from
which we could inherit a configure() method.
The current approach resembles a minimal invasive scenario to bring in new
functionality - I believe none of us could come up with a more elegant
solution, so we arrived at the same idea when looking at the issue.
Taking on the issue from a slightly different point of view (I like to think of
it as the Spring point of view) is this: like Spring, UIMA is a component
framework. A UIMA component is a bit like a Java bean (used by Spring as
components), except that it cannot be configured using getters and setters (the
traditional attack vector for dependency injection), but needs to be configured
via the initialize() method due to how UIMA is constructed. This can be
compared to constructor-based dependency injection. Now, when Spring sets up a
context, it initializes the Java beans defined in the context. After they are
initialized, all registered BeanPostProcessors are applied to them in order to
do dependency injection, evaluate annotations, apply aspect and the like.
Transferring the concept to uimaFIT, we currently have two such processors -
InitializeUtil and ExternalResourceConfigurator- and we do not (like Spring)
provide a framework for adding more of them. So in this point of view the
BeanProcessors are a mechanism provided by the IOC container responsible for
the dependency injection.
In our case UIMA is actually the IOC container. However, it does not provide a
lifecycle callback mechanism which would allow us to hook into the component
creation lifecycle and execute our post processors after the component has been
instantiated - the only way to do this is to hook into the initialize() method
of components. This is why I have created the JCasAnnotator_ImplBase class and
friends. Since our processors are stateless, they are realized as static
methods. We could also have them as proper classes and call e.g. new
ExternalResourceProcessor().process(this). Mixing them into the base classes is
not possible in Java and in most cases not desirable, since we'd only add new
methods to the component classes that would never be used outside the
initialize() method - we'd only pollute the auto-completion feature of IDEs.
One of the charms of uimaFIT is that is full of static methods. It makes using
it very straight forward in the majority of cases. It kind of provide those
constructors, that should have already been provided by UIMA and constructors
are static - as well as factory methods may be. Static methods may also prove a
draw back once we wish to work more with the UimaContext class - I am not sure
of this yet. The new InitializableFactory may address this issue.
Maybe this clears up a bit why things are as they are. I suppose I would prefer
if we did not have only static methods, but rather a set of factories. But if'd
spread things across multiple factories - as we do now - we'd end up in a very
entangled place. We'd probably have to put everything into one factory. But
anyway, we can't really refactor everything to be non-static anymore. That
option is long gone.
Original comment by richard.eckart
on 10 Jun 2010 at 9:11
thanks for a clear explanation. As someone who spends as little as 1/3 of his
time on UIMA, having a good base class to start with make life simple. Much as
I'd like to, I don't have the time to read more UIMA and uimaFIT code. While I
see the value in XML descriptor files, I loathe writing them as much as Philip.
I'm working towards writing or finding an annotator base that lets me:
- use annotated parameters and not have to write code to "mine" the context to
initialize them as in our paper from last summer (done in
uimafit.component.JCasAnnotator_ImpleBase.initailize())
- use those same annotated parameters to have code that creates the XML
descriptor file. I create a main() in the annotator's code that creates the
file when run.
public void generateXMLDescriptorFile() throws IOException, SAXException {
String className = myClass.getName();
className = className.substring(className.lastIndexOf(".") + 1) + ".xml";
File file = new File(className);
AnalysisEngineDescription aed = getPrimitiveDescription();
aed.toXML(new java.io.FileWriter(file));
}
public AnalysisEngineDescription getPrimitiveDescription() {
AnalysisEngineDescription aed =null;
TypeSystemDescription typeSystem =
TypeSystemDescriptionFactory.createTypeSystemDescription(
"typeSystem.CCPTypeSystem");
ConfigurationData configData =
ConfigurationParameterFactory.createConfigurationData(myClass);
try {
aed = AnalysisEngineFactory.createPrimitiveDescription(
myClass, typeSystem, configData);
- create an annotator for use in a uimaFIT pipeline, getting it's default
values from the annotations and specific, overriding values from the function
call
AnalysisEngine createAnalysisEngineWithDefaultValues(
List<String> resourceNames,
List<String> resourceValues)
throws ResourceInitializationException, InvalidXMLException {
AnalysisEngineDescription aed =
getPrimitiveDescription();
addExternalResourcesToAED(aed, resourceNames, resourceValues);
AnalysisEngine testAE =
AnalysisEngineFactory.createPrimitive(aed);
return testAE;
- include resources in that initialization
void addExternalResourcesToAED(AnalysisEngineDescription aed,
List<String> resourceNames,
List<String> resourceValues)
throws InvalidXMLException {
assert(resourceNames.size() == resourceValues.size());
Collection<ExternalResourceDependency> deps = new ArrayList<ExternalResourceDependency>();
for (String s : resourceNames) {
ExternalResourceDependency localFileDependency =
ExternalResourceFactory.createExternalResourceDependency(
s, Resource.class, false);
deps.add(localFileDependency);
}
aed.setExternalResourceDependencies(deps.toArray(new ExternalResourceDependency[deps.size()]));
for (int i=0; i< resourceNames.size(); i++) {
ExternalResourceFactory.bindResource(aed, resourceNames.get(i), resourceValues.get(i));
}
}
Is there room for this much in uimafit.component.JCasAnnotator_ImpleBase?
Original comment by croeder6000
on 10 Jun 2010 at 10:31
It's unforunate that this message board does not support syntax highlightning.
I'll reply to this in parts.
--
- use those same annotated parameters to have code that creates the XML
descriptor file. I create a main() in the annotator's code that creates the
file when run.
public void generateXMLDescriptorFile() throws IOException, SAXException {
String className = myClass.getName();
className = className.substring(className.lastIndexOf(".") + 1) + ".xml";
File file = new File(className);
AnalysisEngineDescription aed = getPrimitiveDescription();
aed.toXML(new java.io.FileWriter(file));
}
public AnalysisEngineDescription getPrimitiveDescription() {
AnalysisEngineDescription aed =null;
TypeSystemDescription typeSystem =
TypeSystemDescriptionFactory.createTypeSystemDescription(
"typeSystem.CCPTypeSystem");
ConfigurationData configData =
ConfigurationParameterFactory.createConfigurationData(myClass);
try {
aed = AnalysisEngineFactory.createPrimitiveDescription(
myClass, typeSystem, configData);
--
You should use static imports such as:
import static ...uimafit.factory.AnalysisEngineFactory.*;
This will make the static methods "appear" immediately in your class - almost
if it was a mixin!
Your two methods can be shorter. I even wonder if you really want a method to
generate a descriptor file - maybe for properly closing the writer - not sure
if toXML does that.
public void generateXMLDescriptorFile(Class aClass) throws IOException,
SAXException {
aed.toXML(new java.io.FileWriter(aClass.getSimpleName()+".xml));
}
public AnalysisEngineDescription getPrimitiveDescription() {
TypeSystemDescription tsd = createTypeSystemDescription("typeSystem.CCPTypeSystem");
return createPrimitiveDescription(Annotator.class, tsd,
Annotator.PARAM_X, "value X",
Annotator.PARAM_Y, "value Y",
...);
}
Anyway, I would suggest to have that as a utility method and not inside every
component. The annotations in the component should contain all the necessary
code and all the default values to derive a default descriptor. So my
suggestion would be to write something like this when you need it - not inside
the Annotator:
Class aClass = Annotator.class;
createPrimitiveDescription(aClass,
createTypeSystemDescription("typeSystem.CCPTypeSystem")).
toXML(new FileWriter(aClass.getSimpleName()+".xml"));
Did I overlook anything?
Original comment by richard.eckart
on 10 Jun 2010 at 10:50
- create an annotator for use in a uimaFIT pipeline, getting it's default
values from the annotations and specific, overriding values from the function
call
AnalysisEngine createAnalysisEngineWithDefaultValues(
List<String> resourceNames,
List<String> resourceValues)
throws ResourceInitializationException, InvalidXMLException {
AnalysisEngineDescription aed =
getPrimitiveDescription();
addExternalResourcesToAED(aed, resourceNames, resourceValues);
AnalysisEngine testAE =
AnalysisEngineFactory.createPrimitive(aed);
return testAE;
--
It seems to me that your code is trying to work around the fact that
ExternalResourceFactory.bindResource() does not work unless you either prepare
the AED using ExternalResourceFactory.createExternalResourceDependency() OR the
component as @ExternalResource annotations. In your case, you don't seem to
have the annotations in the component. I think the best way would be to add
them - if the component source code is under your control. It should make your
code much simpler. When you get an AED from a properly annotated component
using e.g. createPrimitiveDescription() all you should need to do is call
bindResource() to get your external resources bound.
The code should look like this:
TypeSystemDescription tsd = ...;
AnalysisEngineDescription aed = createPrimitiveDescription(Annotator.class,
tsd, PARAMETERS...);
If you code your external resources and components in the right way, something
like this should do to bind an external resourcem e.g. a hypothetical
dictionary resource:
bindResource(aed, DictionaryResource.class, DictionaryResource.PARAM_FILE,
"path/to/some/file");
In your UIMA annotator component you'd have such lines:
@ExternalResource
DictionaryResource dictionary;
The more advanced functions of the ExternalResourceFactory and the
@ExternalResource annotations are only needed when you want to have multiple
resources of the same type in your component or if you need to work with UIMA
components that do not use the @ExternalResource annotation.
Original comment by richard.eckart
on 10 Jun 2010 at 11:10
All in all, I would put nothing of your suggested code into an annotator class
and also not into one or all of the base classes. The UIMA component classes
(readers, annotators) should contain no code at all, other than what is needed
for doing the actual processing.
Original comment by richard.eckart
on 10 Jun 2010 at 11:14
One more: if you are using Eclipse, I would like to point out a very useful
feature:
1) Open the preferences and navigate to Java/Editor/Content Assist/Favorites
2) press "new type" and browse to "org.uimafit.factory.AnalysisEngineFactory"
3) ok the browse and the new-type dialog
The static methods from the AnalysisEngineFactory will from now on be available
in the auto-completion feature (pressing CTRL-space in the source code editor).
You can add any classes with static methods here. You can also choose to add
individual methods using "new member". Eclipse will automatically generate the
static imports when you use the auto-complete feature.
This saves me a lot of writing and makes static imports really neat to use.
It's kind of a poor-mans workaround to mixins ;) I usually have a couple of
Apache commons classes and uimaFIT classes in that list.
Original comment by richard.eckart
on 10 Jun 2010 at 11:28
Interesting discussion - I don't have anything substantive to add. Chris, I
was only hoping for a little help with naming. It's too bad this discussion
will get buried in an unrelated issue - please consider posting to
uimafit-users list next time.
I like the suggestion of using "initializer". How about the following:
org.uimafit.component.initialize.ConfigurationParameterInitializer
org.uimafit.component.initialize.ExternalResourceInitializer
There may be a little overlap with the Initializable but I think confusion can
be avoided by adding a little more javadoc to that interface.
Original comment by pvogren@gmail.com
on 11 Jun 2010 at 4:34
Sounds good to me
Original comment by richard.eckart
on 11 Jun 2010 at 8:22
I made the name changes. I called the methods
initializeConfigurationParameters and initializeExternalResources. I also
cleaned up the javadoc for initializeConfigurationParameters which was a little
out of date.
Original comment by pvogren@gmail.com
on 11 Jun 2010 at 9:10
Original issue reported on code.google.com by
pvogren@gmail.com
on 10 Jun 2010 at 4:16