Closed GoogleCodeExporter closed 8 years ago
Presumably to distinguish between multiple annotators that all have, e.g. an
"outputDirectory" parameter? Moreover, your second version with the
"someParameter" literal looks error prone to me - you'll get a runtime error on
a typo instead of a compile time error like you would with PARAM_SOME_PARAMETER.
That said, ClearTK is littered with code that looks like:
@ConfigurationParameter
private boolean someParameter;
public static final String PARAM_SOME_PARAMETER = ConfigurationParameterFactory.createConfigurationParameterName(MyClass.class, "someParameter");
We struggled for a while to simplify that, but since the parameter name needs
to be available statically, there didn't seem to be any way to get to something
simpler like:
@ConfigurationParameterName
public static final String PARAM_SOME_PARAMETER;
Original comment by steven.b...@gmail.com
on 25 Mar 2011 at 11:32
Yes - the name collision is the main reason at this point. One should not
worry when, say, building an aggregate analysis engine whether or not two
parameters have the same name or not. By fully specifying the location of the
member variable that is annotated as a configuration parameter, they will never
collide by default.
Another reason that we had that is not nearly so important now is that when you
are wading through a sea of descriptor files it can be difficult to remember
where configuration parameters came from. This is especially true when you
dynamically load implementations of Initializable as we are fond of doing.
I also agree with Steve that it is not good practice to use literals as you did
in your wouldnt-it-be-nice example above. We are fairly strict about always
using a constant when referring to config param names.
Original comment by phi...@ogren.info
on 26 Mar 2011 at 3:21
I am aware that my example is prone to runtime errors due to a typo. Please
ignore this for the time being - not in all scenarios is it possible to access
constant strings.
The name collision argument seems a flawed to me. You seem to assume that a
pipeline contains only once instance of a particular component class. I do have
pipelines that include multiple instances of a single analysis engine class
with different configurations. With our without the class name prefix, there
would be collisions.
I do not understand how you can have collisions at all when components are
instantiated with e.g. createPrimitive(AE.class, "param", "value"). It is
completely clear that param belongs to AE and to no place else. Can you make an
example where a collision happens?
For Inizializables the situations is similar. You cannot include multiple
Initializables with different configurations. When I want to "forward" certain
parameters to a strategy plugin (read Initializable), I usually define a
parameter prefix in the host component, then strip the prefix off before
passing the parameter values on to the strategy plugin, e.g.:
String PARAM_NAMING_PREFIX = "naming.";
createPrimitive(XWriter.class,
XWriter.PARAM_NAMING_PREFIX+InitializableFileNamer.PARAM_EXT, ".txt");
So far I didn't have the need, but this could easily be extended to support
multiple strategy plugin instances:
createPrimitive(XWriter.class,
XWriter.PARAM_NAMING_PREFIX+"1."+InitializableFileNamer.PARAM_EXT, ".txt");
createPrimitive(XWriter.class,
XWriter.PARAM_NAMING_PREFIX+"2."+InitializableFileNamer.PARAM_EXT, ".rtf");
createPrimitive(XWriter.class,
XWriter.PARAM_NAMING_PREFIX+"3."+InitializableFileNamer.PARAM_EXT, ".doc");
Original comment by richard.eckart
on 26 Mar 2011 at 3:45
You may be right that it isn't so easy to have configuration parameters
collide. My initial frustration with unqualified parameter names was that I
could never remember where they came from (back when I used xml descriptor
files.)
I guess the only thing I can think of off the top of my head is if a single
instance of an AE instantiates two different implementations of Initializable
and they have the same parameter names, then that would be a collision. That
doesn't seem so unlikely. For example, I often create interfaces for my
feature extractors and load several up at a time. These things often look like
variations of each other and so it would be reasonable that they would have the
same names for some of their parameters (though I can't think of one off the
top of my head.)
Original comment by phi...@ogren.info
on 26 Mar 2011 at 4:21
So to summarize:
Fully qualified names could possibly avoid collisions in some extreme cases,
but none of us actually have code where they do.
Fully qualified names are easier to match to their source, though this is less
of a problem without XML descriptor files.
Non-qualified names allow you to write shorter code, but at the cost of getting
a runtime error instead of a compile time error when you make a typo.
Sounds like a wash to me. Are there other arguments for either direction?
Original comment by steven.b...@gmail.com
on 26 Mar 2011 at 8:38
Typos can be avoided even with non-qualified names if a parameter name constant
is defined.
Even if we serialize a XML descriptor from an aggregate we have build using
uimaFIT, the parameters and values are set locally in each component. We do not
support overriding parameters in an aggregate.
If you want to run UIMA from a scripting language, you're likely to accept the
potential typos for shorter code.
Other frameworks (e.g. Spring, Java Beans) always use the non-qualified name to
access fields. Thus, when exploring how to better integrate UIMA with
state-of-the-art DI frameworks, non-qualified seems to be the way to go.
If a component B extends a component A and both define the same parameter, then
the parameter definition in component B should override that in component A.
This is a bit difficult if the parameters are "disambiguated" be the declaring
class name.
Original comment by richard.eckart
on 26 Mar 2011 at 11:41
I don't think the where-does-this-parameter-come-from argument is completely
trivial. I have had a number of conversations with people that are excited
about using uimaFIT that are not going to do away with descriptor files because
they still have their place at deployment time. In fact, that is one of the
concerns I hear the most about uimaFIT is "can we still use descriptor files?"
As much as we dislike them they are here to stay and will be used widely by
uimaFIT users. So, imagine you are a user of a system developed using uimaFIT
that has generated descriptor files. I suppose if the annotated parameters
have a description, then that description will appear in the descriptor file
with the parameter definition. So, this removes the need to go looking for
where that parameter is defined in the class file. I just don't want to end up
where I was before where I had 8 parameters, a couple Initializables, and a
constant search to figure out where they came from so that I could remember
what they were for.
That said, I think Richard's point about expectations for DI frameworks is
valid. I'm not sure about the inheritance point. I don't understand why a
subclass component B would have to define the same parameter to override it.
It seems that if a config parameter member variable in component A is exposed
to component B, then component B can "override" it by just initializing it
differently. I'm not really in the habit of overriding member variables by
declaration. It seems like a bad idea - though I don't feel confident enough
about this to really feel strongly about it.
Is there any way we can have it both ways? Maybe we could introduce an element
in the ConfiguratonParameter that asks whether you want a short default name or
a long one? Of course, we still need to decide which is the default value. A
long name by default would maintain backwards compatibility and avoid a bunch
of busy work for updating ClearTK configuration parameter definitions.
Original comment by phi...@ogren.info
on 27 Mar 2011 at 1:41
I think issue #71 argues for having default names for ConfigurationParameter be
the simple name. It would be nice to have ExternalResource keys and
ConfigurationParameters named consistently. It doesn't necessarily have to be
so - but I think the two annotations should mimic each other where possible. I
am not prepared to argue that ExternalResource keys should default to fully
qualified member variable names!
So, I am somewhat leaning towards having a flag in ConfigParam that allows a
default name that is fully qualified whose value is false by default (even
though it is a painful change for ClearTK.)
Original comment by phi...@ogren.info
on 27 Mar 2011 at 3:13
I might be worth considering to rename the "key" parameter of the
@ExternalResource to "name". I initially didn't do that, because I stuck to the
UIMAs naming scheme.
Original comment by richard.eckart
on 27 Mar 2011 at 4:27
Original comment by richard.eckart
on 11 Apr 2011 at 5:01
Original comment by richard.eckart
on 17 Apr 2011 at 1:40
Original comment by richard.eckart
on 7 Jan 2013 at 4:51
Issue now as ASF: https://issues.apache.org/jira/browse/UIMA-2587
Original comment by richard.eckart
on 18 Jan 2013 at 5:50
Original comment by richard.eckart
on 4 Mar 2013 at 7:54
Original issue reported on code.google.com by
richard.eckart
on 25 Mar 2011 at 10:12