Anton87 / uimafit

Automatically exported from code.google.com/p/uimafit
0 stars 0 forks source link

@ConfigurationParameter - name should default to name of annotated field without class name #70

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Currently the name of an annotated field defaults to classname + "." + field. 
It should only default to "field".

My code is littered with lines like:

public static final String PARAM_SOME_PARAMETER = "someParameter";
@ConfigurationParameter(name = PARAM_SOME_PARAMETER)
private String someParameter;

It would be nice to be able write something like:

@ConfigurationParameter
private String someParameter;

AnalysisEngine ae = createPrimitive(MyAE.class, "someParameter", "someValue");

What is again the reason why the fully qualified name of the class is prefixed 
to the field?

Original issue reported on code.google.com by richard.eckart on 25 Mar 2011 at 10:12

GoogleCodeExporter commented 8 years ago
Presumably to distinguish between multiple annotators that all have, e.g. an 
"outputDirectory" parameter? Moreover, your second version with the 
"someParameter" literal looks error prone to me - you'll get a runtime error on 
a typo instead of a compile time error like you would with PARAM_SOME_PARAMETER.

That said, ClearTK is littered with code that looks like:

  @ConfigurationParameter
  private boolean someParameter;

  public static final String PARAM_SOME_PARAMETER = ConfigurationParameterFactory.createConfigurationParameterName(MyClass.class, "someParameter");

We struggled for a while to simplify that, but since the parameter name needs 
to be available statically, there didn't seem to be any way to get to something 
simpler like:

  @ConfigurationParameterName
  public static final String PARAM_SOME_PARAMETER;

Original comment by steven.b...@gmail.com on 25 Mar 2011 at 11:32

GoogleCodeExporter commented 8 years ago
Yes - the name collision is the main reason at this point.  One should not 
worry when, say, building an aggregate analysis engine whether or not two 
parameters have the same name or not.  By fully specifying the location of the 
member variable that is annotated as a configuration parameter, they will never 
collide by default.

Another reason that we had that is not nearly so important now is that when you 
are wading through a sea of descriptor files it can be difficult to remember 
where configuration parameters came from.  This is especially true when you 
dynamically load implementations of Initializable as we are fond of doing.  

I also agree with Steve that it is not good practice to use literals as you did 
in your wouldnt-it-be-nice example above.  We are fairly strict about always 
using a constant when referring to config param names.  

Original comment by phi...@ogren.info on 26 Mar 2011 at 3:21

GoogleCodeExporter commented 8 years ago
I am aware that my example is prone to runtime errors due to a typo. Please 
ignore this for the time being - not in all scenarios is it possible to access 
constant strings.

The name collision argument seems a flawed to me. You seem to assume that a 
pipeline contains only once instance of a particular component class. I do have 
pipelines that include multiple instances of a single analysis engine class 
with different configurations. With our without the class name prefix, there 
would be collisions.

I do not understand how you can have collisions at all when components are 
instantiated with e.g. createPrimitive(AE.class, "param", "value"). It is 
completely clear that param belongs to AE and to no place else. Can you make an 
example where a collision happens? 

For Inizializables the situations is similar. You cannot include multiple 
Initializables with different configurations. When I want to "forward" certain 
parameters to a strategy plugin (read Initializable), I usually define a 
parameter prefix in the host component, then strip the prefix off before 
passing the parameter values on to the strategy plugin, e.g.:

String PARAM_NAMING_PREFIX = "naming.";
createPrimitive(XWriter.class,  
XWriter.PARAM_NAMING_PREFIX+InitializableFileNamer.PARAM_EXT, ".txt");

So far I didn't have the need, but this could easily be extended to support 
multiple strategy plugin instances:

createPrimitive(XWriter.class,  
XWriter.PARAM_NAMING_PREFIX+"1."+InitializableFileNamer.PARAM_EXT, ".txt");
createPrimitive(XWriter.class,  
XWriter.PARAM_NAMING_PREFIX+"2."+InitializableFileNamer.PARAM_EXT, ".rtf");
createPrimitive(XWriter.class,  
XWriter.PARAM_NAMING_PREFIX+"3."+InitializableFileNamer.PARAM_EXT, ".doc");

Original comment by richard.eckart on 26 Mar 2011 at 3:45

GoogleCodeExporter commented 8 years ago
You may be right that it isn't so easy to have configuration parameters 
collide.  My initial frustration with unqualified parameter names was that I 
could never remember where they came from (back when I used xml descriptor 
files.)  

I guess the only thing I can think of off the top of my head is if a single 
instance of an AE instantiates two different implementations of Initializable 
and they have the same parameter names, then that would be a collision.  That 
doesn't seem so unlikely.  For example, I often create interfaces for my 
feature extractors and load several up at a time.  These things often look like 
variations of each other and so it would be reasonable that they would have the 
same names for some of their parameters (though I can't think of one off the 
top of my head.)  

Original comment by phi...@ogren.info on 26 Mar 2011 at 4:21

GoogleCodeExporter commented 8 years ago
So to summarize:

Fully qualified names could possibly avoid collisions in some extreme cases, 
but none of us actually have code where they do.

Fully qualified names are easier to match to their source, though this is less 
of a problem without XML descriptor files.

Non-qualified names allow you to write shorter code, but at the cost of getting 
a runtime error instead of a compile time error when you make a typo.

Sounds like a wash to me. Are there other arguments for either direction?

Original comment by steven.b...@gmail.com on 26 Mar 2011 at 8:38

GoogleCodeExporter commented 8 years ago
Typos can be avoided even with non-qualified names if a parameter name constant 
is defined.

Even if we serialize a XML descriptor from an aggregate we have build using 
uimaFIT, the parameters and values are set locally in each component. We do not 
support overriding parameters in an aggregate.

If you want to run UIMA from a scripting language, you're likely to accept the 
potential typos for shorter code.

Other frameworks (e.g. Spring, Java Beans) always use the non-qualified name to 
access fields. Thus, when exploring how to better integrate UIMA with 
state-of-the-art DI frameworks, non-qualified seems to be the way to go.

If a component B extends a component A and both define the same parameter, then 
the parameter definition in component B should override that in component A. 
This is a bit difficult if the parameters are "disambiguated" be the declaring 
class name.

Original comment by richard.eckart on 26 Mar 2011 at 11:41

GoogleCodeExporter commented 8 years ago
I don't think the where-does-this-parameter-come-from argument is completely 
trivial.  I have had a number of conversations with people that are excited 
about using uimaFIT that are not going to do away with descriptor files because 
they still have their place at deployment time.  In fact, that is one of the 
concerns I hear the most about uimaFIT is "can we still use descriptor files?"  
As much as we dislike them they are here to stay and will be used widely by 
uimaFIT users.  So, imagine you are a user of a system developed using uimaFIT 
that has generated descriptor files.  I suppose if the annotated parameters 
have a description, then that description will appear in the descriptor file 
with the parameter definition.  So, this removes the need to go looking for 
where that parameter is defined in the class file.  I just don't want to end up 
where I was before where I had 8 parameters, a couple Initializables, and a 
constant search to figure out where they came from so that I could remember 
what they were for.  

That said, I think Richard's point about expectations for DI frameworks is 
valid.  I'm not sure about the inheritance point.  I don't understand why a 
subclass component B would have to define the same parameter to override it.  
It seems that if a config parameter member variable in component A is exposed 
to component B, then component B can "override" it by just initializing it 
differently.  I'm not really in the habit of overriding member variables by 
declaration.  It seems like a bad idea - though I don't feel confident enough 
about this to really feel strongly about it.  

Is there any way we can have it both ways?  Maybe we could introduce an element 
in the ConfiguratonParameter that asks whether you want a short default name or 
a long one?  Of course, we still need to decide which is the default value.  A 
long name by default would maintain backwards compatibility and avoid a bunch 
of busy work for updating ClearTK configuration parameter definitions.  

Original comment by phi...@ogren.info on 27 Mar 2011 at 1:41

GoogleCodeExporter commented 8 years ago
I think issue #71 argues for having default names for ConfigurationParameter be 
the simple name.  It would be nice to have ExternalResource keys and 
ConfigurationParameters named consistently.  It doesn't necessarily have to be 
so - but I think the two annotations should mimic each other where possible.  I 
am not prepared to argue that ExternalResource keys should default to fully 
qualified member variable names!  

So, I am somewhat leaning towards having a flag in ConfigParam that allows a 
default name that is fully qualified whose value is false by default (even 
though it is a painful change for ClearTK.)  

Original comment by phi...@ogren.info on 27 Mar 2011 at 3:13

GoogleCodeExporter commented 8 years ago
I might be worth considering to rename the "key" parameter of the 
@ExternalResource to "name". I initially didn't do that, because I stuck to the 
UIMAs naming scheme.

Original comment by richard.eckart on 27 Mar 2011 at 4:27

GoogleCodeExporter commented 8 years ago

Original comment by richard.eckart on 11 Apr 2011 at 5:01

GoogleCodeExporter commented 8 years ago

Original comment by richard.eckart on 17 Apr 2011 at 1:40

GoogleCodeExporter commented 8 years ago

Original comment by richard.eckart on 7 Jan 2013 at 4:51

GoogleCodeExporter commented 8 years ago
Issue now as ASF: https://issues.apache.org/jira/browse/UIMA-2587

Original comment by richard.eckart on 18 Jan 2013 at 5:50

GoogleCodeExporter commented 8 years ago

Original comment by richard.eckart on 4 Mar 2013 at 7:54