google-code-export / uimafit

Automatically exported from code.google.com/p/uimafit

ResourceMetaData annotation #9

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Philip:
I would like to add the identification information from the descriptor to
either a new annotation or to AnalysisComponent.  I am curious about the
name for this annotation.  In the descriptor file the three items that you
can specify with this annotation are labeled "Runtime Information" in the
overview tab and are called "operationalProperties" in the xml.  This makes
me think that we should rename "AnalysisComponent" to
"AnalysisEngineOperationalProperties".  

As for name, vendor, description, and vendor information it seems
reasonable to put these in the same annotation or create a new annotation.
 The xml tag for these items is analysisEngineMetaData which would be a
suitable annotation name.  

Any thoughts on where these four items should go and what the name of
AnalysisComponent should be?

Richard:
I would prefer keeping the AnalysisComponent annotation as is or if need be
rename it OperationalProperties, but not AnalysisEngineOperationalProperties.

As for "documentation" annotations - I don't like them. I think documentation
should be done only in JavaDoc and not duplicated in annotations. I would
prefer if static descriptors (the ones you would ship e.g. with ClearTK) be
generated using a doclet which can access annotations and JavaDoc alike.

Philip:
You make a good point about documentation.  I don't have a lot of experience
with annotations - but this seems like a good principle to follow.  However,
generating descriptor files is *ridiculously* easy given a description object
because they all have a toXML() method.  I don't know how a doclet would fit
into this - but it sounds more complicated than calling aed.toXML().

I have been working (not very hard) on some uutuc examples that directly
mirror the examples in the uimaj-examples project for the tutorial.  I am
basically taking code right out of there and uutuc-ifying it.  I have been
comparing differences between the generated descriptor and the descriptors
they provide.  This is why I added a TypeCapability annotation this afternoon
and is why I would like to add identification information.  Next up is to
simplify adding resources - between what you've added and some code that
Chris gave me - I think this should be straightforward.

I find "AnalysisComponent" confusing and am not sure where it came from.  I
think OperationalProperties is a much better name.

Richard:
The name AnalysisComponent was an inspiration from JPA where entities are
annotated with the @Entity annotation - and I wanted an annotation that
identifies a class as a component in a similar manner. The operational
properties ended up in this annotation because they are actually properties
of the analysis component and each of them can only occur once per component.
So having them as attributes of the AnalysisComponent seemed straightforward
to me. I needed in particular the allowMultipleDeployment property to control
how annotators are deployed in CPEs.
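
To make the idea concrete, here is a minimal sketch of a class-level
annotation that carries operational properties and is read back via
reflection. It uses only the JDK. The annotation, its attribute names, the
defaults, and the two example classes are assumptions made for illustration;
they are not taken from the uimaFIT sources.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class OperationalPropertiesSketch {

    // Hypothetical annotation: attribute names and defaults are assumptions.
    // RUNTIME retention is needed so that reflection can see it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface OperationalProperties {
        boolean multipleDeploymentAllowed() default true;
        boolean modifiesCas() default true;
        boolean outputsNewCases() default false;
    }

    // A component that must not be deployed more than once (e.g. a writer).
    @OperationalProperties(multipleDeploymentAllowed = false)
    static class SingleInstanceWriter { }

    // A component without the annotation falls back to the assumed default.
    static class PlainAnnotator { }

    // Reads the property from the class; unannotated classes count as "allowed".
    static boolean allowsMultipleDeployment(Class<?> componentClass) {
        OperationalProperties props =
                componentClass.getAnnotation(OperationalProperties.class);
        return props == null || props.multipleDeploymentAllowed();
    }

    public static void main(String[] args) {
        System.out.println(allowsMultipleDeployment(SingleInstanceWriter.class));
        System.out.println(allowsMultipleDeployment(PlainAnnotator.class));
    }
}
```

Because each property is an attribute of a single annotation, it can occur at
most once per class, which matches the "once per component" constraint
described above.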

Richard:
We have only minor progress towards a doclet, but it's not off the table yet.
Nothing ready to release though. The idea is that the doclet is run as part
of a Maven build and generates descriptors including documentation, author
and version information. It is not meant to generate descriptors at runtime -
since obviously the source code would not be available at that time.

Codewise the doclet should be pretty much as straightforward as using UUTUC
as is. A descriptor would be generated using the known methods and then
augmented with documentation etc. from the JavaDoc and then toXML() would be
called. So much at least for the theory. It might be I am totally on the
wrong path ;)

I plan on doing a new release of our stuff in the foreseeable future and
consider getting back to the doclet for that. So far no date is set in stone
though.

Original issue reported on code.google.com by pvogren@gmail.com on 1 May 2010 at 3:05

GoogleCodeExporter commented 9 years ago

Original comment by pvogren@gmail.com on 1 May 2010 at 5:37

GoogleCodeExporter commented 9 years ago
Ok.  I think my brain was a little foggy when I wrote this up before.  I think 
the best thing to do is to change the name to OperationalParameters since that 
is the name of the interface that defines this information.  

see 
http://uima.apache.org/downloads/releaseDocs/2.3.0-incubating/docs/api/org/apache/uima/resource/metadata/OperationalProperties.html

It occurs to me just now that our annotations really ought to directly mirror 
the corresponding interfaces that they are "describing".  Our current naming is 
a bit haphazard and you can imagine that there could be many more annotations 
that might be required.  So, we should probably tighten this down a bit before 
it gets too confusing.  This makes me think that ExternalResource is also 
misnamed. I'm not sure if there is always going to be a clear parallel.  It 
might make sense to revisit the names of SofaCapability and TypeCapability.   

I know you like to keep the "documentation" information in the javadocs - but 
it just seems so straightforward to add a ResourceMetaData annotation 
corresponding to org.apache.uima.resource.metadata.ResourceMetaData and be done 
with it.  The notion of adding a doclet for these items just because we feel 
like "vendor" is "documentation" and e.g. "capabilities" is not so that we can 
preserve the "documentation only in javadocs" mandate seems a bit arbitrary.  
All this complaining is in part due to my total lack of experience with 
doclets.  However, the UIMA folks don't have any problem with putting all these 
data into a single descriptor file - so I don't see why we shouldn't treat them 
all consistently (i.e. by using annotations as we have been.)  

Original comment by pvogren@gmail.com on 11 Jun 2010 at 10:51

GoogleCodeExporter commented 9 years ago
I think you meant to say "OperationalProperties" not "OperationalParameters".

It may be a good idea to directly mirror the interfaces - but I feel it's not 
necessarily "nice". In that case, revisiting the names of SofaCapability 
and TypeCapability may not be enough - I suppose they'd have to be merged into a 
single annotation. I am not sure if directly mirroring the TypeOrFeature 
interfaces as an annotation is "nice". I guess I'd prefer having a Type and a 
Feature annotation - though that again would be against the principle. I'd also 
prefer to have the external resource annotation be named ExternalResource and 
not ExternalResourceDependency - the second is just more verbose, but does not 
add anything.

So I am a bit ambivalent about making it a principle to mirror the UIMA 
interfaces directly - I'd prefer a statement like "staying close to the UIMA 
interfaces but being open to find names or structures that work better as 
annotations."

As for the ResourceMetaData - there are some fields which I would consider to 
add as annotations and some that I would clearly realize as JavaDoc. E.g. all 
fields which basically contain natural language descriptions, e.g. description 
and copyright should be JavaDoc. Name, UUID and version may be good to have as 
annotations. Imagine if people really used an annotation to place the 
description or copyright/license of a component, possibly with embedded HTML or 
formatting or something like that - I think that would be horrible. JavaDoc has 
support for such things and JavaDoc comments can be handled by the IDE in such 
a way that they do not clutter the code too much.

I don't want to veto putting in description fields or a ResourceMetaData 
annotation into uimaFIT, but if asked, I would discourage using it for 
documentation purposes. I acknowledge that it may be a viable solution for you 
if you choose to live with possibly verbose or duplicate documentation 
cluttering your source files.

Original comment by richard.eckart on 13 Jun 2010 at 7:47

GoogleCodeExporter commented 9 years ago
Yeah - I meant "mirror the interfaces" to be a rule of thumb.  Switching 
AnalysisComponent to OperationalProperties seems like an easy one, while 
SofaCapability and TypeCapability are a little murky.  I'm not sure what a 
TypeOrFeature annotation would be used for - but I agree that it makes a 
terrible name.  

My main concern about the proposed ResourceMetaData annotation is that it will 
get deprecated by a better solution.  I have no problem with a doclet based 
solution if it doesn't complicate life for people and it reduces the potential 
for code clutter.  I am not likely to implement it because this is not 
something that I feel strongly about having.  Frankly, I'm not that concerned 
with having either solution but am willing to implement the ResourceMetaData 
annotation approach because it seems very straightforward to me and is 
consistent with how we've been doing things so far.  I'm not trying to be rude 
- it's just that "doclet" is one of those words that make my eyes glaze over 
(based on no real knowledge or even basic intuition about them.)  

I agree that putting the full text of e.g. the ASL or BSD into an element of an 
annotation would be pretty nasty.  We actually have a script that generates 
descriptor files that sets the description to our copyright and license 
statement.  So, it's in one spot and doesn't clutter any of the components.  I 
think this would be a reasonable workaround that we can document with the 
proposed ResourceMetaData annotation - i.e. there would be a description 
element, but it would not need to be considered the only way of setting the 
description.   
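
The workaround described above can be sketched roughly as follows: a metadata
annotation with a description element, plus a lookup that lets an externally
supplied description (for example from a descriptor-generating script) take
precedence. This is only a sketch using the JDK. The annotation's elements,
the example component, and the resolveDescription helper are hypothetical and
are not part of uimaFIT or UIMA.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class ResourceMetaDataSketch {

    // Hypothetical annotation; the element names are assumptions loosely
    // modelled on the identification fields discussed in this issue.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ResourceMetaData {
        String name() default "";
        String version() default "";
        String vendor() default "";
        String description() default "";
    }

    // Example component with a short description kept in the annotation.
    @ResourceMetaData(name = "Tokenizer", version = "1.0",
            vendor = "Example Lab", description = "Splits text into tokens.")
    static class Tokenizer { }

    // An external description (e.g. a shared license text set by a script)
    // wins; otherwise fall back to the annotation, then to an empty string.
    static String resolveDescription(Class<?> componentClass, String external) {
        if (external != null && !external.isEmpty()) {
            return external;
        }
        ResourceMetaData meta = componentClass.getAnnotation(ResourceMetaData.class);
        return meta == null ? "" : meta.description();
    }

    public static void main(String[] args) {
        System.out.println(resolveDescription(Tokenizer.class, null));
        System.out.println(resolveDescription(Tokenizer.class, "Shared license text."));
    }
}
```

With this arrangement long copyright or license text can stay in one place
outside the component sources, and the annotation only needs to hold short
values.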

Original comment by pvogren@gmail.com on 15 Jun 2010 at 3:02

GoogleCodeExporter commented 9 years ago
I renamed AnalysisComponent to OperationalProperties

Original comment by pvogren@gmail.com on 16 Jun 2010 at 3:46

GoogleCodeExporter commented 9 years ago
just fixing a misspelling in the summary.

Original comment by pvogren@gmail.com on 20 Jun 2010 at 3:46

GoogleCodeExporter commented 9 years ago
I would like to close this issue if possible.  I still like the idea of a 
ResourceMetaData annotation with the understanding that it is only one way to 
e.g. add license information to an analysis engine.  

It occurs to me that this is no longer a 1.0 milestone issue since the proposed 
change is additive.  

Original comment by pvogren@gmail.com on 2 Jul 2010 at 4:07

GoogleCodeExporter commented 9 years ago
I think it is more convenient to have a number of annotations for different 
aspects of what makes up the resource meta data. One annotation being the 
@OperationalProperties, another being the @IndexDescription and 
@TypePrioritiesDescription annotations suggested in issue 54. Others could be 
@License, @Description or such. All these apply at the level of the 
class/processing resource.
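
As a rough illustration of the "several small annotations" approach, the
sketch below defines two independent class-level annotations and collects
whichever ones are present into a map. It relies only on the JDK. The
annotation definitions, the example classes, and the collect helper are
assumptions for illustration and do not reflect what was eventually
implemented.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.LinkedHashMap;
import java.util.Map;

class FineGrainedMetaDataSketch {

    // Hypothetical single-purpose annotations, one per metadata aspect.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Description { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface License { String value(); }

    // A component that declares both aspects.
    @Description("Detects sentence boundaries.")
    @License("Apache-2.0")
    static class SentenceDetector { }

    // A component that declares only one aspect.
    @Description("Writes the CAS to disk.")
    static class CasWriter { }

    // Collects only the aspects that are actually declared on the class.
    static Map<String, String> collect(Class<?> componentClass) {
        Map<String, String> meta = new LinkedHashMap<>();
        Description description = componentClass.getAnnotation(Description.class);
        if (description != null) {
            meta.put("description", description.value());
        }
        License license = componentClass.getAnnotation(License.class);
        if (license != null) {
            meta.put("license", license.value());
        }
        return meta;
    }

    public static void main(String[] args) {
        System.out.println(collect(SentenceDetector.class));
        System.out.println(collect(CasWriter.class));
    }
}
```

A component only carries the annotations it needs, and a factory could treat
each one independently when filling in the resource metadata.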

Original comment by richard.eckart on 27 Jan 2011 at 8:07

GoogleCodeExporter commented 9 years ago
This is all fine with me.  

Original comment by phi...@ogren.info on 27 Jan 2011 at 11:58

GoogleCodeExporter commented 9 years ago
An @Description annotation would be much appreciated. May I submit a patch for 
it? It would cover:

1) @Description annotation
2) update FlowControllerFactory.createFlowControllerDescription(...)
3) anything else?

Best, Renaud

Original comment by renaud.r...@gmail.com on 7 Feb 2012 at 8:48

GoogleCodeExporter commented 9 years ago
That would need to be added to all factories.

Original comment by richard.eckart on 7 Feb 2012 at 9:07

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 7 Jan 2013 at 4:51

GoogleCodeExporter commented 9 years ago
Whoops, I didn't even remember this issue. This has been resolved now at Apache.

https://issues.apache.org/jira/browse/UIMA-2607

Original comment by richard.eckart on 4 Mar 2013 at 7:53

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 4 Mar 2013 at 7:53