xiaoyangren / dkpro-core-asl

Automatically exported from code.google.com/p/dkpro-core-asl

Improved mechanism for loading models and mappings #40

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Currently, the DKPro TreeTagger component supports auto-lookup of model files:
it looks up and loads the appropriate language model automatically according to
the document language. No other DKPro analysis engine (AE) has this ability
yet.

Dive into DKPro TreeTagger and learn how it performs this auto-lookup. Can the
mechanism be encapsulated in an ExternalResource? The goal is for an AE to gain
the auto-lookup feature automatically when such an object is passed in as the
parameter for the model file location.
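
A minimal sketch of what such a lookup object might look like. This is not the
TreeTagger implementation; the class name LanguageModelResolver, the
${language} placeholder and the example paths are all assumptions made for
illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: resolves a model location from the document language. */
final class LanguageModelResolver {
    // Location pattern containing a ${language} placeholder (assumed layout).
    private final String pattern;
    // Explicit per-language locations that take precedence over the pattern.
    private final Map<String, String> overrides = new LinkedHashMap<>();

    LanguageModelResolver(String pattern) {
        this.pattern = pattern;
    }

    /** Registers an explicit location for one language. */
    void override(String language, String location) {
        overrides.put(language, location);
    }

    /** Returns the explicit location if one is set, else fills in the pattern. */
    String resolve(String language) {
        if (language == null || language.isEmpty()) {
            throw new IllegalArgumentException("Document language is not set");
        }
        String explicit = overrides.get(language);
        return explicit != null ? explicit : pattern.replace("${language}", language);
    }
}
```

An AE holding such an object would call resolve() with the language of the
current document instead of reading a fixed model path from its parameters.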

Furthermore, specific default paths should be configurable via property files.
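
One way the property-file configuration could work, sketched with
java.util.Properties. The class name DefaultModelPaths and the key scheme
"<component>.<language>" with a "<component>.default" fallback are
assumptions, not an existing DKPro convention.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for default model paths read from a properties file. */
final class DefaultModelPaths {
    private final Properties props = new Properties();

    private DefaultModelPaths() {
    }

    /** Parses properties text; a real component would read a file or classpath resource. */
    static DefaultModelPaths parse(String text) {
        DefaultModelPaths paths = new DefaultModelPaths();
        try {
            paths.props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return paths;
    }

    /** Looks up "<component>.<language>", then falls back to "<component>.default". */
    String pathFor(String component, String language) {
        String specific = props.getProperty(component + "." + language);
        if (specific != null) {
            return specific;
        }
        String fallback = props.getProperty(component + ".default");
        if (fallback == null) {
            throw new IllegalStateException("No default path configured for " + component);
        }
        return fallback;
    }
}
```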

Lastly, can it load concrete resources lazily, that is, load each resource at
the moment it is first used? (Good starting point: ExternalResourceFactory of
UIMAFit, line 220)

For lazy-loading resources, have a look at the class ParametrizedResource
in org.uimafit.factory.ExternalResourceFactoryTest.
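
Independent of how this would be wired into UIMA, the lazy-loading idea itself
can be sketched in plain Java. LazyResource is a hypothetical name; the sketch
does not use the uimaFIT API and only shows loading on first access.

```java
import java.util.function.Supplier;

/** Hypothetical wrapper that defers loading until the resource is first requested. */
final class LazyResource<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    LazyResource(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Runs the loader on the first call only; later calls return the cached value. */
    synchronized T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    /** Reports whether the loader has already run. */
    synchronized boolean isLoaded() {
        return loaded;
    }
}
```

The synchronized methods keep the sketch safe if several threads share one
resource, at the cost of a lock on every access.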

There is one more aspect to this issue: tags produced by the TreeTagger or
other analysis components do not directly correspond to UIMA types. We usually
have a generic base type, e.g. POS for part-of-speech annotations, and more
specific subtypes, e.g. V for verbs, N for nouns, etc. The same holds for
parsers and named entity recognition. The generic model resource should
therefore also have a method getUimaType(String tag) where you pass in a tag
and it returns the UIMA type to use for the annotation. See
de.tudarmstadt.ukp.dkpro.core.treetagger.TreeTaggerTT4JBase.getTagType(DKProModel,
String, TypeSystem) for how this is done in the TreeTagger component.
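
A rough sketch of the getUimaType(String tag) idea, assuming the mapping is a
plain lookup table with a fallback to the generic base type. It returns type
names as strings to stay self-contained; the class name TagMapping and the
example type names are hypothetical, and this is not the logic of
TreeTaggerTT4JBase.getTagType.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tag-to-type mapping with a generic base type as fallback. */
final class TagMapping {
    private final Map<String, String> tagToType = new HashMap<>();
    private final String baseType;

    TagMapping(String baseType) {
        this.baseType = baseType;
    }

    /** Registers the type name to use for one tag; returns this for chaining. */
    TagMapping map(String tag, String typeName) {
        tagToType.put(tag, typeName);
        return this;
    }

    /** Returns the mapped type name, or the base type for tags without a mapping. */
    String getUimaType(String tag) {
        return tagToType.getOrDefault(tag, baseType);
    }
}
```

For example, a mapping built with new TagMapping("example.pos.POS")
.map("VVFIN", "example.pos.V").map("NN", "example.pos.N") would resolve an
unmapped tag to the generic POS type rather than failing.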

Original issue reported on code.google.com by richard.eckart on 3 Oct 2011 at 7:19

GoogleCodeExporter commented 9 years ago
50% done.

Encapsulated auto-lookup mechanism in AutoResourceResolver.

Specific paths can be configured in Java and can be overridden at runtime by
UIMA parameters.

Original comment by s.y...@ishuo.de on 3 Jan 2012 at 2:34

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 8 Feb 2012 at 10:51

GoogleCodeExporter commented 9 years ago
Changed the title to reflect a reorientation in this task. For the time being 
we no longer try to model this using an external resource, but rather first try 
to harmonize the model/mapping loading across components.

Original comment by richard.eckart on 8 May 2012 at 6:11

GoogleCodeExporter commented 9 years ago
This works pretty well now for POS tags in many components. For further
enhancements, separate bugs will be opened.

Original comment by richard.eckart on 1 Jul 2012 at 6:38