Closed Horsmann closed 8 years ago
OK, it is not urgent. Things are working, and you don't notice the issue unless you look into the text files.
@reckart Did you have time to think about this issue?
@Horsmann not really. But I'm happy to accept pull requests or even provide commit rights ;)
@reckart I think I will need more pointers to address this issue. Which class currently deals with the discriminables? I cannot really find a spot where I could start. Any suggestions?
The method that turns an object into a string which is used in the discriminators file is this one: org.dkpro.lab.Util.toString(Object)
How is your idea with the conversionService supposed to work? In the initialization method of the Task(?), would all expected types be set to a fixed mapping function, something that would call getDiscriminatorValue, or is this something the user is supposed to do by hand?
As for timing, I need all the information to be available in Lab by the time analyze(Class<?> aClazz, Class<? extends Annotation> aAnnotation, Map<String, String> props) is called. My current hack is called way too late.
Roughly speaking, I would want to move the content of the analyze method somewhere into the initialization phase. This is probably where it should happen?
(Trying to remember)... I think my idea was that the conversion service would be part of the Lab instance, not of a task, e.g.
Lab lab = Lab.getInstance();
lab.getConversionService().registerDiscriminable(...);
But instead of always having to go through the static Lab.getInstance(), within a task the conversion service should be obtained from the task context. Registering a conversion should probably happen in the same place where the Lab instance is initially obtained and where Lab.run(...) is called. The service would be created through the context.xml, just as the other services (e.g. lifecycleservice, etc.).
For a more sophisticated solution, I suppose it would be possible to instantiate a Spring Conversion service to handle that... or at least be inspired by how it works. Btw., that is what uimaFIT is currently using for parameter value conversion.
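To make the idea more concrete, here is a minimal sketch of what such a registry could look like. Everything in it is an assumption for illustration: the class name ConversionServiceSketch, the signature of registerDiscriminable, the method getDiscriminableValue and the ReaderStandIn type are hypothetical and are not the DKPro Lab API. It only shows the principle of mapping a type to a short, stable string and falling back to the default string form otherwise.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not the DKPro Lab API.
public class ConversionServiceSketch {
    // Registered converters, keyed by the type they handle.
    private final Map<Class<?>, Function<Object, String>> converters = new LinkedHashMap<>();

    // Register a short textual form for all instances of the given type.
    public <T> void registerDiscriminable(Class<T> type, Function<? super T, String> converter) {
        converters.put(type, value -> converter.apply(type.cast(value)));
    }

    // Return the registered textual form, or fall back to String.valueOf.
    public String getDiscriminableValue(Object value) {
        if (value != null) {
            for (Map.Entry<Class<?>, Function<Object, String>> e : converters.entrySet()) {
                if (e.getKey().isInstance(value)) {
                    return e.getValue().apply(value);
                }
            }
        }
        return String.valueOf(value);
    }

    // Hypothetical stand-in for an object with a very verbose toString().
    static final class ReaderStandIn {
        final String readerClass;
        final String sourceLocation;

        ReaderStandIn(String readerClass, String sourceLocation) {
            this.readerClass = readerClass;
            this.sourceLocation = sourceLocation;
        }

        @Override
        public String toString() {
            return "ReaderStandIn {\n  readerClass = " + readerClass
                    + "\n  sourceLocation = " + sourceLocation + "\n}";
        }
    }

    public static void main(String[] args) {
        ConversionServiceSketch service = new ConversionServiceSketch();
        // Registration would happen once, before the first task runs.
        service.registerDiscriminable(ReaderStandIn.class,
                r -> r.readerClass + "@" + r.sourceLocation);

        // Uses the registered converter instead of the verbose toString().
        System.out.println(service.getDiscriminableValue(new ReaderStandIn("MyReader", "src/data")));
        // No converter registered for Integer, so the default string form is used.
        System.out.println(service.getDiscriminableValue(42));
    }
}
```

In this sketch the registration is a one-time step on the service, so the place that turns objects into strings only needs to ask the service and does not need to know about individual types.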
Do I understand it right that I would have to manually call this registerDiscriminable for every project/experiment? Essentially, before I call Lab...run(), I would have to know that I have to set those global overrides for the train/test reader?!
This is even uglier than wrapping the CollectionReaderFactory with a TcCollectionReaderFactory which returns only CRDs where the DynamicProxy is already set.
This would still allow a user to ignore the TcCollectionReaderFactory and use the normal CollectionReaderFactory method version, resulting in the status quo.
The discriminables would have to be registered before the first batch task runs. So I would see three options:
I agree that the last option is ugly, but IMHO it would be the first step on top of which one or both others could be implemented.
@reckart
I am drafting such a service at the moment, defining it in the context.xml and loading it as a service, as you suggest. As far as I understand things, accessing this conversionService then requires a TaskContext.
Lab does not have such a context at the point where I would need it for accessing the textual information.
To fix the problem of this issue, I need access to this service in the method protected void analyze(Class<?> aClazz, Class<? extends Annotation> aAnnotation, Map<String, String> props), which is located in TaskBase.
How do I access such a service from within TaskBase?
@reckart I tried to hack something and bumped into yet another problem.
The discriminators are stored in a Map<String,String>, which becomes a problem for the CollectionReaderDescriptions. When I hack in my conversion, the endlessly verbose description text of the CRDs causes problems with some regex checks in ImportUtil, in the method matchConstraints(Map<String, String> aDiscriminators, Map<String, String> aConstraints, boolean aStrict):
name = org.dkpro.tc.api.type.TextClassificationTarget
supertypeName = uima.tcas.Annotation
}
vendor = NULL
version = NULL
vendor = DKPro Core Project
version = 1.9.0-SNAPSHOT
resourceManagerConfiguration = NULL
$
^
at java.util.regex.Pattern.error(Pattern.java:1955)
at java.util.regex.Pattern.closure(Pattern.java:3141)
at java.util.regex.Pattern.sequence(Pattern.java:2134)
at java.util.regex.Pattern.expr(Pattern.java:1996)
at java.util.regex.Pattern.compile(Pattern.java:1696)
at java.util.regex.Pattern.<init>(Pattern.java:1351)
at java.util.regex.Pattern.compile(Pattern.java:1028)
at java.util.regex.Pattern.matches(Pattern.java:1133)
at org.dkpro.lab.engine.impl.ImportUtil.matchConstraints(ImportUtil.java:56)
at org.dkpro.lab.storage.filesystem.FileSystemStorageService.getContexts(FileSystemStorageService.java:141)
at org.dkpro.lab.engine.impl.BatchTaskEngine.getLatestExecution(BatchTaskEngine.java:297)
at org.dkpro.lab.engine.impl.BatchTaskEngine.getExistingExecution(BatchTaskEngine.java:360)
at org.dkpro.lab.engine.impl.BatchTaskEngine.executeConfiguration(BatchTaskEngine.java:223)
at org.dkpro.lab.engine.impl.BatchTaskEngine.run(BatchTaskEngine.java:134)
... 4 more
A large part of the whole Lab constraint checking seems to be string-based. Any thoughts on how to deal with that?
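For illustration, the failure in the stack trace can be reproduced outside of Lab with plain java.util.regex. This is only a sketch under assumptions: the class LiteralConstraintSketch, its two helper methods and the sample text are made up, and it is not how ImportUtil.matchConstraints is implemented. It shows that text containing a brace is rejected when it is compiled as a regular expression, and that quoting it with Pattern.quote makes it match literally.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical sketch, not code from DKPro Lab.
public class LiteralConstraintSketch {
    // Treats the constraint as a regular expression.
    static boolean matchesAsRegex(String constraint, String value) {
        return Pattern.matches(constraint, value);
    }

    // Treats the constraint as literal text by quoting it first.
    static boolean matchesAsLiteral(String constraint, String value) {
        return Pattern.matches(Pattern.quote(constraint), value);
    }

    public static void main(String[] args) {
        // Made-up stand-in for a verbose, multi-line description text.
        String verbose = "typeDescription = {\n  name = SomeType\n}";

        try {
            matchesAsRegex(verbose, verbose);
            System.out.println("regex match did not fail");
        } catch (PatternSyntaxException e) {
            // The brace is parsed as the start of a repetition quantifier.
            System.out.println("regex match failed: " + e.getDescription());
        }

        // Quoted, the same text is compared character by character.
        System.out.println("literal match: " + matchesAsLiteral(verbose, verbose));
    }
}
```

Quoting would of course disable any constraint that is meant to be a real regular expression, so a short discriminator string that avoids regex metacharacters may be the less invasive route.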
@reckart I have issues with initializing the conversion service that I defined in the context.xml. I am missing some init step somewhere, but I don't see where.
When I launch a CrossValidationExperiment, the ExperimentCrossValidation Task is properly initialised, with the ConversionService being non-null. Once the first InitTask runs, the ConversionService is null, so it seems that the initialization is not performed for the subtasks. Where is that supposed to happen?
@Horsmann is https://github.com/dkpro/dkpro-tc/issues/353#issuecomment-246180878 still an issue for you?
no, this should be ok now.
At the moment, a user has to define two dimensions for a reader. TC builds a collection reader description from this information in the backend.
i.e.
It would be good if a user could just initialise the collection reader and pass it as a parameter to TC/Lab. This would make using TC more similar to using Core, at least in the sense of specifying the readers.