Closed Erechtheus closed 6 years ago
To make sure we are as well prepared as possible to help during the hackathon sessions could you please add/attach to this issue:
@Erechtheus Could you please send us the input requested by @greenwoodma before the meeting so that we can be prepared for it? Thanks!
Hi! @greenwoodma @pennyl67 @ArneBinder We registered the following component on openminted. The docker container is located on Docker Hub. An example document in XMI can be found on our GitHub site. The XML looks as follows: `<?xml version="1.0" encoding="UTF-8"?>
Hi @Erechtheus I have the following comments as regards the metadata
1) technical issues that must be addressed:
2) some suggestions that would enhance the visibility and discoverability of your component:
Hi @pennyl67
Thank you for the comments. Is it possible to modify the metadata in the openminted test environment? Or do we have to re-register the component?
Thank you
Hi, metadata can be edited only as long as the components are private. For reproducibility reasons, once they are published they cannot be changed and have to be registered again.
Hi @Erechtheus I have some comments concerning this part of the metadata:
<ns0:componentDistributionInfo>
<ns0:componentDistributionForm>dockerImage</ns0:componentDistributionForm>
<ns0:distributionLocation>https://hub.docker.com/r/erechtheus/fimda/</ns0:distributionLocation>
<ns0:command>docker run de.dfki.lt.fimda.fimda.FIMDA </ns0:command>
</ns0:componentDistributionInfo>
The distributionLocation element must contain only the image name (erechtheus/fimda). The full URL you provided (https://hub.docker.com/r/erechtheus/fimda) is not what OMTD expects. In general, the NAME[:TAG] of an image is the only information required to pull (i.e. docker pull erechtheus/fimda) and run (i.e. docker run erechtheus/fimda...) an image.
The command element must contain the executor of your component. In your case, the executor seems to be "de.dfki.lt.fimda.fimda.FIMDA", right?
I propose to modify the metadata with something like this
<ns0:componentDistributionInfo>
<ns0:componentDistributionForm>dockerImage</ns0:componentDistributionForm>
<ns0:distributionLocation>erechtheus/fimda</ns0:distributionLocation>
<ns0:command>de.dfki.lt.fimda.fimda.FIMDA </ns0:command>
</ns0:componentDistributionInfo>
Before registering set the value of public to false so that you can edit it if needed.
We re-registered a component at openminted 61a5193e-beab-4b79-9116-b794739260b5. The adapted OMTD-SHARE XML file is as follows
Hi @pennyl67,
thank you for the feedback. I have one remaining question regarding the command element. Perhaps you could have a look at our fimda dockerfile: it contains just the executable jar that has to be called with the parameters --input INPUT_FILENAME and --output OUTPUT_FILENAME. It can be run by executing
docker run de.dfki.lt.fimda.fimda.FIMDA --input INPUT_FILENAME --output OUTPUT_FILENAME
Is it correct to set the command element to de.dfki.lt.fimda.fimda.FIMDA?
How can we test the registered component? Thanks!
Hi
I had a look into the dockerfile.
de.dfki.lt.fimda.fimda.FIMDA is the class in the jar?
I think that you shouldn't use an entrypoint. I have tried it and it didn't work. Probably because Galaxy plays a little bit with the paths before executing the command for the specific tool; i.e. it creates a folder and does everything there
Just add the jar in your image as you already do
ADD target/${JAR_FILE} /usr/share/fimda/fimda.jar
and provide this command
java -jar /usr/share/fimda/fimda.jar
I assume that your jar is executable.
If you do that, the OpenMinTeD platform will create a container from your image and execute java -jar /usr/share/fimda/fimda.jar --input InputFolderSelectedByGalaxy --output outputFolderSelectedByGalaxy inside it.
caution: The outputFolderSelectedByGalaxy should be created by your executable. It is not created by Galaxy or by Docker. See also here https://github.com/openminted/Open-Call-Discussions/issues/28#issuecomment-381187086.
has to be called with the parameters --input INPUT_FILENAME and --output OUTPUT_FILENAME
INPUT_FILENAME and OUTPUT_FILENAME should be directories.
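Since the output folder is not created by Galaxy or Docker, the executable has to create it before writing any result. A minimal sketch of how that could look in Java is given below. The class name OutputDirSketch and the method writeResult are hypothetical and are not taken from the FIMDA sources; the temporary directory only stands in for the folder passed via --output.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of FIMDA. */
class OutputDirSketch {

    /**
     * Writes content to outputDir/fileName. The output directory (including
     * missing parents) is created first, because the platform does not create it.
     */
    static Path writeResult(Path outputDir, String fileName, String content) throws IOException {
        // createDirectories does nothing if the directory already exists.
        Files.createDirectories(outputDir);
        Path target = outputDir.resolve(fileName);
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the (not yet existing) folder given via --output.
        Path out = Files.createTempDirectory("omtd").resolve("out");
        Path written = writeResult(out, "example.xmi", "<xmi/>");
        System.out.println("wrote " + written);
    }
}
```

Calling Files.createDirectories once at start-up, before the loop over the input files, would work just as well as calling it per file.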
@galanisd thanks! that helps a lot.
@Erechtheus When you make the changes requested by @galanisd could you send the xml metadata record again for checking? Thanks!
Hi @pennyl67 !
An updated version of the XML metadata can be found here: 61a5193e-beab-4b79-9116-b794739260b5.txt
Thanks!
Thanks @Erechtheus Everything seems fine from the metadata point of view - the only recommended element missing (again it's up to you) is the resourceCreator.
@ArneBinder Any updates?
@pennyl67 To extend the XML metadata, would it be sufficient to add this?
`
@galanisd @ArneBinder we modified the docker file according to your comments. We would now like to test our implementation. To this end, we generated a new application. As Input we selected "OMTD Importer". Is it possible to evaluate our application using our own XMI?
Yes. You can upload a corpus that contains your XMI files. You have to be careful and follow the instructions on the side of the upload form so that the corpus is in the correct format. If possible make it public and send it to have a look.
Then you select the corpus and your app and press execute and wait for results.
@Erechtheus for the resourceCreator, that's the template you can use for persons; if you want to add a group or organisation instead of specifying a person, you can also do that; or you can also specify multiple persons.
@Erechtheus when I say "template" I mean that I assume the values Surname, Name and Mail will be replaced with something meaningful; otherwise, just use the organisation. I can send you the exact details if you prefer.
@Erechtheus
This is your latest docker file? https://github.com/Erechtheus/fimda/blob/master/Dockerfile
the image you are creating is based on "openjdk:8-jre-alpine"?
It seems that "Alpine docker image doesn't have bash installed by default." https://stackoverflow.com/questions/40944479/how-to-use-bash-with-an-alpine-based-docker-image
e.g. I am getting the following when I am executing ls -l with bash...
docker run erechtheus/fimda:0.2.2 /bin/bash -c "ls -l"
container_linux.go:247: starting container process caused "exec: \"/bin/bash\": stat /bin/bash: no such file or directory"
docker: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "exec: \"/bin/bash\": stat /bin/bash: no such file or directory".
ERRO[0002] error getting events from daemon: net/http: request canceled
with sh I get.
docker run erechtheus/fimda:0.2.2 /bin/sh -c "ls -l"
total 52
drwxr-xr-x 2 root root 4096 Jan 9 19:37 bin
drwxr-xr-x 5 root root 340 Apr 25 16:47 dev
drwxr-xr-x 20 root root 4096 Apr 25 16:47 etc
drwxr-xr-x 2 root root 4096 Jan 9 19:37 home
drwxr-xr-x 6 root root 4096 Jan 10 04:52 lib
drwxr-xr-x 5 root root 4096 Jan 9 19:37 media
drwxr-xr-x 2 root root 4096 Jan 9 19:37 mnt
dr-xr-xr-x 151 root root 0 Apr 25 16:47 proc
drwx------ 2 root root 4096 Jan 9 19:37 root
drwxr-xr-x 2 root root 4096 Jan 9 19:37 run
drwxr-xr-x 2 root root 4096 Jan 9 19:37 sbin
drwxr-xr-x 2 root root 4096 Jan 9 19:37 srv
dr-xr-xr-x 13 root root 0 Apr 25 16:47 sys
drwxrwxrwt 2 root root 4096 Jan 9 19:37 tmp
drwxr-xr-x 14 root root 4096 Apr 20 16:48 usr
drwxr-xr-x 12 root root 4096 Jan 9 19:37 var
Galaxy (our workflow engine) creates a .sh script (using bash) which is used to execute each tool (e.g. FIMDA) and contains the command that you provide along with other commands. This script is executed within the container, but in your case it seems that it fails. I created a test workflow and had a look into the logs of the container.
I see this
/bin/sh: /srv/galaxy/database/jobs_directory/001/1030/tool_script.sh: not found
Not sure just guessing... that if bash is not installed this might cause issues.
@pennyl67 Yes, I will instantiate the persons with our details. Could you provide me with a template for organisation? Thank you...
@Erechtheus For the creator, you can add right after the end of the "resourceDocumentationInfo", the following
<ns0:resourceCreationInfo>
<ns0:resourceCreators>
<ns0:resourceCreator>
<ns0:actorType>organization</ns0:actorType>
<ns0:relatedOrganization>
<ns0:organizationNames>
<ns0:organizationName lang="en">German Research Center for Artificial Intelligence</ns0:organizationName>
</ns0:organizationNames>
</ns0:relatedOrganization>
</ns0:resourceCreator></ns0:resourceCreators></ns0:resourceCreationInfo>
You can do that at any time you re-register the component after solving the technical problems
@ArneBinder @Erechtheus
I did some more tests. I think that I was right..
Please create an image that includes bash.
Dear @galanisd we uploaded a new version on docker hub that includes bash
@pennyl67 I adjusted the XML according to your recommendations. Thank you
@Erechtheus Can you send me the latest metadata file? Thanks!
@Erechtheus
image erechtheus/fimda:0.2.3 Problem with bash solved. Now we can call your component. Got this...
Loading regular expressions from Java Archive at location '/resources/mutations.txt'
ERROR -- .MutationFinderThe file containing regular expressions could not be found: /working/src/main/resources/SETH/mutations.txt/mutations.txt
Completed loading of regular expressions: 768 loaded.
java.nio.file.NoSuchFileException: /srv/galaxy/database/jobs_directory/001/1099/working/out/od______1271..f0326a791a327c607e519e256f074178.pdf.xmi
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
at java.nio.file.Files.newOutputStream(Files.java:216)
at java.nio.file.Files.write(Files.java:3292)
at de.dfki.lt.fimda.fimda.FIMDA.annotateXmiToXmi(FIMDA.java:115)
at de.dfki.lt.fimda.fimda.FIMDA.lambda$main$1(FIMDA.java:159)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at de.dfki.lt.fimda.fimda.FIMDA.main(FIMDA.java:157)
java.nio.file.NoSuchFileException: /srv/galaxy/database/jobs_directory/001/1099/working/out/narcis______..5056627fec1fd361ccff7597dbddd9e7.pdf.xmi
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
at java.nio.file.Files.newOutputStream(Files.java:216)
at java.nio.file.Files.write(Files.java:3292)
at de.dfki.lt.fimda.fimda.FIMDA.annotateXmiToXmi(FIMDA.java:115)
at de.dfki.lt.fimda.fimda.FIMDA.lambda$main$1(FIMDA.java:159)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at de.dfki.lt.fimda.fimda.FIMDA.main(FIMDA.java:157)
Exception in thread "main" java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:542)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.uima.cas.impl.XmiSerializationSharedData.addOutOfTypeSystemElement(XmiSerializationSharedData.java:202)
at org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.addToOutOfTypeSystemData(XmiCasDeserializer.java:2015)
at org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.readFS(XmiCasDeserializer.java:519)
at org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.startElement(XmiCasDeserializer.java:435)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:509)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:374)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at org.apache.uima.cas.impl.XmiCasDeserializer.deserialize(XmiCasDeserializer.java:2313)
at org.apache.uima.cas.impl.XmiCasDeserializer.deserialize(XmiCasDeserializer.java:2252)
at de.dfki.lt.fimda.fimda.FIMDA.casFromXmi(FIMDA.java:91)
at de.dfki.lt.fimda.fimda.FIMDA.annotateXmiToXmi(FIMDA.java:110)
at de.dfki.lt.fimda.fimda.FIMDA.lambda$main$1(FIMDA.java:159)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at de.dfki.lt.fimda.fimda.FIMDA.main(FIMDA.java:157)
ERROR -- .MutationFinderThe file containing regular expressions could not be found: /working/src/main/resources/SETH/mutations.txt/mutations.txt
It seems that a file that is required can not be loaded. Please can you check it?
Not sure.
java.nio.file.NoSuchFileException: /srv/galaxy/database/jobs_directory/001/1099/working/out/od______1271..f0326a791a327c607e519e256f074178.pdf.xmi
--> Also you should check in your code that the output folder exists and create it .... Galaxy is not creating it. See here for more info:
https://github.com/openminted/Open-Call-Discussions/issues/28#issuecomment-381199289
The issue I think is here: https://github.com/rockt/SETH/blob/master/src/main/java/edu/uchsc/ccp/nlp/ei/mutation/MutationFinder.java#L149
Before you send me an updated image, please confirm that the jar loads the mutations when it is started (java -jar /usr/share/fimda/fimda.jar) on any machine. E.g. you can do that by installing the jar on a machine other than the one you use for development.
@galanisd many thanks for your investigations! But the error in SETH is not a real issue because it falls back to loading the internally provided mutations.txt. I'm not a maintainer of SETH (I have just written the wrapper FIMDA) and would stay with the latest SETH release if this error message is not a big deal for the openminted platform.
The newest FIMDA version fixes the output directory bug, but having some credential issues I have to wait for @Erechtheus to publish the new image to docker hub.
@galanisd @Erechtheus the image erechtheus/fimda:0.2.4 should work now.
I did some tests with some XMI files but not sure (yet) that it works as expected. Please can you send me also a couple of input XMI files for testing your component just to be sure. I mean XMI files that you have already tested and you know that everything is OK.
Thanks.
@galanisd @ArneBinder I changed the component to fimda 0.2.4. As @ArneBinder said, the error in SETH is not a real issue and should be seen as a warning.
For testing your component I have the following workflow... omtdImporter -> PDFReader -> FIMDA
FIMDA step is now configured to use version 0.2.4.
I selected a corpus from test.services.openminted.eu and tried to process it with the FIMDA workflow. Got the following when FIMDA step was executed.
ERROR -- .MutationFinderThe file containing regular expressions could not be found: /working/src/main/resources/SETH/mutations.txt/mutations.txt
Loading regular expressions from Java Archive at location '/resources/mutations.txt'
Completed loading of regular expressions: 768 loaded.
Exception in thread "main" java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:542)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.uima.cas.impl.XmiSerializationSharedData.addOutOfTypeSystemElement(XmiSerializationSharedData.java:202)
at org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.addToOutOfTypeSystemData(XmiCasDeserializer.java:2015)
at org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.readFS(XmiCasDeserializer.java:519)
at org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.startElement(XmiCasDeserializer.java:435)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:509)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:374)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at org.apache.uima.cas.impl.XmiCasDeserializer.deserialize(XmiCasDeserializer.java:2313)
at org.apache.uima.cas.impl.XmiCasDeserializer.deserialize(XmiCasDeserializer.java:2252)
at de.dfki.lt.fimda.fimda.FIMDA.casFromXmi(FIMDA.java:91)
at de.dfki.lt.fimda.fimda.FIMDA.annotateXmiToXmi(FIMDA.java:110)
at de.dfki.lt.fimda.fimda.FIMDA.lambda$main$1(FIMDA.java:161)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at de.dfki.lt.fimda.fimda.FIMDA.main(FIMDA.java:159)
So, yes it seems that the regular expressions are loaded
Completed loading of regular expressions: 768 loaded.
and there is no issue
From the log messages I understand that something happens when reading the XMIs. So this is why I asked for some sample input files.
I am attaching here the output of the PDFReader step so that you can test if fimda works as expected with them.
Looks like the CAS into which the XMI is being deserialized was not initialized with a type system compatible with the XMI file.
Yes exactly. This is what I also believe to be the problem...
For this reason I asked for "tested" XMI input files.
I want to confirm that FIMDA works with "compatible" XMIs.
If this is one of the cases where the component is only interested in the plain text from the XMI and drops any annotations from the input documents, then the problem should be fixable by deserializing the XMIs in lenient mode (e.g. using org.apache.uima.cas.impl.XmiCasDeserializer.getXmiCasHandler(CAS, boolean) and setting the second parameter to true).
@galanisd @reckart The fimda tool expects just xmi files in the input folder. It tries to process the typesystem.xml in the same way as the xmi files and fails. If you remove it from the input, it should work. This folder contains a working input file.
What is the expected behaviour with regard to typesystem.xml file(s) in the input folder? Is it necessary to handle them, or is it possible to merge the annotations later by the omtd platform? If the former is the case, what is the easiest way to do so? At the moment we use this straightforward code to do the deserialization.
Thanks for all the help!
@ArneBinder I would recommend that XMI files use the extension .xmi and in that way they do not conflict with the typesystem.xml file.
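To illustrate the recommendation, here is a minimal sketch of how a component could pick up only the .xmi files from the input folder, so that typesystem.xml is never handed to the XMI deserializer. The class name XmiInputFilterSketch and the method listXmiFiles are hypothetical; this is not the code used in FIMDA.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not part of FIMDA. */
class XmiInputFilterSketch {

    /** Returns the regular files directly inside inputDir whose names end with ".xmi", sorted by path. */
    static List<Path> listXmiFiles(Path inputDir) throws IOException {
        // Files.list must be closed, hence try-with-resources.
        try (Stream<Path> entries = Files.list(inputDir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".xmi"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the folder given via --input.
        Path in = Files.createTempDirectory("omtd-in");
        Files.write(in.resolve("doc1.xmi"), new byte[0]);
        Files.write(in.resolve("typesystem.xml"), new byte[0]);
        // Only doc1.xmi is listed; typesystem.xml is skipped.
        listXmiFiles(in).forEach(System.out::println);
    }
}
```

The typesystem.xml file can then be handled separately (read and merged, or ignored), depending on whether the component retains incoming annotations.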
@ArneBinder if a component "retains" incoming annotations, then the typesystem.xml file must be read, processed (i.e. merged with any type system that the component might contribute) and the merged file must be written to the output. E.g. the DKPro Core XmiReader (1.9.1) is capable of doing such a merging. If a component "drops" incoming annotations, then you may be able to ignore the typesystem file (unless your component really needs it). Whether or not a component retains input annotations can be declared in the OMTD-SHARE descriptor (however, I believe there is no OMTD-SHARE Java annotation for this yet).
Here is the relevant code from the DKPro Core XmiReader. A bit more complex than yours, but hopefully still manageable:
@reckart thanks! I will have a look.
@galanisd @reckart The new release 0.2.5 processes only .xmi files and merges typesystem.xml if this file exists. What's the way to write a type system (e.g. obtained from aCAS.getTypeSystem()) back to disk?
EDIT: Nevermind, found it. See release 0.2.6.
Well, basically:
TypeSystemUtil.typeSystem2TypeSystemDescription(aJCas.getTypeSystem()).toXML(typeOS);
Yes this time it has completed. Output contains some fimda annotations and typesystem seems OK. Please have a look
@Erechtheus could you verify that the output sent by @galanisd is what you are expecting?
@gkirtzou @galanisd Yes, the output looks as expected. We also updated the version in openminted. Should we now close the issue?
closed.
Similar to #30, we would like to discuss how we can test the successful integration of the FIMDA docker container into the openminted test environment.