Closed andreahorbach closed 1 year ago
The way it normally works is that the person/group creating the model puts it up somewhere on the web (their website, GitHub, some language resource repository, etc.). Then we have a set of scripts in DKPro Core which download such models, package them up as a JAR, and then deploy them to our Maven repository.
So the first step would be that you put the model up somewhere. Please make sure that you are legally allowed to share the model.
Then, you or we could extend the OpenNLP model packaging script to include your model:
https://github.com/dkpro/dkpro-core/blob/master/dkpro-core-opennlp-asl/src/scripts/build.xml
Then we'd use the script to build the model JAR and to upload it to our Maven repo.
Finally, the DKPro Core OpenNLP Maven module pom.xml file would be extended to include the new model.
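For illustration, the "package them up as a JAR" step can be sketched in a few lines. This is not the DKPro Core build script (that one is the Ant file linked above); it is a minimal Python sketch that relies only on the fact that a JAR is a ZIP archive with a manifest. The function name, the entry path, and the metadata keys are hypothetical and chosen just for this example.

```python
import io
import zipfile


def package_model_as_jar(model_bytes: bytes, entry_name: str, metadata: dict) -> bytes:
    """Return the bytes of a JAR holding the model plus a small properties file.

    entry_name and the metadata keys are hypothetical; the real DKPro Core
    scripts define their own layout.
    """
    manifest = "Manifest-Version: 1.0\r\n\r\n"
    # One key=value line per metadata entry, sorted for reproducible output.
    props = "".join(f"{key}={value}\n" for key, value in sorted(metadata.items()))
    props_name = entry_name.rsplit(".", 1)[0] + ".properties"

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as jar:
        jar.writestr("META-INF/MANIFEST.MF", manifest)
        jar.writestr(entry_name, model_bytes)
        jar.writestr(props_name, props)
    return buffer.getvalue()


# Usage with placeholder bytes standing in for a downloaded model file.
example_jar = package_model_as_jar(
    b"placeholder-model-bytes",
    "models/chunker-de-example.bin",
    {"language": "de", "tool": "chunker"},
)
```

Deploying the resulting JAR to a Maven repository is a separate step and is not covered by this sketch.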
Thank you for explaining the process.
We uploaded the model at https://github.com/ltl-ude/opennlpChunkerGerman/blob/master/de-chunker-opennlp.bin
It would be great if you could take care of including it in the build.xml, but if not we can of course have a look at doing that ourselves.
@aggarwalpiush would you like to try this?
There is some documentation on implementing/extending the model building scripts here.
The model script for OpenNLP models is here.
You couldn't deploy it to the UKP Maven server at the moment though. Either I'd need to do that or we need to give you proper permissions.
@reckart Before extending the model script, I tried to run the existing OpenNLP ASL build.xml, but I got a build error at line 675.
I found that the IXA POS model is not available at the URL given in the script. I think we need to fix this issue as well.
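A quick way to spot this kind of problem before running the whole build is to check the download URLs up front. The following is a rough sketch using only the Python standard library; the function names are made up for this example, and it assumes the servers answer HEAD requests (some do not, in which case a GET would be needed instead).

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to the URL ends with a 2xx/3xx status."""
    try:
        # Building the Request can already raise ValueError for malformed URLs.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors (e.g. 404), DNS failures, timeouts, malformed URLs.
        return False


def find_unreachable(urls, check=is_reachable):
    """Return the URLs for which the check fails, in their original order."""
    return [url for url in urls if not check(url)]
```

Calling `find_unreachable` with the model URLs copied from the build script would list the ones that need fixing; the `check` parameter is there so the logic can be tried out without network access.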
@aggarwalpiush good idea :)
The build.xml is fixed and PR-1459 has been raised. If it looks good, kindly merge it and build the new model JARs on the UKP Maven server.
Once the JARs are available on the server, I'll add them to the OpenNLP pom.
We trained a German chunker model for OpenNLP that we would like to contribute. The model was trained on the TIGER corpus using its annotations, with a number of systematic modifications to match the annotations to TreeTagger chunks. In a 10-fold cross-validation experiment, as provided by the OpenNLP model trainer, we reach an average F-score of 96%. If such a model is of interest for DKPro, could you guide us on how to proceed in providing it?
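For readers unfamiliar with the metric, here is a small sketch of how an average F-score over folds can be computed. It assumes the score is the balanced F1 (harmonic mean of precision and recall) and that the per-fold scores are averaged with equal weight; how the OpenNLP trainer aggregates its folds is not stated here, so treat both points as assumptions. The function names are hypothetical.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-score from chunk-level counts; 0.0 when it is undefined."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_f1(fold_counts) -> float:
    """Unweighted mean of per-fold F1; each item is a (tp, fp, fn) tuple."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in fold_counts]
    return sum(scores) / len(scores)
```

With ten folds, `mean_f1` would be called with ten `(tp, fp, fn)` tuples, one per held-out fold.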