codeaudit / dkpro-core-asl

Automatically exported from code.google.com/p/dkpro-core-asl

Integrate Twitter POS tagger model #185

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Since we now have CoreNLP 3.2.0 support, it would be great if we could integrate
this model into the build.xml.

http://gate.ac.uk/wiki/twitter-postagger.html

Original issue reported on code.google.com by richard.eckart on 29 Jul 2013 at 4:52

GoogleCodeExporter commented 9 years ago
Which zip file contains the models? And which of the files are the models?

Inside twitie_pos_pr.zip there are these files:

Tagger_Stanford
├── resources
│   ├── english-fast.41.model
│   ├── english-newswire-fast.model
│   ├── english-tag-map.txt
│   └── english-twitter.model
Tagger_Twitter
├── resources
│   ├── english-fast.41.model
│   ├── english-newswire-fast.model
│   ├── english-tag-map.txt
│   └── english-twitter.model

And twitie-tagger.zip contains these:

twitie-tagger/
├── models
│   ├── english-fast.41.model
│   └── english-twitter.model

Which ones should be integrated into build.xml?

Original comment by pedrobss...@gmail.com on 30 Jul 2013 at 9:13

GoogleCodeExporter commented 9 years ago
Are the models with the same name identical? Try running md5 on them and
comparing the checksums.
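
The checksum comparison can be sketched in Python as follows. This is only a minimal illustration, assuming both archives have already been unpacked locally; the directory names `twitie_pos_pr` and `twitie-tagger` and the helpers `md5_of` and `compare_same_named` are hypothetical and not part of the project.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_same_named(roots, pattern="*.model"):
    """Map each file name to the set of distinct digests found under roots."""
    digests = defaultdict(set)
    for root in roots:
        for path in Path(root).rglob(pattern):
            if path.is_file():
                digests[path.name].add(md5_of(path))
    return dict(digests)


if __name__ == "__main__":
    # Assumed locations of the unpacked archives (hypothetical paths).
    result = compare_same_named(["twitie_pos_pr", "twitie-tagger"])
    for name, found in sorted(result.items()):
        status = "identical" if len(found) == 1 else "DIFFERENT"
        print(f"{name}: {status}")
```

A file name that maps to a single digest means every copy found has the same content; more than one digest means at least two copies differ.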

Original comment by richard.eckart on 30 Jul 2013 at 10:39

GoogleCodeExporter commented 9 years ago
I just checked that. The files with the same name do indeed have the same checksum.

Original comment by pedrobss...@gmail.com on 30 Jul 2013 at 1:35

GoogleCodeExporter commented 9 years ago
I removed the "newswire" model, which is the same as the
"english-left3words-distsim" model, and changed the extensions from "model" to
"tagger".

We still need test cases for the new models, and the models have not yet been
added to the dependency management section in the pom.

This should be done before moving the models from staging into the model
repository on zoidberg.

Original comment by richard.eckart on 6 Aug 2013 at 7:26

GoogleCodeExporter commented 9 years ago

Original comment by pedrobss...@gmail.com on 6 Aug 2013 at 6:10

GoogleCodeExporter commented 9 years ago
Reopening because there are no unit tests for the new models.

Original comment by richard.eckart on 10 Aug 2013 at 1:46

GoogleCodeExporter commented 9 years ago

Original comment by pedrobss...@gmail.com on 13 Aug 2013 at 4:35