aminorex / dkpro-core-asl

Automatically exported from code.google.com/p/dkpro-core-asl

Constituent mapping for NEGRA tagset #486

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Constituent mapping for NEGRA tagset.

Original issue reported on code.google.com by richard.eckart on 2 Oct 2014 at 2:41

GoogleCodeExporter commented 9 years ago
This issue was updated by revision r2881.

- Added some more mappings

Original comment by richard.eckart on 4 Oct 2014 at 10:23

GoogleCodeExporter commented 9 years ago
I'll resolve this for the time being. If anybody wants to review the 
(incomplete) mappings before 1.7.0 - please do so. Otherwise, we can have a new 
issue for the next version if desired.

Original comment by richard.eckart on 12 Nov 2014 at 8:42

GoogleCodeExporter commented 9 years ago
Is there any documentation about the DKPro Core types in Constituency
(e.g. where do SBAR, SINV etc. come from)?
I did not find any, not even in the DKPro Core book.

Otherwise I would volunteer to review the mapping. I had a brief look and it 
seems to be a mapping in which a lot of useful information is lost.

Original comment by eckle.kohler on 12 Nov 2014 at 9:55

GoogleCodeExporter commented 9 years ago
Afaik the DKPro Core constituent types correspond to the Penn Treebank 
constituent types as produced by the Stanford parser. 

See: http://bulba.sdsu.edu/jeanette/thesis/PennTags.html
See: https://dkpro-core-asl.googlecode.com/svn/de.tudarmstadt.ukp.dkpro.core-asl/trunk/de.tudarmstadt.ukp.dkpro.core.api.syntax-asl/src/main/resources/de/tudarmstadt/ukp/dkpro/core/api/syntax/tagset/en-ptb-constituency.map

Issue 516 (formerly issue 99) contains some discussion about adopting more 
"universal" tags, but this hasn't proceeded since.

Original comment by richard.eckart on 12 Nov 2014 at 10:08

GoogleCodeExporter commented 9 years ago
this is broken:  http://bulba.sdsu.edu/jeanette/thesis/PennTags.html

used instead:
http://www.sfs.uni-tuebingen.de/~dm/07/autumn/795.10/ptb-annotation-guide/root.html

Original comment by eckle.kohler on 12 Nov 2014 at 10:26

GoogleCodeExporter commented 9 years ago
better overview, less detail:
http://www.surdeanu.info/mihai/teaching/ista555-fall13/readings/PennTreebankConstituents.html

Original comment by eckle.kohler on 12 Nov 2014 at 10:28

GoogleCodeExporter commented 9 years ago
I reviewed the mapping.
The mapping loses information by design, because the tagsets are incompatible: 
the Penn tagset does not make distinctions that are present in NEGRA.
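
To make the information loss concrete, here is a minimal sketch of a many-to-one tag mapping. The entries are hypothetical and are not copied from the actual DKPro Core mapping file; the names `NEGRA_TO_PTB_STYLE`, `map_tag`, `collisions` and the fallback tag `X` are assumptions made for illustration only.

```python
# Hypothetical many-to-one mapping, for illustration only.
# These entries are NOT taken from the real DKPro Core mapping file.
NEGRA_TO_PTB_STYLE = {
    "NP": "NP",
    "CNP": "NP",  # coordinated noun phrase
    "MPN": "NP",  # multi-word proper noun
    "S": "S",
    "CS": "S",    # coordinated sentence
}


def map_tag(tag, default="X"):
    """Map a source tag to its target tag; unknown tags get the assumed fallback."""
    return NEGRA_TO_PTB_STYLE.get(tag, default)


def collisions(mapping):
    """Return each target tag that several source tags collapse into."""
    inverse = {}
    for source, target in mapping.items():
        inverse.setdefault(target, []).append(source)
    return {t: sorted(s) for t, s in inverse.items() if len(s) > 1}


if __name__ == "__main__":
    # Every entry listed here is a distinction that cannot be
    # recovered from the mapped tag alone.
    print(collisions(NEGRA_TO_PTB_STYLE))
```

Any target tag reported by `collisions` stands for several source tags, so the original distinction cannot be recovered after mapping.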

I think Issue 516 (formerly issue 99) is not related, because it is about 
dependency types, whereas this issue is about constituent types.

Original comment by eckle.kohler on 12 Nov 2014 at 1:21

GoogleCodeExporter commented 9 years ago
Issue 517 is about dependencies.
Issue 516 is about constituents.

I might have done a bad job of separating the comments from issue 99 
into the two new issues, though, and there is also some overlap.

Original comment by richard.eckart on 12 Nov 2014 at 2:20

GoogleCodeExporter commented 9 years ago
This issue was updated by revision r3008.

- Added/changed 5 mappings based on expert feedback from JEK

Original comment by richard.eckart on 12 Nov 2014 at 2:42

GoogleCodeExporter commented 9 years ago
I've just created a 1.7.x branch. Should r3008 be merged into it?

Original comment by pedrobss...@gmail.com on 12 Nov 2014 at 2:48

GoogleCodeExporter commented 9 years ago
Yes, please ;)

Original comment by richard.eckart on 12 Nov 2014 at 4:42

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 12 Nov 2014 at 4:52

GoogleCodeExporter commented 9 years ago

Original comment by pedrobss...@gmail.com on 12 Nov 2014 at 4:57