jeisner / treebank-scripts

Suite of scripts for preprocessing the Penn Treebank, primarily to extract lexical subcategorization frames and dependencies.
MIT License

canonindices should canonically order multiple gaps or knobs #6

Open jeisner opened 8 years ago

jeisner commented 8 years ago

[item from the old TO-DO file dated 2002-04-07]

canonindices really should deal with the problem of multiple gaps or knobs on a single constituent: these should be alphabetized or something, with ties broken by where they are first matched.
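
One possible reading of the ordering proposed above, as a minimal Python sketch rather than the actual canonindices code. It assumes each gap or knob on a constituent can be represented as a (label, index) pair and that the pairs arrive in the order they were first matched; the function name `canonical_order` and that representation are hypothetical, not taken from the scripts.

```python
def canonical_order(gaps):
    """Order gaps alphabetically by label, breaking ties by first-match position.

    `gaps` is a list of (label, index) pairs, listed in the order in which
    they were first matched (a hypothetical representation).
    Returns the reordered list and a map from old index to canonical index.
    """
    # Sort key: the label first, then the position in first-match order.
    ranked = sorted(enumerate(gaps), key=lambda pair: (pair[1][0], pair[0]))
    ordered = [gap for _, gap in ranked]

    # Hand out canonical indices 1, 2, ... in the new order.
    renumbering = {}
    for _, old_index in ordered:
        renumbering.setdefault(old_index, len(renumbering) + 1)
    return ordered, renumbering


if __name__ == "__main__":
    ordered, renumbering = canonical_order([("WHNP", 7), ("NP", 3), ("NP", 12)])
    print(ordered)
    print(renumbering)
```

With this key, two NP gaps keep their first-match order relative to each other and both sort ahead of a WHNP gap, so the result does not depend on the arbitrary indices in the input.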

jeisner commented 8 years ago

[item from the old TO-DO file dated 2002-04-07]

listframes should give a canonical numbering to the slashed categories.
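
A canonical numbering could be as simple as renumbering indices in order of first appearance. The sketch below assumes, purely for illustration, that a frame is a string in which slashed categories carry numeric indices written like `S/NP-7`; that notation and the name `renumber_indices` are hypothetical and may not match what listframes actually prints.

```python
import re


def renumber_indices(frame):
    """Renumber the numeric indices in `frame` as 1, 2, ... by first appearance.

    Assumes indices are written as a hyphen followed by digits (hypothetical
    notation). Repeated occurrences of one index map to the same new number.
    """
    mapping = {}

    def replace(match):
        old = match.group(1)
        # The first time an index is seen, assign it the next free number.
        new = mapping.setdefault(old, str(len(mapping) + 1))
        return "-" + new

    return re.sub(r"-(\d+)\b", replace, frame)


if __name__ == "__main__":
    print(renumber_indices("S/NP-7 VP/NP-7 PP/WHNP-12"))
```

Under this scheme, two frames that differ only in the indices the Treebank happened to assign come out identical, which is what makes them countable as one frame.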