Check for other things [in the Treebank] that look like bugs. For example, I saved the following snippet, which detects indexed traces that aren't bound (they show up as not_found in its output).
[1/4/98. Find all the traces in Treebank II, their nonterminal
categories, and the nonterminal categories they end up as. See
~/tmp/move-categories and ~/tmp/move-categories-summary.]
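~/hw/learn/02-subcat-study/extract/oneline -n ~/info/wsj/*/* | perl5 -e '
  $token = "[^ \t\n()]+";
  $ind = "-[0-9]+\\b";                        # coindexation suffix, e.g. -1
  $tokennoind = "(?:(?!$ind)[^ \t\n()])+";    # a token minus any such suffix
  while (<>) {
    s/^(\S+:[0-9]+:\t)?//, $location = $&;    # strip and save the optional location prefix
    # each indexed trace: $1 = parent category, $2 = trace token, $3 = index
    while (/\(($tokennoind)(?:$ind)? \(-NONE- ($tokennoind)($ind)/og) {
      print "$location$2 $1 ";
      # look for a constituent carrying the same index (the binder)
      if (/\(($tokennoind)$3 /) { print "$1\n" } else { print "not_found\n" }
    }
  }' | sort -k 2 | uniq -f 1 -c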
[item from the old TO-DO file dated 2002-04-07]