Chapter 5, Section 2.5 (Verbs): correction

There is an error in the NLTK Book (Natural Language Processing with Python, updated for Python 3 and NLTK 3), Chapter 5 "Categorizing and Tagging Words", Section 2.5 "Verbs":

"To clarify the distinction between VBD (past tense) and VBN (past participle), let's find words which can be both VBD and VBN, and see some surrounding text:

[w for w in cfd1.conditions() if 'VBD' in cfd1[w] and 'VBN' in cfd1[w]]
['Asked', 'accelerated', 'accepted', 'accused', 'acquired', 'added', 'adopted', ...]"

The list comprehension quoted above returns an empty list, because at that point cfd1 still holds the distribution built earlier from treebank.tagged_words() with the universal tagset. It must be regenerated from the corpus with the standard tagset. Insert the following line before the comprehension:

cfd1 = nltk.ConditionalFreqDist(wsj)

The corpus variable wsj was reassigned to the standard tagset just before this example, so this one additional line is all that is required. It rebuilds the conditional frequency distribution with the standard tagset, so that the events 'VBD' and 'VBN' can be found in the distribution (instead of merely 'VERB').
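The effect can be reproduced without downloading the corpus. The following is only a minimal sketch: it assumes a defaultdict of Counter objects as a stand-in for nltk.ConditionalFreqDist, and the helper names (build_cfd, both_vbd_and_vbn) and the toy word lists are hypothetical, not taken from the book.

```python
from collections import Counter, defaultdict


def build_cfd(tagged_words):
    """Map each word to a Counter of its tags.

    Stand-in (assumption) for nltk.ConditionalFreqDist(tagged_words).
    """
    cfd = defaultdict(Counter)
    for word, tag in tagged_words:
        cfd[word][tag] += 1
    return cfd


def both_vbd_and_vbn(cfd):
    """Same test as the book's comprehension: words seen as VBD and as VBN."""
    return [w for w in list(cfd) if 'VBD' in cfd[w] and 'VBN' in cfd[w]]


# Hypothetical toy data: the same words under the two tagsets.
universal_tagged = [('asked', 'VERB'), ('asked', 'VERB'),
                    ('added', 'VERB'), ('yield', 'NOUN')]
standard_tagged = [('asked', 'VBD'), ('asked', 'VBN'),
                   ('added', 'VBN'), ('added', 'VBD'), ('yield', 'NN')]

# Distribution built from the universal tagset: only 'VERB' is present.
stale = both_vbd_and_vbn(build_cfd(universal_tagged))

# Distribution rebuilt from the standard tagset: 'VBD' and 'VBN' are present.
fresh = both_vbd_and_vbn(build_cfd(standard_tagged))

print(stale)  # []
print(fresh)  # contains 'asked' and 'added'
```

With the real corpus, the equivalent step is the single line cfd1 = nltk.ConditionalFreqDist(wsj) given above.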

A minor additional detail: the example result will not be in alphabetical order (as shown in the book text) unless the comprehension is wrapped in the sorted() function.
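As a small illustration of the ordering point, the sketch below sorts a hypothetical unordered result (the variable name unsorted_result is made up, and the words are a few of those from the book's output). Python's default string ordering places capitalized words before lowercase ones, which matches 'Asked' appearing first in the book.

```python
# Hypothetical unordered result, using a few words from the book's output.
unsorted_result = ['added', 'accepted', 'Asked', 'accused']

# sorted() returns a new list; uppercase letters sort before lowercase ones.
ordered = sorted(unsorted_result)

print(ordered)  # ['Asked', 'accepted', 'accused', 'added']
```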
