nltk / nltk_data

NLTK Data

Fix the class for average perceptron tagger list #211

Closed alvations closed 3 months ago

alvations commented 3 months ago

There's a mismatch in element order among set(self.classes), list(self.classes), and repr(self.classes).

It seems crazy, but pickle.load() appears to restore the serialized set in the order given by repr(self.classes). The only way to serialize this to JSON properly is to load the old pkl file and set classes = repr(classes) before the JSON dump.
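A minimal sketch of the conversion described above, not the actual script used for this change. The small `classes` set and the `"classes"` JSON key are hypothetical stand-ins for the tagger's data. It only illustrates that a set cannot be dumped to JSON directly, and that storing its repr string freezes one concrete ordering that can be parsed back later.

```python
import ast
import json
import pickle

# Hypothetical stand-in for the tagger's class set (the real one holds more tags).
classes = {"NN", "VB", "JJ", "DT"}

# Round-trip through pickle, mimicking a load of the old .pkl file.
loaded = pickle.loads(pickle.dumps(classes))

# json.dumps raises TypeError for a set, so it cannot be dumped as-is.
try:
    json.dumps(loaded)
    set_is_json_serializable = True
except TypeError:
    set_is_json_serializable = False

# Store the repr string instead; this fixes one concrete ordering in the file.
payload = json.dumps({"classes": repr(loaded)})

# Reading it back: ast.literal_eval safely parses the set literal.
restored = ast.literal_eval(json.loads(payload)["classes"])
```

Whether the reader of the JSON file should parse the string back into a set, or keep it as an ordered sequence, depends on how the tagger consumes it; the sketch makes no claim about that.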

With this revision, the doctest for the NLTK perceptron tagger in the main repo passes.

alvations commented 3 months ago

Merging this so that the final end-to-end testing of the main nltk repo works.