ptnplanet closed this issue 13 years ago
analyze_tagger_coverage does not support custom corpus reader arguments yet, though I do plan to add support for it in the future. Until then, you must either specify a reader class that does not need custom arguments, or you can specify a known corpus, like conll2000. I'll leave this issue open until custom argument support is added.
Custom corpus readers are now supported, though I'm sure you've found a way around this by now.
Using the command python train_tagger.py conlltest --fileids ned.train --reader nltk.corpus.reader.conll.ConllChunkCorpusReader
results in a similar error. The directory conlltest is a direct copy of conll2002 (for testing purposes), which is working correctly. Is there an argument missing?
loading conlltest
Traceback (most recent call last):
  File "train_tagger.py", line 119, in <module>
    tagged_corpus = load_corpus_reader(args.corpus, reader=args.reader, fileids=args.fileids)
  File "/path/to/nltk-trainer/nltk_trainer/__init__.py", line 89, in load_corpus_reader
    real_corpus = reader_cls(root, fileids, **kwargs)
TypeError: __init__() takes at least 4 arguments (3 given)
@sbrugman Yes, it looks like the ConllChunkCorpusReader requires a chunk_types argument. The simplest fix would be to create a wrapper class you can use as the reader class.
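For anyone landing here later, this is a minimal sketch of the wrapper idea, not code from nltk-trainer. The helper `bind_reader_args`, the module name `myreaders`, and the class name `ConllNerReader` are all made up for illustration. The chunk types shown are the ones NLTK uses for its bundled conll2002 corpus, so adjust them for your own data. The sketch also assumes the `--reader` option can import any dotted class path that is on your Python path.

```python
# Hypothetical helper; not part of nltk or nltk-trainer.
def bind_reader_args(reader_cls, name, **preset):
    """Return a subclass of reader_cls that fills in preset keyword arguments.

    The subclass can be constructed as cls(root, fileids), which is how the
    scripts in the tracebacks in this thread call the reader class.
    """
    def __init__(self, root, fileids, **kwargs):
        # Keyword arguments passed by the caller override the preset ones.
        merged = dict(preset)
        merged.update(kwargs)
        reader_cls.__init__(self, root, fileids, **merged)

    return type(name, (reader_cls,), {"__init__": __init__})


# Intended use, in a module of your own such as myreaders.py (assumed name):
#
#   from nltk.corpus.reader.conll import ConllChunkCorpusReader
#   ConllNerReader = bind_reader_args(
#       ConllChunkCorpusReader, "ConllNerReader",
#       chunk_types=("LOC", "PER", "ORG", "MISC"),  # conll2002 chunk types
#   )
#
# and then on the command line:
#
#   python train_tagger.py conlltest --fileids ned.train --reader myreaders.ConllNerReader
```

A plain hand-written subclass that hard-codes `chunk_types` in its `__init__` would work just as well. The helper only saves repeating that boilerplate if you need several readers with different preset arguments.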
The ConllChunkCorpusReader needs an extra argument, a list of nodetags.
  File "analyze_tagger_coverage.py", line 47, in <module>
    corpus = reader_cls(args.corpus, '.+')
TypeError: __init__() takes at least 4 arguments (3 given)