Closed niklasben closed 7 years ago
TypeError: str() takes at most 1 argument (2 given)
A quick Google search suggests this error most likely points to an encoding problem.
Can you post the relevant lines where you call treetagger and pprint, or your whole tt_testfile.py? Then I can check whether it works for me. Which Python version are you using?
My test file looks like this:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import pprint
from treetagger import TreeTagger
tt = TreeTagger(language='english')
print(tt.tag('What is the airspeed of an unladen swallow?'))
pprint.pprint(tt.tag('What is the airspeed of an unladen swallow?'))
Mine is basically the same as yours:
# -*- coding: utf-8 -*-
from treetagger import TreeTagger
from pprint import pprint
tt = TreeTagger(language='english')
pprint(tt.tag('What is the airspeed of an unladen swallow?'))
I am using Python 2.7.6.
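OK, I understand this error now. Python 2 had major problems with the conversion between Unicode, ASCII, and UTF-8. In particular, bytes is just an alias for str in Python 2, and str() accepts only one argument, so bytes(_input, 'UTF-8') raises exactly this TypeError.
I made a small change to the current code. As long as your input contains no non-ASCII characters, such as German umlauts, it may also work with Python 2. Otherwise, you get a UnicodeDecodeError under Python 2.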
#(stdout, stderr) = p.communicate(bytes(_input, 'UTF-8'))
(stdout, stderr) = p.communicate(str(_input).encode('utf-8'))
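If the input has to work under both Python 2 and Python 3, including text with umlauts, a small helper might be a safer route than str(_input).encode('utf-8'). The following is only a sketch: to_utf8_bytes is a hypothetical name and not part of the treetagger module, and it assumes that any byte string passed in is already UTF-8 encoded.

```python
# -*- coding: utf-8 -*-

def to_utf8_bytes(text):
    """Return UTF-8 bytes for text on Python 2 and Python 3.

    Hypothetical helper, not part of the treetagger module.
    """
    if isinstance(text, bytes):
        # Already a byte string (on Python 2 a plain str lands here).
        # Assumption: such input is already UTF-8 encoded.
        return text
    # Python 3 str or Python 2 unicode: encode explicitly.
    return text.encode('utf-8')


encoded = to_utf8_bytes(u'Grüße aus Köln')
print(repr(encoded))
```

The call in the wrapper would then read p.communicate(to_utf8_bytes(_input)), which avoids both the two-argument bytes() call and the implicit ASCII conversion that str() performs on Python 2.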
You can also look at the old treetagger version for Python 2: treetagger_python2.py
I would say you should try it with Python 3.
The changed line of code doesn't work for me. I am still getting the same error message.
With Python 3 I am getting: TreeTagger parameter file invalid: german-utf8.par
It is the same with the English version. Anyway, I will try to get it to work.
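One thing that may be worth ruling out for the parameter file message is whether the .par file is actually present in the directory being searched. The sketch below only lists the parameter files in a directory you pass in. The function name list_parameter_files and the idea that a missing file is the cause are assumptions, not something confirmed in this thread.

```python
import os


def list_parameter_files(directory):
    """Return the sorted names of all .par files in directory.

    Hypothetical diagnostic helper; returns an empty list if the
    directory does not exist.
    """
    if not os.path.isdir(directory):
        return []
    return sorted(
        name for name in os.listdir(directory) if name.endswith('.par')
    )


def has_parameter_file(directory, filename):
    """Check whether filename (e.g. 'german-utf8.par') is in directory."""
    return filename in list_parameter_files(directory)
```

Calling has_parameter_file with the directory of your TreeTagger parameter files and 'german-utf8.par' shows quickly whether the file exists under exactly that name.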
Sorry it took so long, I tried to reproduce this on another client today.
Long story short: I wasn't able to reproduce the error on this machine. So I'd say it is caused by some configuration problem on the other client, and I will therefore close the issue.
I am going to write a bigger test script when I have the time. Until then, this is the short script and its output from the terminal:
>>> import pprint
>>> import treetaggerwrapper
>>> tagger = treetaggerwrapper.TreeTagger(TAGLANG='de')
>>> tags = tagger.tag_text(u"Dies ist ein kurzer Satz zum Testen.")
>>> pprint.pprint(tags)
[u'Dies\tPDS\tdies',
u'ist\tVAFIN\tsein',
u'ein\tART\teine',
u'kurzer\tADJA\tkurz',
u'Satz\tNN\tSatz',
u'zum\tAPPRART\tzu',
u'Testen\tNN\tTesten',
u'.\t$.\t.']
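Each entry in the output above is a single string with word, part-of-speech tag, and lemma separated by tabs. A minimal sketch for splitting such strings into named fields could look like this. Token and parse_tags are hypothetical names chosen for illustration, and lines that do not have exactly three fields are skipped.

```python
from __future__ import print_function

from collections import namedtuple

# Hypothetical container for one tagged token.
Token = namedtuple('Token', ['word', 'pos', 'lemma'])


def parse_tags(lines):
    """Split 'word<TAB>pos<TAB>lemma' strings into Token tuples."""
    tokens = []
    for line in lines:
        parts = line.split('\t')
        if len(parts) == 3:
            tokens.append(Token(*parts))
    return tokens


tags = [u'Dies\tPDS\tdies', u'ist\tVAFIN\tsein', u'.\t$.\t.']
for token in parse_tags(tags):
    print(token.word, token.pos, token.lemma)
```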
When executing it, I get the following error message.
The error points to this line:
(stdout, stderr) = p.communicate(bytes(_input, 'UTF-8'))