Closed iamyihwa closed 4 years ago
Did you change any environment or code besides encoding='utf8'?
If not, could you share your environment?
I am having trouble reproducing this error.
Hi, I didn't change anything other than that. Yesterday I tried replacing super(IMDB, self) with super() (in two places, __init__ and splits) and it seems to work. Is this the right way to do it? I have no idea.
I used anaconda3 and python 3.6. I actually used torchtext under an environment called fastai (it is a deep learning course and they have their own tools). However, the place where the error arises is fairly independent from the rest of the code.
@iamyihwa Are you running the code in a Jupyter notebook and have not restarted the kernel? If so, there's a chance that your kernel is referencing the wrong IMDB dataset class when super(IMDB, self) is being called, causing an error.
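The stale-class explanation above can be reproduced without torchtext. The following is a minimal sketch, not the library's code: Base, IMDB, and SafeIMDB are stand-in classes, and the second pair of class statements simulates a module being re-executed in a running kernel. It also suggests why switching to the zero-argument super() appeared to help, since that form does not look up the class by its global name.

```python
# Stand-in classes (hypothetical, not torchtext's real IMDB dataset).
class Base:
    def __init__(self):
        self.ready = True

class IMDB(Base):
    def __init__(self):
        # The name IMDB is looked up in globals each time this runs.
        super(IMDB, self).__init__()

class SafeIMDB(Base):
    def __init__(self):
        # Zero-argument super() uses the class the method was defined in.
        super().__init__()

# References kept from before the "reload", as a notebook might hold.
OldIMDB, OldSafe = IMDB, SafeIMDB

# Simulate re-executing the module: the global names now point to new classes.
class IMDB(Base):
    def __init__(self):
        super(IMDB, self).__init__()

class SafeIMDB(Base):
    def __init__(self):
        super().__init__()

try:
    OldIMDB()  # old __init__ finds the NEW IMDB, which self is not an instance of
    message = None
except TypeError as exc:
    message = str(exc)

fixed_ok = OldSafe().ready  # the zero-argument form still works on the old class

print(message)
print(fixed_ok)
```

If the error persists after a genuine kernel restart, a stale class is unlikely to be the only cause, and a second installed copy of the package is worth checking for.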
I am experiencing the exact same errors. From a fastai notebook, a cell executes: splits = torchtext.datasets.IMDB.splits(TEXT, IMDB_LABEL, 'data/') This caused the encoding issue at line 32 of imdb.py, which was:
with open(fname, 'r') as f:
I changed this line to
with open(fname, 'r', encoding="utf-8") as f:
After that I got the same error that @iamyihwa got:
TypeError: super(type, obj): obj must be an instance or subtype of type
I confirm that I had restarted the kernel.
Are we doing something improper by invoking a class method like: splits = torchtext.datasets.IMDB.splits(TEXT, IMDB_LABEL, 'data/')?
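The encoding change described in this comment can be sketched in isolation. This is only an illustration under assumptions: read_lines, the sample text, and the temporary file are made up here and are not part of imdb.py. The point is that passing encoding explicitly makes the read independent of the platform default, which is not UTF-8 on every system.

```python
import os
import tempfile

def read_lines(fname):
    # Explicit encoding, mirroring the edited open() call quoted above.
    with open(fname, 'r', encoding='utf-8') as f:
        return f.read().splitlines()

# Hypothetical sample containing non-ASCII characters.
sample = "café crème, naïve résumé"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "review.txt")
    with open(path, 'w', encoding='utf-8') as f:
        f.write(sample + "\n")
    lines = read_lines(path)

print(lines)
```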
@iamyihwa and all others who may read this.
To fix this error, clone this repository, i.e. https://github.com/pytorch/text,
and install torchtext from there (python setup.py install --force).
It has updates that release 2.0.1 does not cover. (updates about encoding are not limited to imdb.py but involve dataset.py, field.py etc.)
I can confirm that after this install all my errors, the encoding error as well as TypeError: super(type, obj): obj must be an instance or subtype of type, went away.
I cloned the repository (master branch), forced install, restarted the kernel, but I still have the same errors. Changing the encoding got me to the super() error.
How do I guarantee that I'm running the updated version?
I'm on a Windows 10 machine, also in a Jupyter Notebook.
UPDATE: I've just managed to fix it. It was necessary to uninstall the older version with pip uninstall torchtext. The older version was installed directly in site-packages and was taking precedence over the newer one installed with python setup.py install. SOLVED!
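One way to see which copy of a package Python would actually import, as in the situation described above, is to ask the import system for the file location. This is a rough sketch: package_origin is a hypothetical helper name, and "json" is used only because it is always available. The same call with "torchtext" would show whether the site-packages copy or the freshly installed one is picked up.

```python
import importlib.util

def package_origin(name):
    """Return the file a top-level import of `name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

# Standard-library package, used here so the sketch runs anywhere.
print(package_origin("json"))

# Prints a path only if torchtext is installed in the current environment.
origin = package_origin("torchtext")
if origin is not None:
    print(origin)
```

In a notebook, the kernel needs a restart before the check reflects a reinstall.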
Feel free to re-open the issue if you still have questions.
Hello, when loading the IMDB dataset, since I am using Python 3, I replaced the open call with open(file, encoding = 'utf8'). However, after that this error arises. I have no idea how to solve this issue.
TypeError Traceback (most recent call last)