Upgrade to torchtext 0.3 - Githubissues

OpenNMT / OpenNMT-py

Open Source Neural Machine Translation and (Large) Language Models in PyTorch

https://opennmt.net/

MIT License


Upgrade to torchtext 0.3 #767

Closed yukunfeng closed 6 years ago

yukunfeng commented 6 years ago

pytorch: 0.4 torchtext: 0.3

Hello, I get an error when running the first preprocessing step from here.

Traceback (most recent call last):
  File "preprocess.py", line 204, in <module>
    main()
  File "preprocess.py", line 191, in main
    fields = onmt.io.get_fields(opt.data_type, src_nfeats, tgt_nfeats)
  File "/home/lr/yukun/OpenNMT-py/onmt/io/IO.py", line 44, in get_fields
    return TextDataset.get_fields(n_src_features, n_tgt_features)
  File "/home/lr/yukun/OpenNMT-py/onmt/io/TextDataset.py", line 229, in get_fields
    postprocessing=make_src, sequential=False)
TypeError: __init__() got an unexpected keyword argument 'tensor_type'

It seems that torchtext 0.3 has changed the parameters of torchtext.data.Field().

With torchtext 0.2.3 it works well, except for the following warning:

xxx/torchtext/data/field.py:321: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.

It seems the new version of torchtext has fixed this from here. I am new to PyTorch and OpenNMT, and I don't know whether this warning matters.

srush commented 6 years ago

For now we recommend torchtext 0.2.1. We will try to support the new version.

vince62s commented 6 years ago

Hi guys, I ran into this issue too. There is not much to adjust, but there are a few breaking changes that would require installing torchtext from source with pip install git+git://github.com/pytorch/text. I will add these changes in #762, since I have already made them. But in the end, yes: full compatibility with pytorch 0.4 requires torchtext 0.3.

yukunfeng commented 6 years ago

Oh, thanks to you both!

ciphurus commented 5 years ago

So the solution here is to use torchtext 0.2.1?

vince62s commented 5 years ago

No, you need torchtext >= 0.3.0.
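
One way to check this requirement before running preprocess.py is to compare the installed version against the minimum. The sketch below is illustrative only: version_tuple, installed_version and meets_minimum are hypothetical helper names, not part of OpenNMT-py or torchtext, and importlib.metadata is assumed to be available (Python 3.8 or later).

```python
import re
from importlib import metadata


def version_tuple(version):
    """Return the leading numeric components, e.g. '1.0.0.dev1' -> (1, 0, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev1'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum):
    """True if version >= minimum, padding the shorter one with zeros."""
    have, need = version_tuple(version), version_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("torchtext")
if found is None:
    print("torchtext is not installed")
elif not meets_minimum(found, "0.3.0"):
    print("torchtext %s is older than 0.3.0" % found)
else:
    print("torchtext %s satisfies >= 0.3.0" % found)
```

The comparison ignores pre-release suffixes, so a development build such as 1.0.0.dev20181014 is treated as 1.0.0.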

ciphurus commented 5 years ago

I get this error with torchtext = 0.3.1 and torch = 1.0.0.dev20181014. Am I doing something wrong?

Actually, I am sorry. I get this error with LABEL = Field(sequential=False, use_vocab=False, tensor_type=torch.FloatTensor), which is different from the issue here. I am guessing tensor_type is no longer an accepted argument. What's the best way to change the code correctly? Thanks for the help; I know this is probably not related to the issue here.

vince62s commented 5 years ago

Yes, it's dtype now. But please don't do this: open a new issue when you face something like this, otherwise it's a waste of time for you and for us. Thanks.

ritchieng commented 5 years ago

@ciphurus Just use dtype=torch.long or dtype=torch.float, in this format:

import torch
from torchtext import data

text = data.Field(dtype=torch.long)
label = data.LabelField(dtype=torch.float)
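
For code that has to run against both the old and the new argument name, a small adapter can rename the keyword at call time. This is only a sketch under stated assumptions: adapt_field_kwargs is a hypothetical helper, and NewStyleField and OldStyleField are stand-ins that mimic the two signatures so the example runs without torchtext installed.

```python
import inspect


def adapt_field_kwargs(field_cls, kwargs, type_map=None):
    """Return a copy of kwargs with tensor_type renamed to dtype when needed.

    The rename happens only if field_cls.__init__ accepts dtype but not
    tensor_type. type_map optionally translates old values to new ones.
    """
    params = inspect.signature(field_cls.__init__).parameters
    adapted = dict(kwargs)
    needs_rename = (
        "tensor_type" in adapted
        and "tensor_type" not in params
        and "dtype" in params
    )
    if needs_rename:
        old_value = adapted.pop("tensor_type")
        adapted["dtype"] = (type_map or {}).get(old_value, old_value)
    return adapted


class NewStyleField:
    """Stand-in for a Field class that takes dtype (not the real torchtext class)."""

    def __init__(self, sequential=True, use_vocab=True, dtype=None):
        self.sequential = sequential
        self.use_vocab = use_vocab
        self.dtype = dtype


class OldStyleField:
    """Stand-in for a Field class that still takes tensor_type."""

    def __init__(self, sequential=True, use_vocab=True, tensor_type=None):
        self.sequential = sequential
        self.use_vocab = use_vocab
        self.tensor_type = tensor_type


# Strings stand in for the torch types so the sketch has no dependencies.
old_kwargs = {"sequential": False, "use_vocab": False, "tensor_type": "FloatTensor"}
new_kwargs = adapt_field_kwargs(NewStyleField, old_kwargs, {"FloatTensor": "float"})
label = NewStyleField(**new_kwargs)
```

With torch installed, type_map could be {torch.LongTensor: torch.long, torch.FloatTensor: torch.float}, and field_cls would be the real Field class.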

sahinurlaskar commented 5 years ago

I got an error in the preprocessing step. Please help.

Extracting features...
number of source features: 0.
number of target features: 0.
Building Fields object...
Traceback (most recent call last):
  File "preprocess.py", line 193, in <module>
    main()
  File "preprocess.py", line 180, in main
    fields = onmt.io.get_fields(opt.data_type, src_nfeats, tgt_nfeats)
  File "/content/OpenNMT-py/onmt/io/IO.py", line 43, in get_fields
    return TextDataset.get_fields(n_src_features, n_tgt_features)
  File "/content/OpenNMT-py/onmt/io/TextDataset.py", line 231, in get_fields
    postprocessing=make_src, sequential=False)
TypeError: __init__() got an unexpected keyword argument 'tensor_type'

vince62s commented 5 years ago

Your traceback shows /content/OpenNMT-py/onmt/io. You are on a very, very old version of onmt-py.

sahinurlaskar commented 5 years ago

OK, now I am facing this issue while preprocessing on my local machine:

_pickle.PicklingError: Can't pickle <function Field.<lambda> at 0x7f0d1f050e18>: attribute lookup Field.<lambda> on torchtext.data.field failed

sahinurlaskar commented 5 years ago

Solved the issues above by installing all the required packages.

sahinurlaskar commented 5 years ago

Thanks for the advice.

sahinurlaskar commented 5 years ago

How do I apply subword-nmt to a tokenized file, as mentioned in the preprocessing step? https://github.com/iacercalixto/MultimodalNMT

sahinurlaskar commented 5 years ago

How can I fix this error?

RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/torch/csrc/cuda/Module.cpp:33
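
The message indicates that PyTorch was asked to use a GPU on a machine where no CUDA device is visible. A minimal sketch of falling back to the CPU in that case follows. choose_device and cuda_is_available are hypothetical helpers, and how the GPU is requested in the training script is not shown in this thread, so the gpu_ids argument is an assumption.

```python
def cuda_is_available():
    """Report whether torch can see a CUDA device; False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def choose_device(cuda_available, gpu_ids):
    """Return a device string: the first requested GPU if usable, else 'cpu'."""
    if gpu_ids and cuda_available:
        return "cuda:%d" % gpu_ids[0]
    return "cpu"


# Ask for GPU 0, but fall back to the CPU when no CUDA device is detected.
device = choose_device(cuda_is_available(), [0])
print("running on", device)
```

The resulting string can be passed to torch.device(...) when torch is installed.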

sahinurlaskar commented 5 years ago

Extracting features...
number of source features: 0.
number of target features: 0.
Building Fields object...
Traceback (most recent call last):
  File "preprocess.py", line 193, in <module>
    main()
  File "preprocess.py", line 180, in main
    fields = onmt.io.get_fields(opt.data_type, src_nfeats, tgt_nfeats)
  File "/content/MultimodalNMT/onmt/io/IO.py", line 43, in get_fields
    return TextDataset.get_fields(n_src_features, n_tgt_features)
  File "/content/MultimodalNMT/onmt/io/TextDataset.py", line 218, in get_fields
    postprocessing=make_src, sequential=False)
TypeError: __init__() got an unexpected keyword argument 'tensor_type'

sahinurlaskar commented 5 years ago

Using cached https://files.pythonhosted.org/packages/ee/67/f403d4ae6e9cd74b546ee88cccdb29b8415a9c1b3d80aebeb20c9ea91d96/pytorch-1.0.2.tar.gz
Building wheels for collected packages: pytorch
  Building wheel for pytorch (setup.py) ... error
  ERROR: Failed building wheel for pytorch
  Running setup.py clean for pytorch
Failed to build pytorch
Installing collected packages: pytorch
  Running setup.py install for pytorch ... error
ERROR: Command "/usr/bin/python3 -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-kuybhq1y/pytorch/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-seag18hw/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-kuybhq1y/pytorch/

sahinurlaskar commented 5 years ago

Traceback (most recent call last):
  File "preprocess.py", line 11, in <module>
    import onmt.io
  File "/content/MultimodalNMT/onmt/__init__.py", line 1, in <module>
    import onmt.io
  File "/content/MultimodalNMT/onmt/io/__init__.py", line 1, in <module>
    from onmt.io.IO import collect_feature_vocabs, make_features, \
  File "/content/MultimodalNMT/onmt/io/IO.py", line 7, in <module>
    import torchtext.data
  File "/usr/local/lib/python3.6/dist-packages/torchtext/__init__.py", line 1, in <module>
    from . import data
  File "/usr/local/lib/python3.6/dist-packages/torchtext/data/__init__.py", line 4, in <module>
    from .field import RawField, Field, ReversibleField, SubwordField, NestedField, LabelField
  File "/usr/local/lib/python3.6/dist-packages/torchtext/data/field.py", line 61, in <module>
    class Field(RawField):
  File "/usr/local/lib/python3.6/dist-packages/torchtext/data/field.py", line 118, in Field
    torch.float32: float,
AttributeError: module 'torch' has no attribute 'float32'

sahinurlaskar commented 5 years ago

Extracting features...
number of source features: 0.
number of target features: 0.
Building Fields object...
Building & saving training data...
saving train data shard to data/m30k.train.1.pt.
Building & saving vocabulary...
reloading data/m30k.train.1.pt.
tgt vocab size: 4.
src vocab size: 2.
Building & saving validation data...
saving valid data shard to data/m30k.valid.1.pt.

sahinurlaskar commented 5 years ago

Loading train dataset from data/m30k.train.1.pt, number of examples: 0
data_type: text
Traceback (most recent call last):
  File "train_mm.py", line 448, in <module>
    main()
  File "train_mm.py", line 428, in main
    fields = load_fields(first_dataset, data_type, checkpoint)
  File "train_mm.py", line 329, in load_fields
    fields = dict([(k, f) for (k, f) in fields.items()
  File "train_mm.py", line 330, in <listcomp>
    if k in dataset.examples[0].__dict__])
IndexError: list index out of range

sahinurlaskar commented 5 years ago

Can anyone help? After the preprocessing step, when running the training command in the multimodal NMT step, I get the error below. In the preprocessing step, after tokenization I applied BPE (https://nlp.h-its.org/bpemb/#download), then ran the preprocessing command as given in Step 1.

Loading train dataset from data.train.1.pt, number of examples: 0
data_type: text
Traceback (most recent call last):
  File "train_mm.py", line 448, in <module>
    main()
  File "train_mm.py", line 428, in main
    fields = load_fields(first_dataset, data_type, checkpoint)
  File "train_mm.py", line 329, in load_fields
    fields = dict([(k, f) for (k, f) in fields.items()
  File "train_mm.py", line 330, in <listcomp>
    if k in dataset.examples[0].__dict__])
IndexError: list index out of range
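
The log reports number of examples: 0, so dataset.examples[0] has nothing to index and the IndexError follows. A minimal sketch of failing earlier with a clearer message is shown here, assuming examples are plain objects with attributes. filter_fields is a hypothetical stand-in for the filtering done in load_fields, not the project's code.

```python
from types import SimpleNamespace


def filter_fields(fields, examples):
    """Keep only the fields whose names appear as attributes on the first example."""
    if not examples:
        raise ValueError(
            "dataset has no examples; check that the preprocessing inputs are not empty"
        )
    present = vars(examples[0])
    return {name: field for name, field in fields.items() if name in present}


# Placeholder field objects and one fake example with src and tgt attributes.
fields = {"src": "src-field", "tgt": "tgt-field", "indices": "idx-field"}
examples = [SimpleNamespace(src=["a"], tgt=["b"])]
kept = filter_fields(fields, examples)

# An empty shard now produces an explicit error instead of an IndexError.
try:
    filter_fields(fields, [])
    message = None
except ValueError as err:
    message = str(err)
```

The guard does not explain why the shard is empty; it only surfaces the problem at the point where the data is loaded.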

sahinurlaskar commented 5 years ago

Traceback (most recent call last):
  File "extract_image_features.py", line 11, in <module>
    from onmt.PretrainedCNNModels import PretrainedCNN
  File "/content/MultimodalNMT/onmt/PretrainedCNNModels.py", line 8, in <module>
    import pretrainedmodels
  File "/usr/local/lib/python3.6/dist-packages/pretrainedmodels/__init__.py", line 3, in <module>
    from . import models
  File "/usr/local/lib/python3.6/dist-packages/pretrainedmodels/models/__init__.py", line 19, in <module>
    from .torchvision_models import alexnet
  File "/usr/local/lib/python3.6/dist-packages/pretrainedmodels/models/torchvision_models.py", line 3, in <module>
    import torchvision.models as models
  File "/usr/local/lib/python3.6/dist-packages/torchvision/__init__.py", line 1, in <module>
    from torchvision import models
  File "/usr/local/lib/python3.6/dist-packages/torchvision/models/__init__.py", line 11, in <module>
    from . import detection
  File "/usr/local/lib/python3.6/dist-packages/torchvision/models/detection/__init__.py", line 1, in <module>
    from .faster_rcnn import *
  File "/usr/local/lib/python3.6/dist-packages/torchvision/models/detection/faster_rcnn.py", line 7, in <module>
    from torchvision.ops import misc as misc_nn_ops
  File "/usr/local/lib/python3.6/dist-packages/torchvision/ops/__init__.py", line 1, in <module>
    from .boxes import nms, box_iou
  File "/usr/local/lib/python3.6/dist-packages/torchvision/ops/boxes.py", line 2, in <module>
    from torchvision import _C
ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory

sahinurlaskar commented 5 years ago

How do I solve the issue above?

ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory
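
The ImportError means the dynamic loader could not find the CUDA 9.0 runtime library that the installed torchvision build expects. A quick way to confirm that from Python is to try loading the library directly. This is a diagnostic sketch only, and can_load is a hypothetical helper name.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


# The library name is taken from the ImportError above.
has_cuda_runtime = can_load("libcudart.so.9.0")
print("libcudart.so.9.0 loadable:", has_cuda_runtime)

# A name that should not exist anywhere, to show the negative case.
missing = can_load("libsurely_not_installed_xyz.so.0")
```

If the check prints False, the installed CUDA runtime and the torchvision build do not match, or the library directory is not on the loader's search path.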

sahinurlaskar commented 5 years ago

IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number

(This error occurs during training.)
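
As the message says, a 0-dim tensor cannot be indexed; since PyTorch 0.4 a scalar such as a loss is read with .item() rather than with [0]. A minimal sketch follows, with scalar_value as a hypothetical helper and FakeScalar as a stand-in for a 0-dim tensor so the example runs without torch.

```python
def scalar_value(x):
    """Return a Python number from a scalar-like object."""
    if hasattr(x, "item"):
        return x.item()  # 0-dim tensors expose .item()
    return x[0]  # older one-element containers are indexed instead


class FakeScalar:
    """Stand-in for a 0-dim tensor: .item() works, indexing raises."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value

    def __getitem__(self, index):
        raise IndexError("invalid index of a 0-dim tensor")


loss = FakeScalar(2.5)
loss_value = scalar_value(loss)       # uses .item()
legacy_value = scalar_value([0.75])   # a plain list has no .item(), so [0] is used
```

In training code this usually amounts to replacing expressions like loss.data[0] with loss.item().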

sahinurlaskar commented 5 years ago

IndexError: index 28931 is out of bounds for axis 0 with size 28929

Eurus-Holmes commented 4 years ago

Extracting features...
number of source features: 0.
number of target features: 0.
Building Fields object...
Traceback (most recent call last):
  File "preprocess.py", line 193, in <module>
    main()
  File "preprocess.py", line 180, in main
    fields = onmt.io.get_fields(opt.data_type, src_nfeats, tgt_nfeats)
  File "/content/MultimodalNMT/onmt/io/IO.py", line 43, in get_fields
    return TextDataset.get_fields(n_src_features, n_tgt_features)
  File "/content/MultimodalNMT/onmt/io/TextDataset.py", line 218, in get_fields
    postprocessing=make_src, sequential=False)
TypeError: __init__() got an unexpected keyword argument 'tensor_type'

Same error, have you solved it?

Eurus-Holmes commented 4 years ago

Fixed!

pip uninstall torchtext
pip install torchtext==0.2.3