pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

ImportError: cannot import name 'Field' from 'torchtext.data' #2183

Open MrMoe830 opened 1 year ago

MrMoe830 commented 1 year ago

❓ Questions and Help

Description: I'm using PyTorch 2.0.0 with torchtext 0.15.2. When I import Field and BucketIterator (from torchtext.data import Field, BucketIterator), I get this error: ImportError: cannot import name 'Field' from 'torchtext.data' (D:\ML_Pytorch\venv\lib\site-packages\torchtext\data\__init__.py)

Where did Field go? If Field has been removed, is there similar functionality that can be imported instead?

ankitbatra22 commented 1 year ago

It seems they were deprecated in favour of the torchtext datasets and vocab classes, which have a much simpler API.

Now you can do something like:

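from typing import Iterable, Iterator, List

from torchtext.data.utils import get_tokenizer
from torchtext.datasets import Multi30k
from torchtext.vocab import build_vocab_from_iterator

# spaCy tokenizer; requires spaCy and its English model to be installed
tokenizer = get_tokenizer('spacy', language='en_core_web_sm')

# Generator that yields one list of tokens per example
def yield_tokens(data_iter: Iterable) -> Iterator[List[str]]:
    # Multi30k yields (German, English) pairs by default, so tokenize the English side
    for _, text in data_iter:
        yield tokenizer(text)

# Load dataset
train_iter, val_iter, test_iter = Multi30k()

# Build the vocab. <unk> is a special token used for out-of-vocabulary words
vocab = build_vocab_from_iterator(yield_tokens(train_iter), specials=["<unk>"])
vocab.set_default_index(vocab["<unk>"])

# Data as lists of token indices
text_pipeline = lambda x: vocab(tokenizer(x))
# Only relevant for datasets with 1-based integer labels (e.g. AG_NEWS); Multi30k has none
label_pipeline = lambda x: int(x) - 1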
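
The question also mentions BucketIterator, which grouped examples of similar length into the same batch to reduce padding. As far as I know there is no one-line replacement for it in the newer API, so here is a minimal plain-Python sketch of the idea. The function name bucket_batches, the sample data, and the padding index 0 are all assumptions made for illustration; in practice you would probably add a "<pad>" special to the vocab, and the old BucketIterator also shuffled batches, which this sketch does not.

```python
from typing import Iterator, List, Sequence


def bucket_batches(
    sequences: Sequence[List[int]],
    batch_size: int,
    pad_index: int = 0,  # assumed padding index, not taken from the thread
) -> Iterator[List[List[int]]]:
    """Yield padded batches made of similar-length sequences."""
    # Sort by length so each batch needs as little padding as possible.
    ordered = sorted(sequences, key=len)
    for start in range(0, len(ordered), batch_size):
        batch = ordered[start:start + batch_size]
        width = max(len(seq) for seq in batch)
        # Right-pad every sequence to the longest length in this batch.
        yield [seq + [pad_index] * (width - len(seq)) for seq in batch]


# Hypothetical token-index sequences, e.g. outputs of text_pipeline.
encoded = [[5, 2, 9], [7], [4, 4, 4, 4], [3, 1]]
batches = list(bucket_batches(encoded, batch_size=2))
```

Each inner batch can then be turned into a tensor (for example with torch.tensor) inside a DataLoader collate function.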

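On older releases (torchtext 0.9 through 0.11) you could still import Field with from torchtext.legacy.data import Field. The legacy module was removed in 0.12, so that import is not available on 0.15.2.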
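
Where Field lives depends on the installed torchtext version: as far as I know it sat in torchtext.data before 0.9, moved to torchtext.legacy.data in 0.9, and was removed in 0.12. If a script has to run against several versions, a small probe like the one below can check which location (if any) is available. The helper name find_field_class is hypothetical, and this is only a sketch rather than a recommended long-term fix.

```python
import importlib
from typing import Optional


def find_field_class() -> Optional[type]:
    """Return the torchtext Field class if this install still ships it, else None."""
    # Candidate locations, newest API layout first.
    for module_name in ("torchtext.data", "torchtext.legacy.data"):
        try:
            module = importlib.import_module(module_name)
        except (ImportError, OSError):
            # Module missing (or torchtext not installed / failed to load).
            continue
        field = getattr(module, "Field", None)
        if field is not None:
            return field
    return None


field_cls = find_field_class()
if field_cls is None:
    print("Field is not available; use the vocab/tokenizer API instead.")
```

On torchtext 0.15.2 this is expected to return None, in which case the vocab and tokenizer approach is the way forward.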

Kousik-Sasmal commented 1 year ago

@MrMoe830 Field is deprecated. You can check one of my repos for an example that uses the updated torchtext API. Link: https://github.com/Kousik-Sasmal/experiment-with-pytorch-torchtext