pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

'NoneType' object has no attribute 'Lock' #2172

Open keremnymn opened 1 year ago

keremnymn commented 1 year ago

🐛 Bug

Describe the bug

Looping through the split dataset raises AttributeError: 'NoneType' object has no attribute 'Lock'. This exception is thrown by __iter__ of _MemoryCellIterDataPipe(remember_elements=1000, source_datapipe=_ChildDataPipe).

To Reproduce Steps to reproduce the behavior:

You can try to execute the function from the official documentation, which is on this page:

# import datasets
from torchtext.datasets import IMDB

train_iter = IMDB(split='train')

def tokenize(label, line):
    return line.split()

tokens = []
for label, line in train_iter:
    tokens += tokenize(label, line)

I tried both on my local environment and Google Colab. Same error.

zhangdanq commented 1 year ago

Hi, I have run into the same problem. Have you solved it yet?

mnjkhtri commented 1 year ago

Updates?

michaeldengxyz commented 10 months ago

torch.__version__ = 2.1.0+cpu
torchtext.__version__ = 0.16.0+cpu
sys.version = 3.10.9 | packaged by Anaconda, Inc. | (main, Mar 8 2023, 10:42:25) [MSC v.1916 64 bit (AMD64)]

I got the same problem when I ran this code:

train, test = IMDB(split=('train', 'test'))
counter = Counter()
for (label, line) in train:
    print(label, line)
    counter.update(tokenizer(line))
vocab = Vocab(counter, min_freq=10, specials=('', '', '', ''))

Then I changed the code to pass a local dataset directory as root:

imdb_root = r'T:\DeepLearningwithPyTorch_Code\DLwithPyTorch-master\Chapter06\aclImdb'
train, test = IMDB(root=imdb_root, split=('train', 'test'))
counter = Counter()
for (label, line) in train:
    print(label, line)
    counter.update(tokenizer(line))
vocab = Vocab(counter, min_freq=10, specials=('', '', '', ''))

Now the error 'NoneType' object has no attribute 'Lock' no longer occurs.

cbowdon commented 9 months ago

I resolved this in torchtext 0.16.0 by installing portalocker==2.8.2 and then restarting the kernel of my Jupyter notebook.

IMO portalocker should be an explicit dependency, as in #2182.
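A quick way to see whether this fix applies is to check if portalocker is importable in the environment that runs the notebook. The sketch below is only an illustration: the helper name missing_modules is hypothetical, and it uses nothing but the standard library.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Check for the package mentioned in the comment above.
missing = missing_modules(["portalocker"])
if missing:
    print("Not installed:", ", ".join(missing))
    print("Install it (for example: pip install portalocker), then restart the kernel.")
else:
    print("portalocker is importable in this environment.")
```

If the check reports that portalocker is importable but the error persists, the running kernel may have been started before the package was installed, which would be consistent with the restart step described above.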

gautamvarmadatla commented 5 months ago

I just tried using this tutorial https://pytorch.org/text/stable/tutorials/sst2_classification_non_distributed.html#data-transformation

I am still getting the same error when trying to iterate over the dataloader:

AttributeError                            Traceback (most recent call last)
in <cell line: 1>()
----> 1 for i in train_dataloader:
      2     print(i)

75 frames
/usr/local/lib/python3.10/dist-packages/torchdata/datapipes/iter/util/cacheholder.py in _cache_check_fn(data, filepath_fn, hash_dict, hash_type, extra_check_fn, cache_uuid)
    261         os.makedirs(dirname)
    262
--> 263     with portalocker.Lock(promise_filepath, "a+", flags=portalocker.LockFlags.EXCLUSIVE) as promise_fh:
    264         promise_fh.seek(0)
    265         data = promise_fh.read()

AttributeError: 'NoneType' object has no attribute 'Lock'
This exception is thrown by __iter__ of _MemoryCellIterDataPipe(remember_elements=1000, source_datapipe=_ChildDataPipe)

I tried restarting the kernel, but that didn't work!
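The traceback shows the name portalocker evaluating to None at the point where portalocker.Lock is called. One common pattern that produces exactly this message is an optional import that falls back to None when the package is missing. Whether torchdata does precisely this is an assumption here, not something confirmed in this thread. The sketch below reproduces the message with a hypothetical module name chosen so that the import fails:

```python
import importlib


def optional_import(name):
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical module name, used only so that the import fails.
locker = optional_import("hypothetical_missing_locker")

message = None
try:
    # Mirrors the shape of the failing call in the traceback above.
    locker.Lock("promise.file", "a+")
except AttributeError as exc:
    message = str(exc)

print(message)
```

Under that assumption, restarting the kernel would only help after the missing package has actually been installed into the same environment the kernel uses.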