Closed phosseini closed 2 years ago
Apparently, this error was caused by a torch version incompatibility. I installed torch==1.2.0 and the error was resolved.
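For reference, pinning the working version is a one-line install; torch==1.2.0 is the version reported to work in this thread, but the right pin may differ for your CUDA setup:

```shell
# Replace the currently installed (incompatible) torch with the pinned version
pip install torch==1.2.0
```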
This may be because the keyword argument cls passed in the block_cls call collides with the formal parameter cls of the __new__ magic method, since they share the same name. You can delete the cls keyword argument from the block_cls call at lines 170-173, like this:
with data_utils.numpy_seed(self.seed + k):
    loaded_datasets.append(
        block_cls(
            tokens,
            ds.sizes,
            self.args.tokens_per_sample,
            self.dictionary.pad(),
            self.dictionary.cls(),
            self.dictionary.mask(),
            self.dictionary.sep(),
            break_mode=self.args.break_mode,
            short_seq_prob=self.short_seq_prob,
            tag_map=tag_map,
        ))
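The collision described above can be reproduced in isolation. The sketch below uses made-up names (Block, tokens), not the project's actual classes: because the first parameter of __new__ is conventionally named cls and is bound positionally to the class object on every call, passing a keyword argument literally named cls binds that parameter a second time and raises the same TypeError.

```python
class Block:
    # `cls` receives the class object positionally on every instantiation,
    # so a keyword argument also named `cls` collides with it.
    def __new__(cls, tokens, **kwargs):
        return super().__new__(cls)

    def __init__(self, tokens, **kwargs):
        self.tokens = tokens

# Positional arguments work fine:
ok = Block([1, 2, 3])

# Passing cls=... as a keyword reproduces the error from this issue,
# e.g. "__new__() got multiple values for argument 'cls'":
try:
    Block([1, 2, 3], cls=7)
except TypeError as err:
    print(err)
```

This is why dropping the cls keyword argument (or renaming it) makes the call succeed: the remaining arguments no longer shadow a formal parameter of __new__.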
I'm trying to run the pretraining code and I get the following error:

The command I'm running (for now, I intentionally changed distributed_port and distributed_world_size to avoid the distributed or multiprocessing mode):

I've successfully preprocessed and tokenized the corpus, so I have no problem with that. I have the following files in ../data/destination/:
dict.txt
train.bin
train.idx
valid.bin
valid.idx
Even though I know what TypeError: __new__() got multiple values for argument could mean in general, I have no idea what causes such an error here. Any insight or clue is appreciated! @mandarjoshi90

P.S. I know that CPU is not supported, but just to test the rest of the code, I temporarily removed the GPU-availability check; running on CPU, I did not get the error mentioned above and could load the data successfully. So I wonder whether this error has anything to do with running the code on GPU or with any GPU-related packages?