Closed bobd988 closed 1 year ago
I suspect you are using a very old version of PyTorch.
As the error message says, drop_last is just a keyword argument that main.py passes to DistributedSampler, and your installed version does not accept it. If you don't want to upgrade torch, you can simply remove this parameter.
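If removing the argument by hand is inconvenient, one option is to drop any keyword the installed class does not accept before calling it. The following is only a minimal sketch: `filter_supported_kwargs` and `OldStyleSampler` are hypothetical names made up for illustration, and the contents of `dist_sampler_params` are assumed, since the traceback shows only the variable name. The stand-in class is used so the snippet runs without torch installed.

```python
import inspect


def filter_supported_kwargs(cls, kwargs):
    """Return only the entries of kwargs that cls.__init__ accepts."""
    params = inspect.signature(cls.__init__).parameters
    # If the constructor takes **kwargs, every keyword is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


class OldStyleSampler:
    """Hypothetical stand-in for a sampler whose __init__ lacks drop_last."""

    def __init__(self, dataset, num_replicas=None, rank=None, shuffle=True):
        self.dataset = dataset
        self.num_replicas = num_replicas
        self.rank = rank
        self.shuffle = shuffle


# Assumed contents; the real dict in main.py may differ.
dist_sampler_params = dict(num_replicas=2, rank=0, shuffle=True, drop_last=True)

# drop_last is filtered out because OldStyleSampler does not accept it.
safe_params = filter_supported_kwargs(OldStyleSampler, dist_sampler_params)
sampler = OldStyleSampler(list(range(10)), **safe_params)  # no TypeError
```

In main.py the same helper could presumably be applied as `DistributedSampler(train, **filter_supported_kwargs(DistributedSampler, dist_sampler_params))`. Note that silently dropping drop_last changes how the last incomplete batch is handled, so upgrading torch is likely the cleaner fix.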
Hi,
I was training on comma2k19 with two A6000 GPUs in a PC running CUDA 11.5 and Ubuntu 20.04, using two terminals, each running one of these commands:
PORT=23345 SLURM_PROCID=0 SLURM_NTASKS=2 python main.py
PORT=23346 SLURM_PROCID=1 SLURM_NTASKS=2 python main.py
I got the error below in the first terminal shortly after starting. I also tried with a single GPU, but it gave the same error. How can I solve this? Thanks.
[1676912307.07] starting job... 0 of 2
[1676912608.11] DDP Initialized at localhost:23345 0 of 2
2023-02-20 09:03:28.404838: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
Comma2k19SequenceDataset: DEMO mode is on.
Traceback (most recent call last):
  File "main.py", line 246, in <module>
    main(rank=int(os.environ['SLURM_PROCID']), world_size=int(os.environ['SLURM_NTASKS']), args=args)
  File "main.py", line 119, in main
    train_dataloader, val_dataloader = get_dataloader(rank, world_size, args.batch_size, False, args.n_workers)
  File "main.py", line 69, in get_dataloader
    train_sampler = DistributedSampler(train, **dist_sampler_params)
TypeError: __init__() got an unexpected keyword argument 'drop_last'