craston / MARS

MARS: Motion-Augmented RGB Stream for Action Recognition
MIT License

UCF101 dataset train error #6

Open Fazlik995 opened 5 years ago

Fazlik995 commented 5 years ago

Sorry to bother you again, but I have run into another problem.

The HMDB51 dataset worked fine. To train on the UCF101 dataset, I only changed the train part, setting train from 1 to 0:

print("Preprocessing train data ...")
train_data = globals()['{}_test'.format(opt.dataset)](split = opt.split, train = 0, opt = opt)

Everything seemed fine, but I got the error below.

Is it related to the code, or did I make a mistake?

Preprocessing train data ...
Length of train data = 3678
Preprocessing validation data ...
Length of validation data = 3678
Preparing datatloaders ...
Length of train datatloader = 114
Length of validation datatloader = 114
Loading model... resnext 101
loading pretrained model trained_models/kinetics/RGB_Kinetics_16f.pth
Layers to finetune : ['layer4', 'fc']
Initializing the optimizer ...
lr = 0.001 momentum = 0.9 dampening = 0.9 weight_decay = 1e-05, nesterov = False
LR patience = 10
run
Traceback (most recent call last):
  File "train.py", line 119, in <module>
    for i, (inputs, targets) in enumerate(train_dataloader):
  File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
    return self._process_next_batch(batch)
  File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
  File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/fazlik/Desktop/MARS/dataset/dataset.py", line 288, in __getitem__
    clip = get_train_video(self.opt, frame_path, Total_frames)
  File "/home/fazlik/Desktop/MARS/dataset/dataset.py", line 96, in get_train_video
    start_frame = np.random.randint(0, Total_frames)
  File "mtrand.pyx", line 992, in mtrand.RandomState.randint
ValueError: Range cannot be empty (low >= high) unless no samples are taken

craston commented 5 years ago

Hi,

Could you check the value of Total_frames? The ValueError suggests that Total_frames could be zero.
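A minimal sketch of such a check, assuming the start index is drawn from the half-open range [0, Total_frames) as in the traceback above. The helper name pick_start_frame is hypothetical and not part of the MARS code, and the standard library's random.randrange stands in for np.random.randint here so the snippet runs without NumPy:

```python
import random


def pick_start_frame(total_frames):
    """Pick a random start index in [0, total_frames).

    Hypothetical helper: raises a descriptive error instead of the
    generic "Range cannot be empty" when a clip has no frames.
    """
    if total_frames <= 0:
        raise ValueError(
            "no frames found (total_frames={}); "
            "check that the frame directory exists and is not empty".format(total_frames)
        )
    # Same half-open range [0, total_frames) that np.random.randint(0, n) samples from.
    return random.randrange(0, total_frames)


if __name__ == "__main__":
    print(pick_start_frame(16))   # some index between 0 and 15
    try:
        pick_start_frame(0)
    except ValueError as err:
        print("caught:", err)
```

Failing early like this points at the offending video rather than at the random number generator.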

If you set train=0, the UCF101_test class returns the entire video. That is, the __getitem__() function returns whole videos, so each batch would contain entire videos. Since the code is not designed to train on batches of entire videos, I would not suggest doing this. It would require changing the collate_fn of the dataloader and also modifying the training scripts.

Fazlik995 commented 5 years ago

Thank you for the quick reply.

Now I understand the difference between train=0 and train=1.

However, when I use train=1, I get the error below.

frame_path contains all the images.

Length of train data = 0
Preprocessing validation data ...
Length of validation data = 3678
Preparing datatloaders ...
Traceback (most recent call last):
  File "train.py", line 42, in <module>
    train_dataloader = DataLoader(train_data, batch_size = opt.batch_size, shuffle=True, num_workers = opt.n_workers, pin_memory = True, drop_last=True)
  File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 176, in __init__
    sampler = RandomSampler(dataset)
  File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 66, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0

craston commented 5 years ago

Since the length of the train data is 0, there is an issue with your train path. Please check that self.data is not empty.
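One way to surface this earlier is to check the dataset length before the DataLoader is built. This is only a sketch: require_non_empty is a hypothetical helper, not something the MARS repository provides, and it works on anything that supports len().

```python
def require_non_empty(dataset, name):
    """Return the dataset unchanged, or fail with a hint if it has no samples.

    Hypothetical guard to call on train_data before constructing a DataLoader.
    """
    if len(dataset) == 0:
        raise RuntimeError(
            "{} dataset is empty; check the frame directory and the "
            "split list it was built from".format(name)
        )
    return dataset


if __name__ == "__main__":
    print(len(require_non_empty(["clip_a", "clip_b"], "train")))  # 2
    try:
        require_non_empty([], "train")
    except RuntimeError as err:
        print("caught:", err)
```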

Fazlik995 commented 5 years ago

craston,

In the part you mentioned, for the HMDB51 dataset the docstring says data returns the test/train set,

but for the UCF101 dataset it says data returns only the test set.

Could you please explain why?

I think that is the reason why I am getting train data = 0.

Has anyone been able to train on the UCF101 dataset?

HMDB51:

def __len__(self):
    ''' returns number of test/train set '''
    return len(self.data)

UCF101:

def __len__(self):
    ''' returns number of test set '''
    return len(self.data)

Fazlik995 commented 5 years ago

Hi,

In dataset.py, line 255 reads if self.train_valtest==1. I think train_valtest == 1 should be changed to train_val_test == 1.

craston commented 5 years ago

Hi, thanks for spotting the bug. Did it work properly after changing train_valtest to train_val_test? If yes, I will commit the change. (I currently do not have access to my datasets since I moved companies, so it's a little difficult to test for bugs.)

Fazlik995 commented 5 years ago

It still does not work after changing it to train_val_test == 1. In my case the output still shows Length of train data = 0, so I have not been able to solve the issue.

craston commented 5 years ago

Could you please print the value of split_lab_filenames? Also, could you add the line print(os.path.join(self.opt.frame_dir, line.strip('\n')[:-4])) below this line: https://github.com/craston/MARS/blob/ae2749dc37ec19821da69654e9197a52f0b45839/dataset/dataset.py#L264

Then check whether this path actually exists.
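The same check can be run outside the training script. The sketch below builds each path with the expression quoted above and lists the ones that are not existing directories. The function name missing_frame_dirs and the example file names are hypothetical, and it assumes one video entry per line in the split file.

```python
import os


def missing_frame_dirs(split_file, frame_dir):
    """Return the frame directories named in split_file that do not exist.

    Paths are built as os.path.join(frame_dir, line.strip('\n')[:-4]),
    i.e. the last four characters of each line are assumed to be
    a video extension such as ".avi".
    """
    missing = []
    with open(split_file) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            path = os.path.join(frame_dir, line.strip('\n')[:-4])
            if not os.path.isdir(path):
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Hypothetical locations; replace with your own split list and frame directory.
    split_file, frame_dir = "trainlist01.txt", "frames"
    if os.path.exists(split_file):
        for path in missing_frame_dirs(split_file, frame_dir):
            print("missing:", path)
```

If every path is reported missing, the problem is more likely in how the lines are parsed than in the frames on disk.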

aia39 commented 4 years ago

@Fazlik995 have you fixed the problem you faced? If so, could you please tell me which part to change?

JianhaoZhan commented 4 years ago

Delete the label and the space on each line of the UCF-101 train list, and it works.
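A sketch of why this helps, assuming the train list lines have the form "path.avi label" while the code removes the extension with line.strip('\n')[:-4] as quoted earlier in the thread. The helper clean_split_line and the sample line are illustrative, not taken from the repository:

```python
def clean_split_line(line):
    """Return only the video path from a split-list line.

    Hypothetical helper: keeps the first whitespace-separated field,
    dropping a trailing class label if one is present.
    """
    parts = line.split()
    return parts[0] if parts else ""


if __name__ == "__main__":
    raw = "ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1\n"  # illustrative line

    # With the label still attached, [:-4] removes "vi 1" instead of ".avi",
    # so the resulting path does not match any frame directory.
    print(raw.strip("\n")[:-4])

    # With the label removed first, [:-4] removes the ".avi" extension as intended.
    print(clean_split_line(raw)[:-4])
```

Lines without a label pass through unchanged, so the same helper could be applied to the test list as well.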