RuntimeError: stack expects a non-empty TensorList

LinixLinux commented 2 years ago

Hi,

First of all I wanted to compliment this great work, it's very impressive. 👍

I am having an issue when trying to evaluate using the pre-trained model.

Am I missing any directories or files here that could cause this error, or does any code need to be adjusted? Thanks.

python evaluate.py --lr_dir=/content/NeuriCam/lrvideo --key_dir=/content/NeuriCam/key --target_dir=/content/NeuriCam/hrvideo  --model_dir=/content/NeuriCam/experiments/bix4_keyvsrc_attn --restore_file=pretrained --file_fmt=%06d.png
Creating the dataset...
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  cpuset_checked))
- done.
load checkpoint from local path: /content/NeuriCam/model/keyvsrc/spynet_20210409-c6c1bd09.pth
Evaluating keyvsrc
Starting evaluation
  0% 0/756 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "evaluate.py", line 182, in <module>
    args.output_dir, args.file_fmt, args.profile)
  File "evaluate.py", line 71, in evaluate
    for i, (train_batch, target, sample_ids) in enumerate(tqdm(dataloader)):
  File "/usr/local/lib/python3.7/dist-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 652, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1347, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1373, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 461, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/content/NeuriCam/model/dataset.py", line 150, in __getitem__
    self.frame_fmt, start_num, end_num, grayscale=self.grayscale)
  File "/content/NeuriCam/model/dataset.py", line 77, in load_video
    video = torch.stack(video, dim=0) # [t, c, h, w]
RuntimeError: stack expects a non-empty TensorList

vb000 commented 2 years ago

Hi,

Thanks for trying out our work!

This usually happens when the data directory structure or the file format is incorrect. The directory structure was unclear in the readme, so I updated it to clarify this: https://github.com/vb000/NeuriCam#data-format. In your particular scenario, I think you might be providing paths to videos, while the script expects paths to video sets. If so, you could simply move the videos into a sub-directory and rerun the same command. The following would fix the directory structure:

mkdir /content/NeuriCam/lrvideo/0
mv /content/NeuriCam/lrvideo/*.png /content/NeuriCam/lrvideo/0/

This would allow you to evaluate multiple videos with a single command.
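For readers who want to verify the layout before running evaluate.py, here is a minimal sketch of such a check. It is not part of the NeuriCam code: the function names are hypothetical, and it assumes (as the readme's data-format section suggests) that each of the three directories is a video set whose sub-directories are the individual videos, with the same video names in every set.

```python
from pathlib import Path


def list_videos(set_dir):
    """Return the names of the video sub-directories inside one video set."""
    return sorted(p.name for p in Path(set_dir).iterdir() if p.is_dir())


def loose_frames(set_dir, suffix=".png"):
    """Return frame files lying directly in the set directory, outside any video folder."""
    return sorted(p.name for p in Path(set_dir).iterdir()
                  if p.is_file() and p.suffix == suffix)


def check_layout(lr_dir, key_dir, target_dir):
    """Collect human-readable problems; an empty list means the layout looks plausible."""
    problems = []
    sets = {"lr_dir": lr_dir, "key_dir": key_dir, "target_dir": target_dir}
    videos = {}
    for name, path in sets.items():
        stray = loose_frames(path)
        if stray:
            problems.append(f"{name}: {len(stray)} frame(s) are not inside a video sub-directory")
        videos[name] = set(list_videos(path))
        if not videos[name]:
            problems.append(f"{name}: no video sub-directories found")
    # Assumption: every set should contain the same video names.
    common = set.intersection(*videos.values())
    for name, found in videos.items():
        extra = sorted(found - common)
        if extra:
            problems.append(f"{name}: videos missing from another set: {extra}")
    return problems
```

Calling check_layout with the three paths passed to --lr_dir, --key_dir and --target_dir and printing the result should show quickly whether frames were left at the top level of a set.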

LinixLinux commented 2 years ago

Hi @vb000, thank you so much for your help, I really appreciate it.

I followed your instructions; however, I am still experiencing the same error. The only difference is that

Starting evaluation 0% 0/756 [00:00<?, ?it/s]

has changed to:

Starting evaluation 0% 0/2 [00:00<?, ?it/s]

I also tried changing my dataset names to match the readme, which returned the same results.

I want to add that I am running this on Colab (in case that has anything to do with this). The installation of mmcv-full initially failed, so I set CUDA_HOME to the CUDA 10.1 directory and installed the mmcv-full build for 10.1, in case that makes any difference. I'll try a local environment to see if it's a Colab issue.

Thanks again for your help.

LinixLinux commented 2 years ago

I still haven't found a solution to this issue.

To take the error literally: in dataset.py, doesn't torch.stack expect a list or tuple of tensors? Since video in line 65 starts out empty ( video = [] ), perhaps it is still empty when torch.stack is called, and that is why it throws this error?

I'm not really sure what I'm doing though, since I'm not that familiar with python and just enjoy playing around with machine learning projects haha.

Thank you.
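The guess above matches how torch.stack behaves: it raises exactly this RuntimeError when given an empty list. One way a frame list can end up empty is that no file on disk matches the expected name pattern. The sketch below is not the code in dataset.py; collect_frame_paths is a hypothetical helper that assumes frames are located by applying a printf-style format to an inclusive index range, and it fails early with a message that names the directory and pattern.

```python
import os


def collect_frame_paths(video_dir, frame_fmt, start_num, end_num):
    """Return paths of existing frames numbered start_num..end_num (inclusive)."""
    paths = []
    for i in range(start_num, end_num + 1):
        path = os.path.join(video_dir, frame_fmt % i)
        if os.path.isfile(path):
            paths.append(path)
    if not paths:
        # An empty list at this point is what would later make torch.stack fail.
        raise FileNotFoundError(
            f"no frames matching {frame_fmt!r} in {video_dir!r} "
            f"for indices {start_num}..{end_num}")
    return paths
```

Running such a check on one video folder with the same --file_fmt value passed to evaluate.py would tell a naming mismatch apart from a directory-structure problem.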

vb000 commented 2 years ago

Could you provide the paths to the video sets, and the command you’re using?

LinixLinux commented 2 years ago

Paths are:

/content/NeuriCam/hr-set/my-cat-video /content/NeuriCam/key-set/my-cat-video /content/NeuriCam/lr-set/my-cat-video

All the folders have an image sequence titled frame0.png, frame1.png, frame2.png, etc. Lr-set is in grayscale and the others are in colour. Images are 720x480.

The command is: python evaluate.py --lr_dir /content/NeuriCam/lr-set/ --key_dir /content/NeuriCam/key-set/ --target_dir /content/NeuriCam/hr-set/ --model_dir experiments/bix4_keyvsrc_attn/ --restore_file=pretrained --file_fmt=frame%d.png
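As a quick illustration of what the two --file_fmt values seen in this thread expand to, assuming the script applies them with Python's printf-style % operator (an assumption based on the flag's syntax, not confirmed here), with expected_names as a hypothetical helper:

```python
def expected_names(file_fmt, indices):
    """Expand a printf-style frame format for each frame index."""
    return [file_fmt % i for i in indices]


print(expected_names("%06d.png", range(3)))     # ['000000.png', '000001.png', '000002.png']
print(expected_names("frame%d.png", range(3)))  # ['frame0.png', 'frame1.png', 'frame2.png']
```

The file names on disk have to match these strings exactly, including zero padding and the starting index.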

vb000 commented 2 years ago

Could you try starting the frame numbers from 0 (not 1), so that the video sequences have filenames frame0.png, frame1.png, frame2.png and so on?

LinixLinux commented 2 years ago

Sorry, I was in a rush when I replied and made a mistake. The sequences are already titled frame0.png, frame1.png, frame2.png, etc.

vb000 commented 2 years ago

Hi @LinixLinux, do key-frames with the appropriate indices exist in key-set? With a key-frame interval of 15, the following frames must exist in key-set/<video name>/: frame0.png, frame15.png, frame30.png... (it doesn't matter whether frame[1-14].png exist or not; the script will not read them).

I clarified this in the 'Data Format' section of the readme.
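The key-frame requirement can be checked with a short sketch like the one below. The function names are hypothetical, the interval of 15 and the frame%d.png pattern are taken from this thread, and num_frames is assumed to be the number of frames in the low-resolution video.

```python
import os


def expected_key_frames(num_frames, interval=15, file_fmt="frame%d.png"):
    """List the key-frame file names for indices 0, interval, 2*interval, ..."""
    return [file_fmt % i for i in range(0, num_frames, interval)]


def missing_key_frames(key_video_dir, num_frames, interval=15, file_fmt="frame%d.png"):
    """Return the expected key-frame names that are absent from key_video_dir."""
    return [name for name in expected_key_frames(num_frames, interval, file_fmt)
            if not os.path.isfile(os.path.join(key_video_dir, name))]
```

For a 40-frame video, expected_key_frames(40) gives frame0.png, frame15.png and frame30.png; an empty result from missing_key_frames means all of them are present.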

vb000 commented 1 year ago

Closing this issue as there has been no activity...

LinixLinux commented 1 year ago

Thought I would come back to this to say that I got the code working, and that the issue was that I was using only 1 video set for each parameter rather than 3. lr-set, hr-set and key-set each need three folders with frames in them as that is what the code expects for evaluation. Thanks for all your help with this one @vb000 and my apologies for not reading correctly.

KeenNest commented 6 months ago

@LinixLinux, could you please share the steps that you used to run this code?
