Unable to read the dataset

amitkumarj441 commented 4 years ago

I have checked the config and data_loading_utils files, but I am still unable to fix this error. From the error, it looks like the dataset is not being read, since the length defined in data_loading_utils comes back as zero. @nickgkan Could you please help me fix this error? Thanks!

/home/kgl/atr-net/faster_rcnn/model/utils/config.py:374: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  yaml_cfg = edict(yaml.load(f))
load checkpoint faster_rcnn/faster_rcnn_1_10_14657.pth
2020-01-14 11:58:45,650 - DEBUG - Tackling predcls for 1 classes
{'attention': 'multi_head', 'use_language': True, 'use_spatial': True}
load checkpoint faster_rcnn/faster_rcnn_1_10_14657.pth
2020-01-14 11:58:46,947 - INFO - Performing training for atr_net_predcls_VG80K
2020-01-14 11:58:47,904 - DEBUG - Set up dataset of 0 files
2020-01-14 11:58:47,904 - DEBUG - Set up dataset of 0 files
2020-01-14 11:58:47,904 - DEBUG - Set up dataset of 0 files
Traceback (most recent call last):
  File "main.py", line 113, in <module>
    main()
  File "main.py", line 110, in main
    model.train_test(cfg)
  File "/home/kgl/atr-net/src/models/atr_net.py", line 487, in train_test
    epochs=30 if config.use_early_stopping else 7)
  File "/home/kgl/atr-net/src/utils/train_test_utils.py", line 81, in train
    self._set_data_loaders()
  File "/home/kgl/atr-net/src/utils/train_test_utils.py", line 358, in _set_data_loaders
    for split in mode_ids
  File "/home/kgl/atr-net/src/utils/train_test_utils.py", line 358, in <dictcomp>
    for split in mode_ids
  File "/home/kgl/atr-net/src/utils/data_loading_utils.py", line 320, in __init__
    collate_fn=collate_fn)
  File "/home/kgl/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 802, in __init__
    sampler = RandomSampler(dataset)
  File "/home/kgl/.local/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 64, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integeral value, but got num_samples=0

nickgkan commented 4 years ago

Hi,

This line in your log suggests that the image path is incorrect:

2020-01-14 11:58:45,650 - DEBUG - Tackling predcls for 1 classes

The number of classes is computed in config.py. If you have downloaded the VG80K annotations, vg80k_transformer_class will create the corresponding json files, but the method will ignore images that do not exist. In your case, probably no image exists where the program expects it to be, so it creates an empty dataset with a single 'background' class.
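To illustrate the failure mode described above, here is a minimal sketch of how skipping missing images can leave only a 'background' class. It is not the actual code in vg80k_transformer_class: the function names, the annotation layout and the file names below are all assumptions made for the example.

```python
import os
import tempfile


def keep_existing_images(annotations, images_dir):
    """Keep only annotations whose image file exists under images_dir."""
    return [
        anno for anno in annotations
        if os.path.isfile(os.path.join(images_dir, anno["filename"]))
    ]


def collect_classes(annotations):
    """Return 'background' followed by every predicate found (hypothetical layout)."""
    predicates = sorted({
        rel["predicate"]
        for anno in annotations
        for rel in anno["relationships"]
    })
    return ["background"] + predicates


# Toy annotations, invented for this example
annotations = [
    {"filename": "1.jpg", "relationships": [{"predicate": "on"}]},
    {"filename": "2.jpg", "relationships": [{"predicate": "near"}]},
]

with tempfile.TemporaryDirectory() as right_dir, \
        tempfile.TemporaryDirectory() as wrong_dir:
    # Create the image files only in the "right" directory
    for anno in annotations:
        open(os.path.join(right_dir, anno["filename"]), "w").close()
    found = keep_existing_images(annotations, right_dir)
    missing = keep_existing_images(annotations, wrong_dir)

classes_found = collect_classes(found)
classes_missing = collect_classes(missing)
print(len(found), classes_found)      # 2 ['background', 'near', 'on']
print(len(missing), classes_missing)  # 0 ['background']
```

With the wrong directory every annotation is dropped silently, which would match both "Tackling predcls for 1 classes" and "Set up dataset of 0 files" in the log.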

Please ensure that you have cloned the latest version of this codebase, as previously there was an error in the paths (see 1).

Thanks for your interest!

amitkumarj441 commented 4 years ago

Hi @nickgkan ,

Thanks for your prompt response.

I understand your point, but the image path is the one written in config.py, and I placed the downloaded images and the created annotations in the proper location (the atr-net home directory).

At first, all the downloaded images were under scripts, but I then moved them to the atr-net home directory as config.py requires. Also, the json_annos folder is created after running prepare_data.py, so I still think that the path to the images is correct and that this is a data loading problem. Have a look at the files and directories below. Also, I had to import ORIG_IMAGES_PATH in src/utils/file_utils.py; that import is missing in the current version of the codebase, so you should fix it.

LICENSE    UnRel  VRD          _init_paths.py  datasets     figures             json_annos  main.py  prepare_data.py  scripts
README.md  VG     __pycache__  config.py       faster_rcnn  glove.42B.300d.txt  losses      models   results          src

It would be great if you have any idea how to fix the above error.

Thanks in advance!

nickgkan commented 4 years ago

Thanks for pointing out the missing import @amitkumarj441, I changed this.

Yes, the project folder seems correct. Please check the created json files again; I am pretty sure they have no annotations inside them. For example, I suspect that VG80K_predicates.json contains a list with a single element, while it should contain a list with all the predicate names. If that is the case, pull the code again (I fixed the missing import and added an extra utility in prepare_data.py and an assertion check) and run:

python prepare_data.py VG80K

(I just tested it with VRD as it is much faster, you can choose the dataset you want).
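The check on the created json files suggested above can be sketched as follows. This is only an illustration: the helper name looks_empty and the temporary files are made up, and it assumes the predicates file holds a plain JSON list.

```python
import json
import os
import tempfile


def looks_empty(json_path):
    """Return True if the JSON list in json_path has at most one element."""
    with open(json_path) as fid:
        content = json.load(fid)
    return len(content) <= 1


with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-ins for a broken and a healthy predicates file
    bad_path = os.path.join(tmp_dir, "bad_predicates.json")
    good_path = os.path.join(tmp_dir, "good_predicates.json")
    with open(bad_path, "w") as fid:
        json.dump(["background"], fid)
    with open(good_path, "w") as fid:
        json.dump(["background", "on", "near"], fid)
    bad_is_empty = looks_empty(bad_path)
    good_is_empty = looks_empty(good_path)

print(bad_is_empty, good_is_empty)  # True False
```

To use it on the real data, call looks_empty on your VG80K_predicates.json (presumably somewhere under json_annos, but that location is a guess). If it returns True, the annotations were created without any images being found.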

I hope everything runs smoothly!

deeplab-ai / atr-net