Closed pawopawo closed 5 years ago
I was able to read the images before, and I don't know what changed. Is there a problem with my converted DALI data? The code I use is from https://github.com/NVIDIA/DALI/blob/master/docs/examples/pytorch/resnet50/main.py
Hi, thanks for the question.
How did you modify the data? Maybe you ran into an issue similar to this problem with folder structure.
I've updated the docs for FileReader in #1222 to make things clear in this regard.
Quote from the updated docs:
FileReader supports flat directory structure. file_root directory should contain directories with images in them. To obtain labels FileReader sorts directories in file_root in alphabetical order and takes an index in this order as a class label.
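The labeling rule quoted above can be illustrated with a small standalone sketch. This is not DALI's implementation; `class_labels` and the directory names are hypothetical, and Python's default string sort is assumed to be close enough to "alphabetical order" for typical class directory names.

```python
import os
import tempfile


def class_labels(file_root):
    """Map each class directory under file_root to an integer label.

    Illustrative only: follows the documented rule (sort the directory
    names, use the position in that order as the label).
    """
    class_dirs = sorted(
        name for name in os.listdir(file_root)
        if os.path.isdir(os.path.join(file_root, name))
    )
    return {name: index for index, name in enumerate(class_dirs)}


# Usage with a throwaway directory tree (class names are made up).
with tempfile.TemporaryDirectory() as root:
    for name in ("dog", "cat", "bird"):
        os.makedirs(os.path.join(root, name))
    labels = class_labels(root)
    print(labels)
```

Comparing such a mapping with the one produced by another loader for the same directory is a quick way to confirm that both assign the same label to each class.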
[Warning]: File _train.lst has extension that is not supproted by the decoder. Supported extensions: .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .pnm, .ppm, .pgm, .pbm.
[Warning]: File _train.idx has extension that is not supproted by the decoder. Supported extensions: .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .pnm, .ppm, .pgm, .pbm.
[Warning]: File _train.rec has extension that is not supproted by the decoder. Supported extensions: .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .pnm, .ppm, .pgm, .pbm.
[Warning]: File _val.lst has extension that is not supproted by the decoder. Supported extensions: .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .pnm, .ppm, .pgm, .pbm.
[Warning]: File _val.rec has extension that is not supproted by the decoder. Supported extensions: .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .pnm, .ppm, .pgm, .pbm.
[Warning]: File _val.idx has extension that is not supproted by the decoder. Supported extensions: .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .pnm, .ppm, .pgm, .pbm.
Is the 'traindir' passed to HybridTrainPipe not supposed to be converted DALI data? Should 'traindir' instead be the path to the original ImageNet dataset (JPEG images)?
Above is the address of my traindir
What do you mean by converted DALI data?
Yes, in this example traindir is ultimately passed to FileReader as the value of the file_root parameter. FileReader reads JPEG files.
Thank you!
You're welcome. I'm closing the issue for now. If you have more questions or comments please do not hesitate to reopen or post another issue.
When I use DALI hybrid pipelines to train on Imagenet:
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types

class HybridValPipe(Pipeline):
    def __init__(self, batch_size, num_threads, device_id, data_dir, crop, size):
        super(HybridValPipe, self).__init__(batch_size, num_threads, device_id, seed=12 + device_id)
        # `args` is assumed to be the script's parsed command-line namespace (local_rank, world_size).
        self.input = ops.FileReader(file_root=data_dir, shard_id=args.local_rank, num_shards=args.world_size, random_shuffle=False)
        self.decode = ops.ImageDecoder(device="mixed", output_type=types.RGB)
        self.res = ops.Resize(device="gpu", resize_shorter=size, interp_type=types.INTERP_TRIANGULAR)
        #self.cmnp = ops.CropMirrorNormalize(device="gpu", output_dtype=types.FLOAT, output_layout=types.NCHW, crop=(crop, crop),
        #image_type=types.RGB, mean=[0.485 * 255, 0.456 * 255, 0.406 * 255], std=[0.229 * 255, 0.224 * 255, 0.225 * 255])
        # mean=0 and std=255 scale pixel values to [0, 1] without per-channel normalization.
        self.cmnp = ops.CropMirrorNormalize(device="gpu", output_dtype=types.FLOAT, output_layout=types.NCHW, crop=(crop, crop),
                                            image_type=types.RGB, mean=[0, 0, 0], std=[255, 255, 255])

    def define_graph(self):
        self.jpegs, self.labels = self.input(name="Reader")
        images = self.decode(self.jpegs)
        images = self.res(images)
        output = self.cmnp(images)
        return [output, self.labels]
I'm seeing a bunch of errors (for both train and val pipelines):
[Warning]: File . has extension that is not supproted by the decoder. Supported extensions: .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .pnm, .ppm, .pgm, .pbm.
[Warning]: File .. has extension that is not supproted by the decoder. Supported extensions: .jpg, .jpeg, .png, .gif, .bmp, .tif, .tiff, .pnm, .ppm, .pgm, .pbm.
These don't happen when I use:
train_dataset = datasets.ImageFolder(traindir, transforms.Compose([transforms.RandomResizedCrop(224),
transforms.RandomHorizontalFlip(), transforms.ToTensor(), transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), ]))
val_dataset = datasets.ImageFolder(valdir, transforms.Compose([transforms.Resize(256),
transforms.CenterCrop(224), transforms.ToTensor(), transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), ]))
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True, num_workers=args.workers, pin_memory=False)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=args.batch_size, shuffle=False, num_workers=args.workers, pin_memory=False)
(pointing to the same train and val directories). Using DALI 0.13. There are hundreds if not thousands of these errors, but it seems like it still reads the images: at the end it says "read 50000 files from 1000 directories" for the validation subset (which is the correct length of the subset), despite the errors, and the model learns successfully.
Hi, DALI FileReader assumes the following folder structure:
root_dir - class name - file
         |            \ - ...
         |            \ - file N-th
         \ - class name - file
                        \ - ...
                        \ - file N-th
...
This warning means that, besides the training/validation images, DALI detects files that are not supported. Usually, dot and dot-dot are not returned by the readdir calls, but in your case they are. We will add a special case for that.
If an unrecognized file is encountered, it is simply skipped, so don't worry about the final result.
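The skipping behavior described above can be mimicked in a few lines of Python. This is only a sketch of the idea, not DALI's code: `keep_supported` is a hypothetical helper, the extension list is copied from the warning text in this thread, and case-insensitive matching is an assumption.

```python
import os

# Extensions copied from the warning message quoted in this thread.
SUPPORTED_EXTENSIONS = (
    ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif",
    ".tiff", ".pnm", ".ppm", ".pgm", ".pbm",
)


def keep_supported(names):
    """Return the entries that look like decodable images; skip the rest."""
    kept = []
    for name in names:
        # Some filesystems list the "." and ".." entries; ignore them.
        if name in (".", ".."):
            continue
        # Assumption: extensions are compared case-insensitively.
        extension = os.path.splitext(name)[1].lower()
        if extension in SUPPORTED_EXTENSIONS:
            kept.append(name)
    return kept


listing = [".", "..", "img_0001.JPEG", "_train.rec", "img_0002.png"]
kept = keep_supported(listing)
print(kept)
```

With this kind of filtering, stray entries reduce the listing but do not affect the images that are actually read, which matches the file count reported above.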
BTW, this only happens when executed on a node in a cluster (on a mounted partition).
https://github.com/NVIDIA/DALI/pull/1318 should fix this problem
  File "train_test.py", line 255, in main
    pipe.build()
  File "/usr/local/lib/python3.6/site-packages/nvidia/dali/pipeline.py", line 231, in build
    self._pipe.Build(self._names_and_devices)
RuntimeError: [/opt/dali/dali/pipeline/operators/reader/loader/file_loader.h:108] Assert on "Size() > 0" failed: No files found.
Stacktrace (28 entries):
[frame 0]: /usr/local/lib/python3.6/site-packages/nvidia/dali/libdali.so(+0xafd5e) [0x7f71e5d7dd5e]
[frame 1]: /usr/local/lib/python3.6/site-packages/nvidia/dali/libdali.so(+0x15768d) [0x7f71e5e2568d]
[frame 2]: /usr/local/lib/python3.6/site-packages/nvidia/dali/libdali.so(+0x17d65f) [0x7f71e5e4b65f]
[frame 3]: /usr/local/lib/python3.6/site-packages/nvidia/dali/libdali.so(std::_Function_handler<std::unique_ptr<dali::OperatorBase, std::default_delete > (dali::OpSpec const&), std::unique_ptr<dali::OperatorBase, std::default_delete > (*)(dali::OpSpec const&)>::_M_invoke(std::_Any_data const&, dali::OpSpec const&)+0xc) [0x7f71e5dd87fc]
[frame 4]: /usr/local/lib/python3.6/site-packages/nvidia/dali/libdali.so(+0x1c98c4) [0x7f71e5e978c4]
[frame 5]: /usr/local/lib/python3.6/site-packages/nvidia/dali/libdali.so(dali::InstantiateOperator(dali::OpSpec const&)+0x34e) [0x7f71e5e96cfe]
[frame 6]: /usr/local/lib/python3.6/site-packages/nvidia/dali/libdali.so(dali::OpGraph::InstantiateOperators()+0x8f) [0x7f71e5da2c6f]
[frame 7]: /usr/local/lib/python3.6/site-packages/nvidia/dali/libdali.so(dali::Pipeline::Build(std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > >)+0xd58) [0x7f71e5ef21b8]
[frame 8]: /usr/local/lib/python3.6/site-packages/nvidia/dali/backend_impl.cpython-36m-x86_64-linux-gnu.so(+0x37ecf) [0x7f71ea7eeecf]
[frame 9]: /usr/local/lib/python3.6/site-packages/nvidia/dali/backend_impl.cpython-36m-x86_64-linux-gnu.so(+0x21af3) [0x7f71ea7d8af3]
[frame 10]: /usr/local/lib/libpython3.6m.so.1.0(_PyCFunction_FastCallDict+0x16c) [0x7f71fa43898c]
[frame 11]: /usr/local/lib/libpython3.6m.so.1.0(+0x178b52) [0x7f71fa490b52]
[frame 12]: /usr/local/lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x337) [0x7f71fa488ea7]
[frame 13]: /usr/local/lib/libpython3.6m.so.1.0(+0x178d5a) [0x7f71fa490d5a]
[frame 14]: /usr/local/lib/libpython3.6m.so.1.0(+0x178ad7) [0x7f71fa490ad7]
[frame 15]: /usr/local/lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x337) [0x7f71fa488ea7]
[frame 16]: /usr/local/lib/libpython3.6m.so.1.0(+0x178d5a) [0x7f71fa490d5a]
[frame 17]: /usr/local/lib/libpython3.6m.so.1.0(+0x178ad7) [0x7f71fa490ad7]
[frame 18]: /usr/local/lib/libpython3.6m.so.1.0(_PyEval_EvalFrameDefault+0x337) [0x7f71fa488ea7]
[frame 19]: /usr/local/lib/libpython3.6m.so.1.0(PyEval_EvalCodeEx+0x21b) [0x7f71fa4872fb]
[frame 20]: /usr/local/lib/libpython3.6m.so.1.0(PyEval_EvalCode+0x1b) [0x7f71fa4870db]
[frame 21]: /usr/local/lib/libpython3.6m.so.1.0(+0x1ee1c2) [0x7f71fa5061c2]
[frame 22]: /usr/local/lib/libpython3.6m.so.1.0(PyRun_FileExFlags+0x9a) [0x7f71fa50663a]
[frame 23]: /usr/local/lib/libpython3.6m.so.1.0(PyRun_SimpleFileExFlags+0x1b7) [0x7f71fa5063f7]
[frame 24]: /usr/local/lib/libpython3.6m.so.1.0(Py_Main+0x6c7) [0x7f71fa50d1e7]
[frame 25]: python3(main+0x105) [0x400b05]
[frame 26]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f71f9669b45]
[frame 27]: python3() [0x400c51]
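The "No files found" assertion above is raised while the pipeline is being built. A minimal pre-flight check, assuming the root_dir/class name/file layout described earlier in the thread, might look like the sketch below. `count_images` and `check_file_root` are hypothetical helpers, not part of DALI, and the extension list is copied from the warning text quoted earlier.

```python
import os
import tempfile

# Extensions copied from the warning message quoted in this thread.
SUPPORTED_EXTENSIONS = (
    ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif",
    ".tiff", ".pnm", ".ppm", ".pgm", ".pbm",
)


def count_images(file_root):
    """Count supported image files exactly one level below file_root."""
    total = 0
    for class_name in sorted(os.listdir(file_root)):
        class_dir = os.path.join(file_root, class_name)
        if not os.path.isdir(class_dir):
            continue  # plain files directly in file_root are not counted
        for file_name in os.listdir(class_dir):
            # Assumption: extensions are compared case-insensitively.
            if file_name.lower().endswith(SUPPORTED_EXTENSIONS):
                total += 1
    return total


def check_file_root(file_root):
    """Raise early, with a readable message, if no images are found."""
    found = count_images(file_root)
    if found == 0:
        raise ValueError("no supported images found under %r" % file_root)
    return found


# Usage with a throwaway tree: one class directory with one image,
# plus a stray non-image file at the top level.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "n01440764"))
    open(os.path.join(root, "n01440764", "a.JPEG"), "w").close()
    open(os.path.join(root, "_train.rec"), "w").close()
    found = check_file_root(root)
    print(found)
```

Running such a check on the directory before calling pipe.build() shows whether it points at a folder of class subdirectories or at something else, such as a directory holding only .rec, .idx, and .lst files.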