Open xyyu-callen opened 5 years ago
Hi, I ran into the same error. Could you please post how you solved this problem? @xyyu-kevin @leeyeehoo
Hi,
Two instances of h5py are trying to access the same file at the same time, and that causes the error. This is concurrent access to a .h5 file, which is not supported when the file is opened in a writable mode.
One workaround for this problem would be to read all the data into RAM and train your regressor from memory, which is a memory-expensive approach of course. But then you could run your other network training with the traditional approach at the same time.
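The read-everything-into-RAM idea above could be sketched roughly as follows. This is only an illustration, not code from the repository: the helper name `preload_density_maps` is hypothetical, and the dataset key `"density"` is an assumption about how the ground-truth files are laid out.

```python
import h5py


def preload_density_maps(gt_paths, key="density"):
    """Read each ground-truth .h5 file once and keep the arrays in memory.

    `key` is an assumed dataset name inside each file; adjust it to
    whatever your .h5 files actually contain.
    """
    cache = {}
    for path in gt_paths:
        # Open read-only and close the file as soon as the data is copied out,
        # so no handle stays open while training runs.
        with h5py.File(path, "r") as gt_file:
            cache[path] = gt_file[key][()]  # [()] copies the dataset into a NumPy array
    return cache
```

After this runs once at startup, the training loop can look arrays up in `cache` instead of opening the .h5 files on every iteration, at the cost of holding every density map in memory.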
Another workaround might be duplicating your training resources. This means that you're gonna have to
This one would be heavy on disk usage, but it would definitely work.
Hi @leeyeehoo, while running two training jobs, I encountered this issue.
Traceback (most recent call last):
  File "train.py", line 249, in <module>
    main()
  File "train.py", line 99, in main
    train(train_list, model, criterion, optimizer, epoch)
  File "train.py", line 145, in train
    for i,(img, target)in enumerate(train_loader):
  File "/app/anaconda2/lib/python2.7/site-packages/torch/utils/data/dataloader.py", line 314, in __next__
    batch = self.collate_fn([self.dataset[i] for i in indices])
  File "/disk1/kevin/prjs/crowdcounting/CSRNet-pytorch_seg/dataset.py", line 33, in __getitem__
    img,target = load_data(img_path,self.train)
  File "/disk1/kevin/prjs/crowdcounting/CSRNet-pytorch_seg/image.py", line 12, in load_data
    gt_file = h5py.File(gt_path)
  File "/app/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.py", line 269, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/app/anaconda2/lib/python2.7/site-packages/h5py/_hl/files.py", line 124, in make_fid
    fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 98, in h5py.h5f.create
IOError: Unable to create file (unable to open file: name = '/path/IMG_48.h5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)
What should I do about this issue? Many thanks.
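One thing that may be worth trying, though nobody in this thread has confirmed it: the traceback shows `h5py.File(gt_path)` being called without a mode and ending up in `h5f.create(name, h5f.ACC_EXCL, ...)`, which fails with 'File exists'. Opening the file explicitly in read-only mode never takes the create path. Here is a minimal sketch; the helper name `load_density_map` is hypothetical, and the `"density"` key is an assumption about the file contents.

```python
import h5py


def load_density_map(gt_path, key="density"):
    """Load one ground-truth array from a .h5 file opened read-only.

    `key` is an assumed dataset name; change it to match your files.
    """
    # Mode "r" only opens an existing file for reading; h5py does not
    # attempt to create the file, so the 'File exists' path is not reached.
    with h5py.File(gt_path, "r") as gt_file:
        return gt_file[key][()]  # copy the data out before the file is closed
```

In `image.py` this would correspond to passing `'r'` as the second argument to the existing `h5py.File(gt_path)` call. Whether two training runs can then read the same file at once may still depend on your HDF5 build and file system, so treat this as something to test, not a guaranteed fix.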