Closed: sqw475sqw closed this issue 3 years ago
Your line numbers don't match up with the repo. I assume whatever change you made around here: https://github.com/Nanne/pytorch-NetVlad/blob/master/main.py#L85-L101 makes it so the training triplet gets stuck on trying to index the hdf5.
Thanks for your reply.
I don't think I modified the code you mention above. I suspect the cachePath is wrong, because I changed the command:
python main.py --mode=train --arch=vgg16 --pooling=netvlad --num_clusters=64
to:
python main.py --mode=train --arch=vgg16 --pooling=netvlad --num_clusters=64 --cachePath=/home/sqw/Desktop/pytorch-NetVlad-master
and I changed the line in main.py from
parser.add_argument('--cachePath', type=str, default=environ['TMPDIR'], help='Path to save cache to.')
to
parser.add_argument('--cachePath', type=str, help='Path to save cache to.')
because I was getting the error KeyError: 'TMPDIR'.
In detail: as soon as I started running the command
python main.py --mode=cluster --arch=vgg16 --pooling=netvlad --num_clusters=64
the following error was reported:

```
parser.add_argument('--cachePath', type=str, default=environ['TMPDIR'], help='Path to save cache to.')
  File "/home/sqw/anaconda3/envs/pytorch1.4/lib/python3.7/os.py", line 681, in __getitem__
    raise KeyError(key) from None
KeyError: 'TMPDIR'
```
What should I do? Could you give me some hints?
The errors you're getting are quite descriptive. What's happening is that you need to specify a path where the feature cache can be stored; this is the --cachePath argument. By default it uses the location of the temp dir, as found in environ['TMPDIR']. If this environment variable isn't set, you get a KeyError.
To circumvent this error you can call main.py with a valid path for --cachePath; for example, you can overwrite the default with --cachePath=. to use the current directory for caching.
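As an aside, the KeyError at import time can also be avoided by giving the argument a fallback default instead of indexing environ directly. A minimal sketch, not the repository's actual code (the argument name mirrors main.py, the fallback via tempfile is an assumption):

```python
import argparse
import os
import tempfile

parser = argparse.ArgumentParser()
# Fall back to the platform temp dir when TMPDIR is not set,
# instead of raising KeyError before argument parsing even runs.
parser.add_argument('--cachePath', type=str,
                    default=os.environ.get('TMPDIR', tempfile.gettempdir()),
                    help='Path to save cache to.')

args = parser.parse_args([])
print(args.cachePath)  # a usable path even when TMPDIR is unset
```

With this pattern an explicit --cachePath still overrides the default, so the command-line workflow is unchanged.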
Anyway, this is not a support forum for how to use this code base (especially not once you've started modifying it); please only open an issue to report bugs etc.
I think the h5py version caused the problem: with h5py 3.5.1 the same error occurs, but downgrading to h5py 2.8.0 solves it!
Thanks for your great work!
I get errors after running the command:
python main.py --mode=train --arch=vgg16 --pooling=netvlad --num_clusters=64
The errors are as follows:

```
====> Building Cache
Allocated: 60039168 Cached: 9596567552
/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/sklearn/neighbors/_base.py:622: UserWarning: Loky-backed parallel loops cannot be called in a multiprocessing, setting n_jobs=1
  n_jobs = effective_n_jobs(self.n_jobs)
[the warning above is repeated once per worker process]
Traceback (most recent call last):
  File "main.py", line 515, in <module>
    train(epoch)
  File "main.py", line 116, in train
    negCounts, indices) in enumerate(training_data_loader, startIter):
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 257, in __getitem__
    return self.dataset[self.indices[idx]]
  File "/home/sqw/Desktop/pytorch-NetVlad-master/pittsburgh.py", line 230, in __getitem__
    negFeat = h5feat[negSample.tolist()]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/h5py/_hl/dataset.py", line 777, in __getitem__
    selection = sel.select(self.shape, args, dataset=self)
  File "/home/sqw/anaconda3/envs/pytorch1.4.0/lib/python3.7/site-packages/h5py/_hl/selections.py", line 82, in select
    return selector.make_selection(args)
  File "h5py/_selector.pyx", line 272, in h5py._selector.Selector.make_selection
  File "h5py/_selector.pyx", line 183, in h5py._selector.Selector.apply_args
TypeError: Indexing arrays must have integer dtypes
```
I suspect the Pittsburgh dataset files are misplaced.
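For what it's worth, the final `TypeError: Indexing arrays must have integer dtypes` is raised by h5py 3.x, which is stricter than 2.x about the dtype of fancy-index arguments; this is consistent with the downgrade to 2.8.0 fixing it. A minimal sketch of that failure mode and the cast that avoids it (the variable name mirrors pittsburgh.py, but this is illustrative, not the repository's actual fix):

```python
import numpy as np

# If the negative-sample indices end up with a float dtype (for example
# after arithmetic with float arrays), .tolist() produces Python floats,
# which h5py >= 3 rejects when used to index an HDF5 dataset.
negSample = np.array([3.0, 7.0, 11.0])  # float64 dtype
print(negSample.tolist())               # floats: [3.0, 7.0, 11.0]

# Casting to an integer dtype before building the index list
# sidesteps the TypeError on newer h5py versions:
negIdx = negSample.astype(int).tolist()
print(negIdx)                           # ints: [3, 7, 11]
```

So an alternative to pinning h5py 2.8.0 might be to cast the index array to an integer dtype before the `h5feat[...]` lookup.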