Nanne / pytorch-NetVlad

Pytorch implementation of NetVlad including training on Pittsburgh.

Running "python main.py --mode=train --arch=vgg16 --pooling=netvlad --num_clusters=64" leads to this error. Help? #86

Closed: taiyipan closed this issue 1 year ago

taiyipan commented 1 year ago
python main.py --mode=train --arch=vgg16 --pooling=netvlad --num_clusters=64
Namespace(arch='vgg16', batchSize=4, cacheBatchSize=24, cachePath='/home/taiyi/repository2/event_vpr_methods/test_netvlad/cache/', cacheRefreshRate=1000, ckpt='latest', dataPath='/home/taiyi/repository2/event_vpr_methods/test_netvlad/data/', dataset='pittsburgh', evalEvery=1, fromscratch=False, lr=0.0001, lrGamma=0.5, lrStep=5, margin=0.1, mode='train', momentum=0.9, nEpochs=30, nGPU=1, nocuda=False, num_clusters=64, optim='SGD', patience=10, pooling='netvlad', resume='', runsPath='/home/taiyi/repository2/event_vpr_methods/test_netvlad/runs/', savePath='checkpoints', seed=123, split='val', start_epoch=0, threads=16, vladv2=False, weightDecay=0.001)
===> Loading dataset(s)
====> Training query set: 7320
===> Evaluating on val set, query count: 7608
===> Building model
===> Training model
===> Saving state to: /home/taiyi/repository2/event_vpr_methods/test_netvlad/runs/May12_23-37-02_vgg16_netvlad
/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:139: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn("Detected call of `lr_scheduler.step()` before `optimizer.step()`. "
/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/optim/lr_scheduler.py:152: UserWarning: The epoch parameter in `scheduler.step()` was not necessary and is being deprecated where possible. Please use `scheduler.step()` to step the scheduler. During the deprecation, if epoch is different from None, the closed form is used instead of the new chainable form, where available. Please open an issue if you are unable to replicate your use case: https://github.com/pytorch/pytorch/issues/new/choose.
  warnings.warn(EPOCH_DEPRECATION_WARNING, UserWarning)
====> Building Cache
Allocated: 60039168
/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/cuda/memory.py:416: FutureWarning: torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved
  warnings.warn(
Cached: 5819596800
/home/taiyi/event_vpr/lib/python3.8/site-packages/sklearn/neighbors/_base.py:815: UserWarning: Loky-backed parallel loops cannot be called in a multiprocessing, setting n_jobs=1
  n_jobs = effective_n_jobs(self.n_jobs)
[the same two-line sklearn UserWarning repeats 31 more times]
Traceback (most recent call last):
  File "main.py", line 511, in <module>
    train(epoch)
  File "main.py", line 113, in train
    for iteration, (query, positives, negatives, negCounts, indices) in enumerate(training_data_loader, startIter):
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
    data = self._next_data()
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/_utils.py", line 644, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/torch/utils/data/dataset.py", line 298, in __getitem__
    return self.dataset[self.indices[idx]]
  File "/mnt/e6c9de53-1f7f-424a-a71d-7a8cf8e2e0ee/event_vpr_methods/test_netvlad/pittsburgh.py", line 230, in __getitem__
    negFeat = h5feat[negSample.tolist()]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 841, in __getitem__
    selection = sel.select(self.shape, args, dataset=self)
  File "/home/taiyi/event_vpr/lib/python3.8/site-packages/h5py/_hl/selections.py", line 82, in select
    return selector.make_selection(args)
  File "h5py/_selector.pyx", line 282, in h5py._selector.Selector.make_selection
  File "h5py/_selector.pyx", line 197, in h5py._selector.Selector.apply_args
TypeError: Indexing arrays must have integer dtypes
oeg1n18 commented 1 year ago

Hi there, I think the issue may be in the __getitem__ method of the QueryDatasetFromStruct class.

In that method, h5feat is indexed by negSample.tolist():

negFeat = h5feat[negSample.tolist()]

However, negSample.tolist() is a list of floats, and h5py only accepts integer indices, hence "TypeError: Indexing arrays must have integer dtypes".

Rewriting this line to:

negFeat = h5feat[list(map(int, negSample))]

appears to solve the issue for me.
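For anyone who wants a slightly more defensive version of the cast suggested above, here is a minimal sketch. The helper name to_int_indices and the sample values are hypothetical and not part of the repository; the only assumption is that the index values are whole numbers stored as floats (such as 3.0). Unlike a bare int() cast, it raises an error instead of silently truncating a value like 2.7.

```python
def to_int_indices(samples):
    """Convert index values (e.g. floats such as 3.0) to plain Python ints.

    Raises ValueError if a value is not a whole number, so a silently
    truncated index (2.7 -> 2) cannot slip through.
    """
    indices = []
    for value in samples:
        as_int = int(value)
        if as_int != value:
            raise ValueError(f"non-integral index: {value!r}")
        indices.append(as_int)
    return indices


# Hypothetical stand-in for negSample.tolist(), which held floats in this issue.
neg_sample = [3.0, 17.0, 42.0]
int_indices = to_int_indices(neg_sample)
print(int_indices)  # [3, 17, 42]
```

In pittsburgh.py the indexing line would then become negFeat = h5feat[to_int_indices(negSample)]. This is only one way to do it; the one-line list(map(int, negSample)) change is what was reported to work in this thread.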

taiyipan commented 1 year ago

@oeg1n18 Thanks! I changed that 1 line of code and now it can run training loops finally.

oeg1n18 commented 1 year ago

No Problem. That's great!

