Closed · XinLan12138 closed this issue 2 years ago
Just to be more specific, my dataset specification is as below:
elif 'pitts250k' in opt.dataset.lower():
    # train, test, val structs
    dataset = Dataset('pitts250k', 'pitts250k_train_new.db', 'pitts250k_test_new.db', 'pitts250k_val_new.db', opt)
    ref, qry = 'ref', 'qry'
    ft1 = np.load(join(prefix_data, "descData/{}/pitts250k-{}.npy".format(opt.descType, ref)))
    ft2 = np.load(join(prefix_data, "descData/{}/pitts250k-{}.npy".format(opt.descType, qry)))
    trainInds, testInds, valInds = np.arange(10000), np.arange(10000, 13000), np.arange(13000, 16000)
    dataset.trainInds = [trainInds, trainInds]
    dataset.valInds = [valInds, valInds]
    dataset.testInds = [testInds, testInds]
    encoder_dim = dataset.loadPreComputedDescriptors(ft1, ft2)
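Since the same Nordland-style split indices are reused for both the reference and query descriptor arrays, it may be worth sanity-checking them against the descriptor row counts before calling loadPreComputedDescriptors. A minimal sketch (the row counts are taken from the log later in this thread; the check itself is an illustrative addition, not part of get_datasets.py):

```python
import numpy as np

# Row counts of the pitts250k descriptor arrays (from the log later in this thread).
n_ref, n_qry = 254064, 24000

trainInds, testInds, valInds = (np.arange(10000),
                                np.arange(10000, 13000),
                                np.arange(13000, 16000))

# Each split index must be a valid row in BOTH descriptor arrays, because
# the same index list is assigned to the reference and query sides above.
for name, inds in [("train", trainInds), ("test", testInds), ("val", valInds)]:
    assert inds.max() < n_ref, f"{name} indices exceed reference descriptor count"
    assert inds.max() < n_qry, f"{name} indices exceed query descriptor count"
print("all split indices in range")
```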
Hi @XinLan12138,

1. The `db` files are missing the `utmDb` and `utmQ` fields.
2. The descriptor data is `N x 4096`, where `N` is the number of images as listed in the corresponding files in `imageNamesFiles`. Train/val/test splits use indices (in the range `[0, N-1]`) which index the descriptor data corresponding to the images stored in their respective `db` files. For Nordland, because of its simplicity, we define the splits directly in code. For Oxford, we load them separately. For MSLS, any split is defined for the whole city, so the indices cover the whole range of data used from that city.

Since all the datasets are somewhat unique, `get_datasets.py` does the dataset-specific handling. What is applicable for the Nordland dataset (which has one-to-one frame correspondence across traverses) may not be right for others. So your Pittsburgh settings as shared might run, but could still be incorrect in their usage. Of all the differences, the main one here is the lack of sequential traverses in Pittsburgh, as briefly discussed earlier.
May I ask what you intend to do with the Pittsburgh dataset, so I might be able to point you in the right direction?
Edits were made to the second point's description; please check the edit history.
Hi @oravus. From your description, I now understand the inner relationships of the dataset specifications.
Initially, I considered using pitts250k because I also work with the PyTorch version of NetVLAD (https://github.com/Nanne/pytorch-NetVlad) and hoped to compare the two models on pitts250k. I downloaded the pitts250k dataset specifications from their .mat structures and found the utmDb and utmQ information there. I did not realize or understand what you meant about sequential traverses before :)
What I am going to do is compare several models: sequence-based RGB image models like yours, PointNetVLAD, which is based on point-cloud information, models that use 2D images projected from point clouds, etc. I am writing a dissertation on this topic. I hope you can give me some suggestions on what I should focus on!
Thanks for your patience!
Hi @XinLan12138,
By sequences, I meant that we assume images are collected as a data stream from a forward-facing camera mounted on a vehicle driving down a road. At any unique GPS coordinate, a part of this data stream can be considered a short sequence of images with overlapping views of the environment at that location. This is not the case with the Pitts250K dataset, so it might not be possible to define image sequences there the way SeqNet uses them (in line with prior similar work in this field). The code indexes data as a sequence of L frames at every index, which might not be meaningful for Pitts250K.
If you were thinking of something else and I have misunderstood, let me know.
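The "L frames at every index" idea can be sketched as a sliding window over a traverse. This is an illustrative stand-in (`seq_indices` is a hypothetical helper, not SeqNet's actual Dataset code), showing why the window is only meaningful when consecutive frames overlap in viewpoint:

```python
import numpy as np

def seq_indices(center, L, n_frames):
    """Indices of an L-frame window centred on `center`.
    Illustrative only; SeqNet's real indexing lives in its Dataset class."""
    half = L // 2
    lo, hi = center - half, center + half + 1
    if lo < 0 or hi > n_frames:
        raise IndexError("sequence window falls outside the traverse")
    return np.arange(lo, hi)

# For a stream of 100 frames and L=5, index 50 yields frames 48..52.
# These frames share overlapping views for a vehicle-mounted camera,
# but adjacent indices in Pitts250K are not a temporal sequence at all.
print(seq_indices(50, 5, 100))  # [48 49 50 51 52]
```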
Closing it now, please feel free to reopen if needed.
Hi @oravus, thanks for your help. I have generated the needed db files using the pitts250k dataset. I used the whole dataset to generate descriptors and saved both the query and ref .npy files. That is:
===> Loading dataset(s)
All Db descs: (254064, 4096)
All Qry descs: (24000, 4096)
Next, I used the first 10000 images as the train set, the following 3000 images as validation, and the next 3000 images as the test set. I also generated 3 .db files as train_mat_file, test_mat_file, and val_mat_file.
Thereafter, I wrote the specification in get_datasets.py, with the indices defined in the Nordland dataset format: trainInds, testInds, valInds = np.arange(10000), np.arange(10000,13000), np.arange(13000,16000)
I thought I could then test the dataset using your pretrained model, but index-related errors still occur.
//////////////////////////////////////////
Restored flags: ['--optim', 'SGD', '--lr', '0.0001', '--lrStep', '50', '--lrGamma', '0.5', '--weightDecay', '0.001', '--momentum', '0.9', '--seed', '123', '--runsPath', './data/runs', '--savePath', './data/runs/Jun03_15-22-44_l10_l10_w5_seqnetEnv/checkpoints', '--patience', '0', '--pooling', 'seqnet', '--w', '5', '--outDims', '4096', '--margin', '0.1']
Namespace(batchSize=16, cacheBatchSize=24, cachePath='./data/cache', cacheRefreshRate=0, ckpt='latest', dataset='pitts250k', descType='netvlad-pytorch', evalEvery=1, expName='0', extractOnly=False, lr=0.0001, lrGamma=0.5, lrStep=50.0, margin=0.1, mode='test', momentum=0.9, msls_trainCity='melbourne', msls_valCity='austin', nEpochs=200, nGPU=1, nocuda=False, numSamples2Project=-1, optim='SGD', outDims=4096, patience=0, pooling='seqnet', predictionsFile=None, resultsPath=None, resume='./data/runs/Jun03_15-22-44_l10_w5/', runsPath='./data/runs', savePath='./data/runs/Jun03_15-22-44_l10_l10_w5_seqnetEnv/checkpoints', seed=123, seqL=5, seqL_filterData=None, split='test', start_epoch=0, threads=8, w=5, weightDecay=0.001)
===> Loading dataset(s)
All Db descs: (254064, 4096)
All Qry descs: (24000, 4096)
===> Evaluating on test set
====> Query count: 800
===> Building model
=> loading checkpoint './data/runs/Jun03_15-22-44_l10_w5/checkpoints/checkpoint.pth.tar'
=> loaded checkpoint './data/runs/Jun03_15-22-44_l10_w5/checkpoints/checkpoint.pth.tar' (epoch 200)
===> Running evaluation step
====> Extracting Features
==> Batch (50/250) ... ==> Batch (250/250)
Average batch time: 0.006786982536315918 0.009104941585046229
torch.Size([3000, 4096]) torch.Size([3000, 4096])
====> Building faiss index
====> Calculating recall @ N
Using Localization Radius: 25
Traceback (most recent call last):
  File "main.py", line 133, in <module>
    recallsOrDesc, dbEmb, qEmb, rAtL, preds = test(opt, model, encoder_dim, device, whole_test_set, writer, epoch, extract_noEval=opt.extractOnly)
  File "/home/lx/lx/Seqnet_new/test.py", line 138, in test
    rAtL.append(getRecallAtN(n_values, predictions, gtAtL))
  File "/home/lx/lx/Seqnet_new/test.py", line 37, in getRecallAtN
    if len(gt[qIx]) == 0:
IndexError: list index out of range
/////////////////////////////////////////
I am writing to ask:
Thanks for the help!