ActiveVisionLab / DFNet

DFNet: Enhance Absolute Pose Regression with Direct Feature Matching (ECCV 2022)
https://dfnet.active.vision
MIT License

Training set data question #8

Closed lck666666 closed 1 year ago

lck666666 commented 1 year ago

Since your NeRF model is not in the same coordinate system as the original training set, is there a way to obtain the training-set poses in your coordinate system?

lck666666 commented 1 year ago

For each dataset, I simply swapped the folder names: the train set became test and the test set became train. Why does your model perform so much worse on the training set than on the test set? For example:

../data/Cambridge/KingsCollege/test/ -> ../data/Cambridge/KingsCollege/train/
../data/Cambridge/KingsCollege/train/ -> ../data/Cambridge/KingsCollege/test/

lck666666 commented 1 year ago

For example, Cambridge KingsCollege has 1220 images in the training set and 343 in the test set. After I swap the folder names for KingsCollege and run the evaluation of your pretrained DFNet_dm model, the result on the training set (1220 images) is not good. However, if I keep the original 343-image test set, the result is close to your paper. I am confused about why your pretrained model performs worse on the training set than on the test set.

lck666666 commented 1 year ago

Cambridge Hospital has 895 images in the training set and 182 in the test set. After I swap the folder names for Hospital and run the evaluation of your pretrained DFNet_dm model, the result on the training set (895 images) is not good. However, if I keep the original 182-image test set, the result is close to your paper. I am confused about why your pretrained model performs much worse on the training set than on the test set.

lck666666 commented 1 year ago


Cambridge Shop has 231 images in the training set and 103 in the test set. After I swap the folder names for Shop and run the evaluation of your pretrained DFNet_dm model, the result on the training set (231 images) is not good. However, if I keep the original 103-image test set, the result is close to your paper. I am confused about why your pretrained model performs much worse on the training set than on the test set.

lck666666 commented 1 year ago

The performance of your pretrained models differs by a factor of 5 to 10 on the training and test sets. Could you please explain why?

lck666666 commented 1 year ago

Another question: we do not use --eval (eval = false in both the config files and on the command line) and add some print statements in DFNet/dataset_loaders/cambridge_scenes.py to log image_name during training. When we just run train.py without eval, the printed image names come from the test set, even without swapping the train and test folder names.

lck666666 commented 1 year ago

@LZL-CS I see you've gone deep into the code. I don't know if you're interested in this.

chenusc11 commented 1 year ago

Hi,

  1. Coordinate system: the NeRF coordinate system is OpenGL. We transform the poses within the data-loading files (fix_ccord). If you want to reverse it, you just need to invert that process (a rough sketch of the usual axis flip is included below). However, I don't know the coordinate system of the original datasets, as I couldn't find any description of it.

  2. Training set results: this is expected. The insight of the unlabeled-training experiments in our paper is to show that our formulation can fine-tune on unlabeled images without using ground-truth poses. We'd encourage you to check out Sec. 4.2 and Sec. 4.3 of the paper.

If you really want good results on both the training and test sets, there is a path. I'd encourage you to try the combined training strategy proposed by the authors of MapNet [4]; see Sec. 3.3 of the MapNet paper: "each mini-batch samples half from the labelled data D and half from the unlabeled data T" (see the sketch below).

In our case, for example, you could combine DFNet training (training images + training GT poses) with unlabeled training (part of the val/test images without GT poses).

In short, training on the labeled data basically overfits the network to it. We didn't implement this in our unlabeled-training experiments because the overall training would take much longer, given our limited computational resources.

  3. We follow MapNet's protocol for the comparison methods. Sec. 4.1 of MapNet: "The unlabeled data used to fine-tune MapNet+ for these experiments are the unlabeled test sequences." This is probably why you see the test set being loaded.
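On point 1, here is a rough, generic sketch of the usual OpenGL/OpenCV axis flip. It is an illustration only, not the repo's fix_ccord code, and the helper name is made up:

```python
import numpy as np

# Hypothetical helper (not the repo's fix_ccord): flips a 4x4 camera-to-world
# pose between the OpenCV-style convention (x right, y down, z forward) and
# the OpenGL/NeRF convention (x right, y up, z backward). The flip is its own
# inverse, so applying it twice recovers the original pose.
def flip_opengl_opencv(c2w: np.ndarray) -> np.ndarray:
    c2w = c2w.copy()
    c2w[:3, 1:3] *= -1.0  # negate the camera y and z axes (rotation columns 1 and 2)
    return c2w

pose = np.eye(4)                    # example camera-to-world pose
pose_gl = flip_opengl_opencv(pose)  # convert to the other convention
assert np.allclose(flip_opengl_opencv(pose_gl), pose)  # round-trips exactly
```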
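On the combined training strategy, a minimal PyTorch sketch of what a half-labelled / half-unlabelled mini-batch could look like; the datasets, shapes, and loss hooks below are placeholders rather than DFNet code:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Sketch of MapNet-style combined training (MapNet Sec. 3.3): each mini-batch
# is half labelled data D (images + GT poses, supervised pose loss) and half
# unlabelled data T (images only, matched against NeRF renders).
labelled = TensorDataset(torch.randn(64, 3, 60, 80), torch.randn(64, 12))  # images + GT poses
unlabelled = TensorDataset(torch.randn(64, 3, 60, 80))                     # images only

half = 4  # half of the total mini-batch size
loader_d = DataLoader(labelled, batch_size=half, shuffle=True, drop_last=True)
loader_t = DataLoader(unlabelled, batch_size=half, shuffle=True, drop_last=True)

# In practice the two sets differ in size, so you would re-iterate the smaller
# loader; here both have 64 samples, so a plain zip is enough for the sketch.
for (img_d, pose_d), (img_t,) in zip(loader_d, loader_t):
    batch = torch.cat([img_d, img_t], dim=0)  # half labelled, half unlabelled
    # Forward the whole batch through the pose regressor, then apply a
    # supervised pose loss on the first `half` predictions (against pose_d)
    # and the unsupervised direct-feature-matching loss on the second `half`.
    assert batch.shape[0] == 2 * half
```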

An interesting way to think about DFNet_dm is as an improved version of APR + iNeRF.

Hope this helps!

Best,

lck666666 commented 1 year ago


Thanks for your reply!