qcl1994 closed this issue 6 years ago.
Yes, I used the 256^2 images to save time and disk space. I experienced similar loss behavior in training. I would not worry about the loss behavior so much as the behavior of the descriptor layer loaded from the snapshotted weights. If you plot the precision-recall curves from multiple snapshots, you should see them behave differently despite similar average loss values.
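For example, something along these lines would show the difference. This is only a sketch: `compute_descriptors()`, the image sets, the ground-truth matrix, and the snapshot filenames are all placeholders for whatever evaluation pipeline you use.

```python
# Sketch: compare precision-recall curves from several training snapshots.
# compute_descriptors(), query_imgs, db_imgs, and gt_matches are hypothetical
# placeholders for your own descriptor-extraction and ground-truth setup.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve

def pr_for_snapshot(weights_path, query_imgs, db_imgs, gt_matches):
    q = compute_descriptors(weights_path, query_imgs)  # (Nq, D), hypothetical helper
    d = compute_descriptors(weights_path, db_imgs)     # (Nd, D)
    q /= np.linalg.norm(q, axis=1, keepdims=True)      # normalize for cosine similarity
    d /= np.linalg.norm(d, axis=1, keepdims=True)
    sims = q @ d.T
    best = sims.argmax(axis=1)                         # best database match per query
    scores = sims[np.arange(len(q)), best]             # match confidence
    correct = gt_matches[np.arange(len(q)), best]      # was the best match correct?
    return precision_recall_curve(correct, scores)

for snapshot in ["snapshot_iter_50000", "snapshot_iter_100000"]:  # placeholder names
    precision, recall, _ = pr_for_snapshot(snapshot, query_imgs, db_imgs, gt_matches)
    plt.plot(recall, precision, label=snapshot)

plt.xlabel("Recall")
plt.ylabel("Precision")
plt.legend()
plt.show()
```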
Thanks for the quick response! But I still cannot understand why the loss did not go down while the system still achieved better performance.
In the end it doesn't really matter whether the net learns the HOG descriptor for each warped image, because the HOG here is not very good at place recognition on its own, so a decrease in loss may not be beneficial. The intermediate features are what we want, and since the loss is nonzero, the propagated gradient is nonzero as well, so the weights are updated every iteration. It turns out that the HOG provides just enough of a geometric prior to update the weights beneficially, up to a certain point, given sufficient training data.
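To make that concrete, here is a rough PyTorch sketch of the training setup (not our actual implementation; the architecture, image size, and HOG length are just placeholders). The regression target is the HOG of the unwarped image, so the loss plateaus well above zero, yet the nonzero gradient keeps updating the encoder whose flattened output is the descriptor we actually use.

```python
# Rough sketch of the objective: reconstruct the HOG of the original image
# from a randomly warped copy. Layer sizes, image size, and HOG length are
# placeholders, not the real model's values.
import torch
import torch.nn as nn

class CalcLikeNet(nn.Module):
    def __init__(self, hog_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 4, 3, stride=1, padding=1), nn.ReLU(),
            nn.Flatten(),  # this flattened activation is the place descriptor
        )
        self.head = nn.Linear(4 * 30 * 40, hog_dim)  # 30x40 follows from a 120x160 input

    def forward(self, x):
        descriptor = self.encoder(x)
        return descriptor, self.head(descriptor)

model = CalcLikeNet(hog_dim=3648)   # HOG length depends on your HOG parameters
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

# warped: randomly perspective-warped grayscale batch; hog_target: HOG of the
# corresponding unwarped images (dummy tensors here, just to show the shapes).
warped = torch.rand(8, 1, 120, 160)
hog_target = torch.rand(8, 3648)

descriptor, pred = model(warped)
loss = criterion(pred, hog_target)
optimizer.zero_grad()
loss.backward()   # nonzero loss, so the gradient is nonzero and weights keep updating
optimizer.step()
```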
Thanks for your reply! Another question: why did you use HOG rather than GIST or another global image descriptor?
That's a good question! We did cross-validate a few types of descriptors, and also tried using just the images with no descriptor, but we did not try GIST. We chose HOG as one possibility since a lot of recent papers have used HOG to achieve some incredible place recognition results (going with accuracy over efficiency). We cite a few in our paper. It is infeasible to try every type of descriptor since there are so many out there. Since this feed is really for issue tracking, I think we should continue over email if you have further questions (nmerrill@udel.edu).
Hi, I used the Places dataset (256×256) to train the convolutional autoencoder, but the training loss declined from 130 to about 60 and then would not go any lower. Could you tell me what's wrong? Thank you so much.