daifeng2016 / End-to-end-CD-for-VHR-satellite-image

The project aims to contribute to geoscience community

Questions about the paper #9


Bobholamovic commented 4 years ago

Many thanks for the code. I've read your paper and your work is impressive. However, in the experiment section, I have three questions that confuse me:

  1. According to the original repo, FC-EF has about 1.35M parameters, counted manually or with tools like torchsummary, but your paper reports 7.77M. Something similar happens with FC-Siam-conc and FC-Siam-diff. Does an implementation difference (PyTorch versus, possibly, Keras?) account for the 6.42M-parameter gap, or did you modify the original architecture, for example by widening the channels? (A minimal counting sketch follows this list.)

  2. The Lebedev dataset comes with a training set, a validation set, and a test set. Judging by the statement

  > The main reason lies in the fact that 70,000 training sets and 21,000 validation sets are employed after data augmentation, ...

it seems that you also performed data augmentation on the validation set. Did you train your model on both the training and validation sets, and evaluate it on the test set?

  3. My last question is about the reported indexes. As far as I know, there are two ways to compute an average metric: the first accumulates TP, FP, FN, and TN over the whole dataset and then computes a single overall index; the second computes the index for each image and averages the per-image values. On this dataset, my own experiments show that the two ways can give quite different results. Can you tell me which way you used to report the indexes?
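For reference on question 1, here is a minimal counting sketch in PyTorch; it is not from either repo, and `FCEF` is a hypothetical stand-in for the model class in the original implementation:

```python
# Minimal parameter-counting sketch in PyTorch. `FCEF` is a hypothetical
# stand-in for the model class from the original FC-EF repo.
import torch

def count_parameters(model: torch.nn.Module) -> int:
    # Sum the element counts of all trainable parameters.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Hypothetical usage:
# model = FCEF(in_channels=6, num_classes=2)  # two RGB images stacked
# print(f"{count_parameters(model) / 1e6:.2f}M parameters")
# torchsummary reports the same total:
# from torchsummary import summary
# summary(model, input_size=(6, 256, 256))
```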

Sorry this got a bit long. Thank you.

daifeng2016 commented 4 years ago

Hi, as for your questions:

1) The kernel number of each convolution layer used in our paper is different from the original literature, owing to the high complexity of satellite images.

2) In our experiments, the training data and the validation data are merged to form a new training dataset for augmentation, and the validation data is then randomly sampled from it. No augmentation is done on the test data, which is only used for accuracy evaluation.

3) In theory, it is more accurate to compute the metric indexes over the whole dataset, as long as you have enough memory. Otherwise, the metric indexes should be computed on single images or batches of images.
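A minimal sketch of the split procedure described in point 2, assuming it is understood correctly; this is not the authors' actual pipeline, and `train_pairs`, `val_pairs`, and `augment` are hypothetical placeholders:

```python
# Sketch: merge the original training and validation sets, augment the
# merged pool, then randomly sample a new validation subset from it.
# The test set is left untouched. All names here are hypothetical.
import random

def build_splits(train_pairs, val_pairs, augment, val_fraction=0.2, seed=0):
    pool = list(train_pairs) + list(val_pairs)              # merge train + val
    pool = [aug for pair in pool for aug in augment(pair)]  # augment the pool
    random.Random(seed).shuffle(pool)
    n_val = int(len(pool) * val_fraction)                   # fraction is illustrative
    return pool[n_val:], pool[:n_val]                       # new train, new val
```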

Bobholamovic commented 4 years ago

Thank you very much for your answers. However, the third point is still not clear to me: did you compute the metric indexes over the whole dataset, or on single images or batches of images? Could you further elucidate?
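To make the distinction in my third question concrete, here is a minimal sketch of the two strategies, using F1 as an example and assuming the change maps are binary NumPy arrays (`preds` and `targets` are hypothetical lists of 0/1 maps):

```python
import numpy as np

def f1_from_counts(tp, fp, fn):
    # F1 = 2*TP / (2*TP + FP + FN); guard against an empty denominator.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def dataset_level_f1(preds, targets):
    # Strategy 1: accumulate TP/FP/FN over the whole test set,
    # then compute one F1 from the totals.
    tp = fp = fn = 0
    for p, t in zip(preds, targets):
        tp += int(np.sum((p == 1) & (t == 1)))
        fp += int(np.sum((p == 1) & (t == 0)))
        fn += int(np.sum((p == 0) & (t == 1)))
    return f1_from_counts(tp, fp, fn)

def image_level_f1(preds, targets):
    # Strategy 2: compute F1 per image, then average the per-image scores.
    scores = []
    for p, t in zip(preds, targets):
        tp = int(np.sum((p == 1) & (t == 1)))
        fp = int(np.sum((p == 1) & (t == 0)))
        fn = int(np.sum((p == 0) & (t == 1)))
        scores.append(f1_from_counts(tp, fp, fn))
    return float(np.mean(scores))
```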