Model size - Githubissues

mrchizx commented 4 years ago

Hi,

The paper mentions that the model is ~31MB, but the official PWC-Net alone is ~41MB. Does the model size reported in the paper exclude PWC-Net?

sniklaus commented 4 years ago

Thank you for bringing this up!

This includes the weights of our entire pipeline, including PWC-Net. You are correct that the official PWC-Net is already bigger than that. We used a smaller PWC-Net as per my comments below.

1) The feature pyramid extractor in the PWC-Net paper [1] has two convolutions per feature level, whereas the official implementation uses three convolutions per level. We followed the paper description.

2) The optical flow estimators in the PWC-Net paper [1] are not densely connected, whereas the official implementation uses dense connections. We followed the paper description.

3) The optical flow estimators in the PWC-Net paper [1] receive the features from the first frame, the cost volume, and the previous flow estimate. The official implementation also inserts features from previous optical flow estimations (context). We followed the paper description.

4) We omitted the refinement/context network due to resource constraints on our end (we wanted to keep the computational footprint small to be able to run more experiments).

[1] https://arxiv.org/abs/1709.02371

mrchizx commented 4 years ago

Thanks for the reply.

The PWC-Net you implemented here: https://github.com/sniklaus/pytorch-pwc has a model size of 35.7MB. Did you use this version or a different one for this paper?

It would be clearer if you could provide the model size of the remaining modules, excluding PWC-Net.

Thanks

sniklaus commented 4 years ago

The implementation you linked is modeled after the official PWC-Net. As per my previous comment, we do not use the official implementation but what is described in the PWC-Net paper instead.

This alternative PWC-Net has 4198620 parameters, our synthesis network has 3266232 parameters, and our feature pyramid extractor has 204000 parameters.
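As a rough sanity check, the three parameter counts above can be converted into an approximate file size. The sketch below assumes every parameter is stored as a 32-bit float (4 bytes), which this thread does not state explicitly; the dictionary keys and variable names are illustrative only.

```python
# Parameter counts quoted in the comment above.
param_counts = {
    "pwc_net_paper_variant": 4198620,
    "synthesis_network": 3266232,
    "feature_pyramid_extractor": 204000,
}

# Assumption: weights are stored as float32, i.e. 4 bytes per parameter.
BYTES_PER_PARAM = 4

total_params = sum(param_counts.values())
total_bytes = total_params * BYTES_PER_PARAM

size_mb = total_bytes / 1000**2   # decimal megabytes
size_mib = total_bytes / 1024**2  # binary mebibytes

print(f"{total_params} parameters")
print(f"{size_mb:.1f} MB ({size_mib:.1f} MiB)")
```

Under that assumption the total is about 7.67 million parameters, or roughly 30.7 MB, which is in line with the ~31MB figure from the paper.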

mrchizx commented 4 years ago

Thank you very much for the detailed information.

zhuyeye commented 3 years ago

Hi sniklaus, could you provide the training details of your smaller PWC-Net, such as the learning rate, training data, and data augmentation?

Thanks!

sniklaus commented 3 years ago

Encoder: six blocks of conv-lrelu-conv-lrelu, where the first conv in each block has a stride of 2 and the output channel size per block is 16, 32, 64, 96, 128, 192 respectively
Decoder: conv-lrelu-conv-lrelu-conv-lrelu-conv-lrelu-conv-lrelu-conv without any dense connections, where the output channel size per conv is 128, 128, 96, 64, 32, 2 respectively

All convs have a kernel size of 3, all leaky ReLUs have a negative slope of 0.1, and the model was trained with the usual augmentations (crop, rotate, adjust hue/contrast/saturation/brightness, etc). As for the learning rate, I would recommend you try a few to see what works best for you.
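To relate this description back to the model-size question, the layer widths above can be turned into a parameter count. The following is a minimal sketch, not the author's code: it assumes a 3-channel (RGB) input, that every conv has a bias, and that the second conv in each encoder block keeps the block's channel count. The decoder's input width is not given in this thread, so it is left as an argument; all function names are hypothetical.

```python
def conv_params(c_in, c_out, kernel=3, bias=True):
    """Weights (plus optional biases) of one 2D convolution."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def stack_params(c_in, out_channels):
    """Parameters of a plain chain of 3x3 convs with the given output widths."""
    total = 0
    for c_out in out_channels:
        total += conv_params(c_in, c_out)
        c_in = c_out
    return total


# Channel widths taken from the description above.
ENCODER_BLOCKS = [16, 32, 64, 96, 128, 192]
DECODER_CONVS = [128, 128, 96, 64, 32, 2]


def encoder_params(image_channels=3):
    # Assumption: each block is conv(c_in -> c_out), conv(c_out -> c_out).
    total = 0
    c_in = image_channels
    for c_out in ENCODER_BLOCKS:
        total += stack_params(c_in, [c_out, c_out])
        c_in = c_out
    return total


def decoder_params(c_in):
    # c_in is the width of the decoder input, which this thread does not specify.
    return stack_params(c_in, DECODER_CONVS)


print("encoder parameters:", encoder_params())
```

The stride and the leaky ReLUs do not affect the count, so only the channel widths and the kernel size matter here. A full tally for the flow network would also need the per-level decoder input widths and the number of decoders, neither of which is stated above.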

sniklaus / softmax-splatting

Model size #16