sacmehta / ESPNet

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
https://sacmehta.github.io/ESPNet/
MIT License

Questions about hyperparameters #28

Closed: msson closed this issue 5 years ago

msson commented 5 years ago

Hello. First of all, your work is very impressive and helpful. Thank you!

I have a few questions below.

  1. Which language or framework did you use to deploy the ESPNet code on the TX2 (PyTorch, C, or something else)?
  2. I see that the default batch_size is 12 for ESPNet-C and 6 for ESPNet. Is there a reason that ESPNet's batch_size is half of ESPNet-C's?
  3. Also, the hyperparameter p is fixed at 2, and q is changeable but limited to 3, 5, and 8. If I would like to reduce inference time, is it okay to change those parameters, for example to p=1, q=1? Why do you limit them to p=2 and q=3, 5, 8?

Thanks.

sacmehta commented 5 years ago

1) We used PyTorch. You can also convert PyTorch code to Caffe2 using ONNX, but we did not do that.
2) That is what I was able to fit on my GPU.
3) Yes, you can do so, but we did not try those settings.

msson commented 5 years ago

Thanks for your comment. I have a question about your last answer. I thought you had selected those parameters (p=2, q=3, 5, 8) for some reason. Do you mean that I should try different (p, q) values to find the best one for my own dataset?

sacmehta commented 5 years ago

p and q have a direct impact on the number of parameters and the model size. They were selected based on these two constraints. See page 22 in our paper:

https://arxiv.org/pdf/1803.06815.pdf
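
To make the link between p, q, and parameter count concrete, here is a rough back-of-the-envelope sketch. It is not taken from the ESPNet repository. It assumes p and q count how many ESP-style modules are stacked at two encoder levels, that each module is a 1x1 reduce convolution followed by K parallel 3x3 dilated convolutions, and that the level widths are 64 and 128 channels with K=5. The widths, K, and the function names `esp_module_params` and `stacked_params` are illustrative assumptions. Biases, normalization layers, and all other layers of the network are ignored.

```python
def esp_module_params(channels: int, k: int = 5, kernel: int = 3) -> int:
    """Approximate weight count of one ESP-style module (assumed layout).

    Assumed layout: a 1x1 "reduce" convolution from `channels` to
    channels // k, then k parallel kernel x kernel dilated convolutions,
    each mapping channels // k -> channels // k. Dilation changes the
    receptive field but not the number of weights.
    """
    d = channels // k                      # width of each parallel branch
    reduce_weights = channels * d          # 1x1 convolution
    branch_weights = k * kernel * kernel * d * d
    return reduce_weights + branch_weights


def stacked_params(p: int, q: int, width_p: int = 64, width_q: int = 128) -> int:
    """Weights in p stacked modules of width_p plus q of width_q.

    width_p and width_q are hypothetical defaults chosen for illustration.
    """
    return p * esp_module_params(width_p) + q * esp_module_params(width_q)


if __name__ == "__main__":
    # Compare the settings discussed in this thread.
    for p, q in [(1, 1), (2, 3), (2, 5), (2, 8)]:
        print(f"p={p}, q={q}: ~{stacked_params(p, q)} weights in the stacked modules")
```

Under these assumptions the count grows linearly in both p and q, and each extra unit of q costs more than an extra unit of p because the q-level modules are wider. Smaller values such as p=1, q=1 would shrink the model, but the effect on accuracy has to be measured on your own dataset.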