mit-han-lab / lite-transformer

[ICLR 2020] Lite Transformer with Long-Short Range Attention
https://arxiv.org/abs/2004.11886

about kernel size #37

Closed sanwei111 closed 2 years ago

sanwei111 commented 3 years ago

parser.add_argument('--decoder-kernel-size-list', nargs='*', default=[3, 7, 15, 31, 31, 31, 31], type=int)
parser.add_argument('--encoder-kernel-size-list', nargs='*', default=[3, 7, 15, 31, 31, 31, 31], type=int)

As you can see, the code above sets the kernel size for each encoder and decoder layer. I just wonder why the kernel sizes are not kept the same across all layers?
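For context, here is a minimal, self-contained sketch of how such `nargs='*'` options are parsed; the flag names match the ones above, while the override shown on the last lines is hypothetical and only illustrates how one could keep the kernel size identical across layers:

```python
import argparse

# One kernel size per encoder/decoder layer; nargs='*' collects all
# following integers into a single list.
parser = argparse.ArgumentParser()
parser.add_argument('--encoder-kernel-size-list', nargs='*', type=int,
                    default=[3, 7, 15, 31, 31, 31, 31])
parser.add_argument('--decoder-kernel-size-list', nargs='*', type=int,
                    default=[3, 7, 15, 31, 31, 31, 31])

# Hypothetical override: use the same kernel size for every layer.
args = parser.parse_args('--encoder-kernel-size-list 31 31 31 31 31 31 31'.split())
print(args.encoder_kernel_size_list)  # [31, 31, 31, 31, 31, 31, 31]
```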

Michaelvll commented 2 years ago

Thank you for asking! We follow the setup from Pay Less Attention with Lightweight and Dynamic Convolutions.
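For intuition (not stated in this thread): in that setup the kernel size grows with layer depth, so deeper convolution layers cover a wider context. A minimal sketch of stacked depthwise 1-D convolutions driven by such a per-layer list; the module, channel count, and tensor shapes are illustrative and not the repository's actual code:

```python
import torch
import torch.nn as nn

class GrowingKernelConvStack(nn.Module):
    """Illustrative stack of depthwise 1-D convolutions whose kernel
    size follows a per-layer list such as [3, 7, 15, 31, 31, 31, 31]."""

    def __init__(self, channels, kernel_size_list):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Conv1d(channels, channels, kernel_size=k,
                      padding=k // 2, groups=channels)  # depthwise, length-preserving
            for k in kernel_size_list
        )

    def forward(self, x):  # x: (batch, channels, time)
        for conv in self.layers:
            x = conv(x)
        return x

# Receptive field grows layer by layer: 3 -> 9 -> 23 -> 53 -> ...
model = GrowingKernelConvStack(channels=8,
                               kernel_size_list=[3, 7, 15, 31, 31, 31, 31])
out = model(torch.randn(2, 8, 50))
print(out.shape)  # torch.Size([2, 8, 50])
```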