lx709 closed this issue 11 months ago
Hello @JonasSchult, thanks for sharing the code of your nice work. I'm wondering whether you have checked the effect of sharing the decoder transformer layers. For example, would performance decrease if we set shared_decoder=False?
Hi! Great question. In my experiments, the effect on performance was rather minimal, while sharing the layers saves quite some memory.
Best, Jonas
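For anyone who wants a rough feel for the memory side of this, here is a minimal sketch in plain Python. It is not taken from the repository: the layer layout (self-attention, cross-attention, feed-forward network, three layer norms) and the names `decoder_layer_params`, `decoder_stack_params`, `d_model`, `d_ffn`, and `num_blocks` are all assumptions chosen for illustration. It only counts weights, so it says nothing about activation memory or accuracy.

```python
def decoder_layer_params(d_model: int, d_ffn: int) -> int:
    """Parameter count of one generic transformer decoder layer.

    Hypothetical layout, not the repository's actual layer definition.
    """
    # Q, K, V and output projections, each d_model x d_model plus a bias.
    attention = 4 * d_model * d_model + 4 * d_model
    # Two linear layers: d_model -> d_ffn and d_ffn -> d_model, with biases.
    ffn = (d_model * d_ffn + d_ffn) + (d_ffn * d_model + d_model)
    # Three layer norms, each with a scale and a shift vector.
    layer_norms = 3 * 2 * d_model
    # One self-attention block and one cross-attention block per layer.
    return 2 * attention + ffn + layer_norms


def decoder_stack_params(d_model: int, d_ffn: int, num_blocks: int, shared: bool) -> int:
    """Parameter count when a decoder layer is applied num_blocks times.

    With shared=True a single set of weights is reused for every block;
    with shared=False each block owns its own copy.
    """
    per_layer = decoder_layer_params(d_model, d_ffn)
    return per_layer if shared else num_blocks * per_layer


if __name__ == "__main__":
    # Illustrative sizes only; these are not the model's real hyperparameters.
    for shared in (True, False):
        total = decoder_stack_params(d_model=128, d_ffn=1024, num_blocks=3, shared=shared)
        print(f"shared={shared}: {total} parameters")
```

Under these assumptions the unshared stack holds `num_blocks` times as many decoder weights as the shared one, which is the kind of saving the reply above refers to.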
Thanks for your quick response, much appreciated.