Closed: yuanmao closed this issue 2 years ago
@yuanmao Hi there! You caught a bug, my apologies in advance if this caused any inconveniences for your research :cry: I have fixed it in 0.8.1 https://github.com/lucidrains/perceiver-pytorch/releases/tag/0.8.1
Thanks again for the great work. I have two questions related to the weight sharing implementation:
`nn.module` in the original paper (https://github.com/deepmind/deepmind-research/blob/826ff89f21e5143dc68ff7cb33f01cc6e237844d/perceiver/perceiver.py#L470), while here it uses `cache_fn` to achieve that. I'm not sure whether they are equivalent and, if so, do you see any performance benefit to using the `cache_fn` method?
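For readers comparing the two approaches, here is a minimal sketch of why they can be equivalent. It assumes `cache_fn` is a memoizing wrapper around a module factory. The version below is an illustration and not necessarily the library's exact code. `Layer` and `build_layers` are hypothetical names, and a plain Python class stands in for an `nn.Module` so the sketch runs without PyTorch.

```python
from functools import wraps


def cache_fn(f):
    """Memoize a factory: build the object once, then keep returning it."""
    cache = None

    @wraps(f)
    def cached_fn(*args, _cache=True, **kwargs):
        nonlocal cache
        if not _cache:
            # Caching disabled: build a fresh, unshared object on every call.
            return f(*args, **kwargs)
        if cache is None:
            cache = f(*args, **kwargs)
        return cache

    return cached_fn


class Layer:
    """Hypothetical stand-in for an nn.Module; `weight` plays the role of its parameters."""

    def __init__(self):
        self.weight = [0.0]


def build_layers(depth, share_weights):
    # Each call to build_layers gets its own cache, so sharing stays local to one stack.
    get_layer = cache_fn(lambda: Layer())
    return [get_layer(_cache=share_weights) for _ in range(depth)]


shared = build_layers(3, share_weights=True)
unshared = build_layers(3, share_weights=False)

shared[0].weight[0] = 1.0  # an update through one entry is visible through all of them
print(shared[0] is shared[2], shared[2].weight[0])
print(unshared[0] is unshared[2])
```

Under these assumptions the cached factory hands back the same object every time, which is the same thing as constructing one module and reusing it at every depth. The parameters are shared either way, so any difference is in how the code is organized and not in runtime performance.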