Hi Lei,
Yes, the attention module is definitely the bottleneck in terms of memory and time complexity. You can try reducing the size of the input feature vector L or the dimensions of the intermediate SFA blocks in the attention module. Sparse attention may also help. I didn't perform a ton of ablations with different attention module architectures, but performance did seem fairly robust to changes in dimension size.
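For illustration, here is roughly where the cost comes from (a toy sketch, not the actual Point2SSM attention module; `d_model` stands in for the SFA block width):

```python
# Minimal sketch of naive cross-attention, showing where the quadratic cost
# comes from. `d_model` is a stand-in for the intermediate block width.
import torch
import torch.nn as nn

class NaiveCrossAttention(nn.Module):
    def __init__(self, d_model: int = 128):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, query_feats, source_feats):
        # query_feats: (B, N, d_model), source_feats: (B, M, d_model)
        q = self.q_proj(query_feats)
        k = self.k_proj(source_feats)
        v = self.v_proj(source_feats)
        # The (B, N, M) score matrix is the bottleneck: O(N*M) activations
        # and O(N*M*d_model) multiply-adds per layer.
        scores = torch.einsum("bnd,bmd->bnm", q, k) * self.scale
        attn = scores.softmax(dim=-1)
        return torch.einsum("bnm,bmd->bnd", attn, v)
```

Shrinking `d_model` reduces the projection and multiply-add cost, while the (N, M) score matrix itself only shrinks if you reduce the number of points or avoid materializing it (e.g., memory-efficient or sparse attention).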
Jadie
Thanks for your prompt reply, I modified the attention to an memory-efficient attention from Xformer and use Mix Precision training, these two engineering modifications indeed help to train 2048 points faster and seems a little improve the best chamfer distance metric. If you allow me, I can contribute these two modifications to your Point2SSM and Point2SSM++.
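For reference, the two changes look roughly like this (a sketch with placeholder names such as `model`, `loss_fn`, and `train_step`, not the actual Point2SSM training code):

```python
# Sketch of the two modifications: xFormers memory-efficient attention plus
# mixed-precision training. Names here are placeholders for illustration.
import torch
from torch.cuda.amp import GradScaler, autocast
import xformers.ops as xops

def efficient_cross_attention(q, k, v):
    # q, k, v: (B, seq_len, num_heads, head_dim) tensors on GPU.
    # Computes the same result as softmax(QK^T)V without materializing the
    # full (N, M) score matrix; pairs well with fp16/bf16 under autocast.
    return xops.memory_efficient_attention(q, k, v)

def train_step(model, optimizer, scaler, points, loss_fn):
    optimizer.zero_grad(set_to_none=True)
    with autocast():                      # half-precision forward pass
        pred = model(points)
        loss = loss_fn(pred, points)
    scaler.scale(loss).backward()         # scaled gradients to avoid underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()

scaler = GradScaler()
```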
I really like your SSM works :+1: Best, Lei
That's great! Yeah if you make a pull request I'd be happy to add your contributions.
Delighted to contribute to your work!
Hi Jadie,
Thank you for your great work. I've noticed that when I increase the number of output points to 2048 and use a larger dataset with about 800 samples, training becomes quite slow. Do you have any suggestions for getting more correspondences, like 2048 or more points, on a larger dataset?
I know the time complexity of cross-attention is O(n²); do you think sparse attention would help?
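For context, here is the kind of sparse (windowed) attention I have in mind, purely as a sketch; a real version for point clouds would pick neighbors spatially (e.g., with k-NN) rather than by index, and would vectorize the loop:

```python
# Illustrative windowed ("sparse") cross-attention: each of the N queries
# attends to only w keys, so the score tensor is (B, N, w) instead of
# (B, N, M), i.e. O(N*w) memory rather than O(N*M).
import torch

def windowed_attention(q, k, v, window: int = 64):
    # q: (B, N, d), k/v: (B, M, d); assumes M >= window.
    B, N, d = q.shape
    M = k.shape[1]
    out = torch.empty_like(q)
    for i in range(N):
        # Center a window of keys/values on the proportionally matching index.
        center = int(i * M / N)
        lo = max(0, min(center - window // 2, M - window))
        k_win = k[:, lo:lo + window]          # (B, w, d)
        v_win = v[:, lo:lo + window]
        scores = torch.einsum("bd,bwd->bw", q[:, i], k_win) / d ** 0.5
        out[:, i] = torch.einsum("bw,bwd->bd", scores.softmax(-1), v_win)
    return out
```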
Best, Lei