In the current implementation, it is not possible for OctreeAttention to partition a patch at a specified depth using nodes from two adjacent samples in a batch.
I apologize for my unclear wording. I would like to ask about the behavior of nodes at the same depth for two adjacent training samples within the same batch during training. Would they exhibit the connected pattern depicted in the diagram below?
Or does each training sample undergo its own independent padding (blue) during training?
Thanks for your clarification. The attention runs according to the following figure.
However, I build a mask to erase the overlapping attention values, which ensures that consecutive samples in the batch are processed separately: https://github.com/octree-nn/octformer/blob/d727ba08fb801c7340784440ef5e49ec4c53218a/models/octformer.py#L55
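For reference, here is a minimal sketch (not the actual OctFormer code) of how such a mask could be built from the per-node sample indices: positions inside a patch that belong to different samples get -inf added to their attention logits, so the softmax erases any cross-sample interaction. The function name, the sentinel value for padded slots, and the patch layout below are illustrative assumptions, not the repository's API.

```python
import torch

def build_patch_mask(batch_idx: torch.Tensor, patch_size: int) -> torch.Tensor:
    """Additive attention mask that blocks cross-sample attention inside a patch.

    batch_idx:  (N,) sample index of every octree node at the current depth,
                assumed sorted so that nodes of the same sample are contiguous.
    patch_size: number of nodes grouped into one attention patch (K).
    Returns:    (num_patches, K, K) float mask: 0 where attention is allowed,
                -inf where it must be erased.
    """
    num_nodes = batch_idx.shape[0]
    num_patches = (num_nodes + patch_size - 1) // patch_size
    pad = num_patches * patch_size - num_nodes

    # Pad the tail of the last patch with a sentinel index (-1); padded slots
    # only "match" other padded slots, so real nodes never attend to them.
    padded = torch.cat([batch_idx, batch_idx.new_full((pad,), -1)])
    padded = padded.view(num_patches, patch_size)                  # (P, K)

    # Two positions may attend to each other only if they carry the same index.
    same_sample = padded.unsqueeze(2) == padded.unsqueeze(1)       # (P, K, K)
    mask = torch.zeros_like(same_sample, dtype=torch.float32)
    mask.masked_fill_(~same_sample, float("-inf"))
    return mask


# Usage sketch: add the mask to the attention logits before the softmax.
# logits: (num_patches, num_heads, K, K)
# logits = logits + build_patch_mask(batch_idx, K).unsqueeze(1)
```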
I understand now; creating a mask is indeed a logical and practical solution. Thank you very much for the clear explanation!
Hi Professor Wang,
During training, assuming batch_size = 32, would it be possible for OctreeAttention to partition a patch at a specified depth using nodes from two adjacent samples in the batch? If this situation occurs, how should the self-attention be interpreted?
I would greatly appreciate your insights on this matter. I look forward to hearing from you at your earliest convenience.