Closed moshicaixi closed 2 years ago
Hi, I also ran into this problem, and finally realized that the author flattens data sequences of different lengths into one sequence. For example, suppose batch_size is set to 4, three of the samples each load 8000 points, and the remaining sample loads only 7000 points. The default collate_fn will try to stack them into a (4, 8000, 3) tensor, which raises an error because one sample has only 7000 points. So it is better to flatten these points into (8000*3+7000, 3) and use an offset sequence to record where each sample ends.
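To make the flattening concrete, here is a minimal sketch of such a collate step. It is not the repository's actual collate_fn: the name `collate_with_offsets` is hypothetical, and it uses plain Python lists to stay dependency-free. With tensors, the same idea would be written with `torch.cat` for the points and `torch.cumsum` over the per-sample lengths.

```python
from itertools import accumulate


def collate_with_offsets(samples):
    """Flatten variable-length point clouds into one sequence plus offsets.

    samples: list of point clouds, each a list of (x, y, z) tuples.
    Returns (points, offsets), where offsets[i] is the END index
    (exclusive) of sample i in the flattened list.
    """
    # Concatenate all clouds along the point axis.
    points = [p for cloud in samples for p in cloud]
    # Running sum of the cloud sizes gives the boundary of each sample.
    offsets = list(accumulate(len(cloud) for cloud in samples))
    return points, offsets


# Four dummy clouds with 8000, 8000, 8000 and 7000 points.
sizes = [8000, 8000, 8000, 7000]
batch = [[(0.0, 0.0, 0.0)] * n for n in sizes]
points, offsets = collate_with_offsets(batch)
print(len(points), offsets)  # 31000 [8000, 16000, 24000, 31000]
```

The flattened result has no batch dimension at all. The offsets are the only thing that says which points belong to which sample.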
Hi, thanks for your reply. Is the flattening done during data preprocessing, when the dataloader loads the data? Maybe I need to read the code from scratch.
I agree, so they use the offset to tell the length of each shape's point cloud.
Hi @moshicaixi,
Sorry for the late reply 🙏 . The comment of @yifliu3 is exactly right. Using offsets makes it possible to construct a mini-batch of point clouds whose cardinalities are not the same. By the way, the nearest neighbor search on those offset-informed mini-batches is implemented by the first author of Point Transformer.
Hope this helps your understanding 😄 .
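As a rough illustration of what a nearest neighbor search on an offset-informed mini-batch means, the sketch below does a brute-force search that only looks inside each sample's own segment. This is not the CUDA implementation mentioned above: the function name `knn_within_segments` is hypothetical, and the quadratic loop is only suitable for tiny inputs.

```python
def knn_within_segments(points, offsets, k):
    """For each point, return the indices of its k nearest neighbors,
    searching only among points of the same sample.

    points:  flattened list of (x, y, z) tuples.
    offsets: END index (exclusive) of each sample in `points`.
    Each point counts as its own nearest neighbor (distance 0).
    """
    neighbors = []
    start = 0
    for end in offsets:
        for i in range(start, end):
            # Squared distances from point i to every point in its segment.
            dists = sorted(
                (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
                for j in range(start, end)
            )
            neighbors.append([j for _, j in dists[:k]])
        start = end
    return neighbors


# Two tiny samples: 3 points, then 2 points.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0),
          (0.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
offsets = [3, 5]
nbrs = knn_within_segments(points, offsets, k=2)
print(nbrs[0])  # [0, 1]
print(nbrs[3])  # [3, 4]
```

Point 3 has the same coordinates as point 0, but it never shows up as a neighbor of point 0 because it belongs to a different sample.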
Note that the offset is the boundary (end index) of each sample, not the base (start index). For the above example with (8000, 8000, 8000, 7000) points, set o = [8000, 16000, 24000, 31000]. If you pass base offsets instead (e.g. [0, 8000, ...]), you will get an illegal memory access CUDA error or something similar in pointops_cuda.
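A small sanity check before calling into the CUDA ops can catch the base-versus-boundary mix-up early. The sketch below is only an illustration: `check_boundary_offsets` and `offsets_to_slices` are hypothetical helpers, not part of pointops, and the check assumes every sample has at least one point.

```python
def check_boundary_offsets(offsets, num_points):
    """Raise ValueError unless offsets look like END indices of samples.

    Assumes no sample is empty, so offsets must be strictly increasing.
    """
    if not offsets or offsets[0] <= 0:
        raise ValueError("first offset must be the END of sample 0, not 0")
    if any(b <= a for a, b in zip(offsets, offsets[1:])):
        raise ValueError("offsets must be strictly increasing")
    if offsets[-1] != num_points:
        raise ValueError("last offset must equal the total number of points")


def offsets_to_slices(offsets):
    """Turn boundary offsets into one slice per sample."""
    starts = [0] + list(offsets[:-1])
    return [slice(s, e) for s, e in zip(starts, offsets)]


good = [8000, 16000, 24000, 31000]  # boundary offsets
check_boundary_offsets(good, 31000)  # passes silently
lengths = [s.stop - s.start for s in offsets_to_slices(good)]
print(lengths)  # [8000, 8000, 8000, 7000]

bad = [0, 8000, 16000, 24000]  # base offsets
try:
    check_boundary_offsets(bad, 31000)
except ValueError as err:
    print("rejected:", err)
```

The slices also recover the per-sample lengths, which is how the offset doubles as a record of each point cloud's size.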
Hi, thanks very much for your excellent code. I am curious about the input shape of the Transformer layer, as depicted in the following picture. I am confused that there is no batch dimension, and what is o (offset)? In other papers, the point cloud shape is usually represented as (B, C, N) or (B, N, C), which is easy to understand, but I got confused when reading your code. Hoping for your guidance!