Hi guys,
I used your algorithm for calculating the 3D IoU to implement 3D NMS in CUDA. While experimenting, I occasionally got `Invalid __local__ write of size 16 bytes` errors, which I tracked down with compute-sanitizer to this line: https://github.com/facebookresearch/pytorch3d/blob/fe0b1bae49e7144021a9eb63169e855f51dd4dd3/pytorch3d/csrc/iou_box3d/iou_utils.cuh#L733. I initially thought the problem was in my modification, but a quick breakdown shows that the index can indeed exceed the limit of `MAX_TRIS=100` (see the snippet at the bottom, where I simply assumed that `ClipTriByPlane()` always returns 2).
I just wanted to let you know, even though the limit seems to be sufficient in most practical use cases.
Kind regards
Enrico
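n_max = 0
num_tris = 12  # triangles of the initial box
for p in range(6):  # one pass per clipping plane
    offset = 0
    for t in range(num_tris):
        count = 2  # assumed: ClipTriByPlane() returns 2 triangles every time
        for v in range(count):
            offset += 1
    num_tris = offset
    for j in range(num_tris):
        n_max = max(n_max, j)  # largest index written in this pass
print(n_max)  # 767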
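The same bound can also be tracked per plane, which shows where the limit would first be crossed. This is only a sketch under the same assumption as the snippet above (12 starting triangles, 6 planes, every clip producing 2 triangles). The function name `worst_case_tris` and its parameters are hypothetical and are not part of pytorch3d.

```python
MAX_TRIS = 100  # limit quoted above from iou_utils.cuh


def worst_case_tris(initial_tris=12, num_planes=6, max_per_clip=2):
    """Triangle count after each plane if every clip yields max_per_clip triangles.

    Hypothetical helper: models the worst case assumed above, not real geometry.
    """
    counts = []
    n = initial_tris
    for _ in range(num_planes):
        n *= max_per_clip
        counts.append(n)
    return counts


counts = worst_case_tris()
# Zero-based index of the first plane whose output no longer fits in MAX_TRIS
first_overflow = next((p for p, n in enumerate(counts) if n > MAX_TRIS), None)

print(counts)          # [24, 48, 96, 192, 384, 768]
print(first_overflow)  # 3
```

Under this assumption the count already exceeds `MAX_TRIS` after the fourth plane (192 triangles). The final count of 768 matches the largest index of 767 printed above.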