Closed sryap closed 1 month ago
Name | Link |
---|---|
Latest commit | 7c4b2764b8638eff1e615f583dcdfa282199c270 |
Latest deploy log | https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/66ad6d9e1a208e00082cb34e |
Deploy Preview | https://deploy-preview-2929--pytorch-fbgemm-docs.netlify.app |
This pull request was exported from Phabricator. Differential Revision: D60627636
This pull request has been merged in pytorch/FBGEMM@9cbf073787eca4ff5e296f2ea74fe6adbcd279eb.
Summary: Before this diff, running the SSD-TBE unit tests hit a segmentation fault (P1507485454). The cause was premature tensor deallocation when the unit test invoked `set_cuda`. Because `set_cuda` is asynchronous (non-blocking), the unit test must keep the input tensors alive until `set_cuda` completes. However, the unit test allocated an input tensor inside a for-loop (in stack memory), so the tensor was deallocated as soon as each loop iteration finished, causing the segmentation fault.

This diff fixes the problem by keeping the input tensor alive until `set_cuda` completes: the tensor's scope is moved outside the for-loop and proper synchronization is added.

Differential Revision: D60627636
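The lifetime problem described in the summary can be illustrated with a small, self-contained sketch. This is not the FBGEMM code or the actual unit test: `Buffer`, `DeferredStore`, `set_async`, `synchronize`, `buggy`, and `fixed` are all hypothetical names, and a deferred work queue stands in for the asynchronous `set_cuda` call. A weak reference plays the role of a raw pointer that does not keep its target alive, so a "dangling read" is counted instead of crashing. The sketch assumes CPython, where reference counting frees a temporary as soon as the call that received it returns.

```python
import weakref


class Buffer:
    """Hypothetical stand-in for an input tensor."""

    def __init__(self, values):
        self.values = list(values)


class DeferredStore:
    """Hypothetical stand-in for a backend with a non-blocking set call.

    Work is only queued by set_async() and runs later, in synchronize().
    """

    def __init__(self):
        self._pending = []
        self.stored = []
        self.dangling_reads = 0

    def set_async(self, buf):
        # Like a raw pointer: the weak reference does NOT keep buf alive.
        self._pending.append(weakref.ref(buf))

    def synchronize(self):
        # Drain the queue; this is when the input is actually read.
        for ref in self._pending:
            buf = ref()
            if buf is None:
                # The input was freed before the deferred read happened.
                self.dangling_reads += 1
            else:
                self.stored.append(list(buf.values))
        self._pending.clear()


def buggy(store, n):
    # The input is created inside the loop and is gone once the call returns,
    # before the deferred read takes place.
    for i in range(n):
        store.set_async(Buffer([i, i + 1]))
    store.synchronize()


def fixed(store, n):
    # The input lives outside the loop, and each iteration synchronizes
    # before the buffer is reused.
    buf = Buffer([0, 0])
    for i in range(n):
        buf.values[:] = [i, i + 1]
        store.set_async(buf)
        store.synchronize()
```

Under the CPython assumption above, `buggy` should leave `stored` empty and count one dangling read per iteration, while `fixed` should store every input and count none. In the real C++/CUDA setting the equivalent of a dangling read is undefined behavior, which is consistent with the segmentation fault the summary reports.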