Open zxpdemonio opened 3 weeks ago
Looks good. We will merge it soon.
Is the patch submitted to upstream? I couldn't find it at: https://lists.freedesktop.org/archives/amd-gfx/2024-October/ https://lists.freedesktop.org/archives/amd-gfx/2024-November/
PeerDirect isn't upstreamable, so it cannot go to amd-staging-drm-next (which is what the amd-gfx mailing list is for). You'll notice that kfd_peerdirect.c isn't even present in the amd-staging-drm-next branch. However, I can confirm that this patch was picked up internally and will be in the upcoming ROCm release.
In amd_acquire(), kfd_get_process() is called to look up the process. The check on its return value should not test for a NULL pointer, because kfd_get_process() returns ERR_PTR(-EINVAL) rather than NULL on error.
Because of this incorrect check, the kernel panics whenever kfd_get_process() returns ERR_PTR(-EINVAL).
So, the correct logic should be: if (IS_ERR(p)) {
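To illustrate why a NULL test misses this failure, here is a minimal userspace sketch. It is not the kernel code: `err_ptr()`, `is_err()` and `ptr_err()` are simplified stand-ins for the kernel's `ERR_PTR()`, `IS_ERR()` and `PTR_ERR()`, and `fake_process` and `fake_get_process()` are hypothetical names that only mimic a lookup reporting failure as an error pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same bound the kernel uses: the top 4095 addresses encode errno values. */
#define MAX_ERRNO 4095

/* Simplified stand-in for ERR_PTR(): encode a negative errno as a pointer. */
static inline void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Simplified stand-in for IS_ERR(): true if ptr lies in the errno range. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Simplified stand-in for PTR_ERR(): recover the errno from the pointer. */
static inline long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* Hypothetical type, used only for this illustration. */
struct fake_process {
    int id;
};

static struct fake_process demo_process = { .id = 1 };

/*
 * Hypothetical lookup that, like the function described above, reports
 * failure as an error pointer and never returns NULL.
 */
static struct fake_process *fake_get_process(int should_fail)
{
    if (should_fail)
        return err_ptr(-EINVAL);
    return &demo_process;
}

/* The old style of check: only detects NULL, so an error pointer slips by. */
static int null_check_detects_failure(const struct fake_process *p)
{
    return p == NULL;
}

/* The corrected style of check: detects the encoded error. */
static int is_err_check_detects_failure(const struct fake_process *p)
{
    return is_err(p);
}
```

With a failing lookup, `null_check_detects_failure()` returns 0 because the error pointer is non-NULL, so a caller would go on to dereference an invalid address. `is_err_check_detects_failure()` returns nonzero, and `ptr_err()` recovers `-EINVAL`.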
Fixes: 779b4d05a1c9 ("drm/amdkfd: Add RDMA and PeerDirect support")
Fixes: #175