Open emd4600 opened 1 year ago
I'm getting the same error.
Using `collated_batch = get_dict_to_torch(collated_batch, device=self.device, exclude=["image", "mask"])`
works for me but is quite slow. Using `collated_batch = get_dict_to_torch(collated_batch, device=self.device)`
would be faster if your GPU memory is large enough.
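To illustrate the trade-off between the two calls above, here is a minimal sketch. The `get_dict_to_torch` defined below is a simplified stand-in for nerfstudio's helper (the real one also handles nested containers), used only so the example is self-contained:

```python
import torch

def get_dict_to_torch(batch: dict, device: str, exclude=()):
    # Simplified stand-in for nerfstudio's helper (assumption: the real
    # implementation also recurses into nested dicts). Moves every tensor
    # to `device`, skipping keys listed in `exclude`.
    return {
        key: value.to(device)
        if isinstance(value, torch.Tensor) and key not in exclude
        else value
        for key, value in batch.items()
    }

batch = {"image": torch.zeros(2, 4, 4, 3), "mask": torch.ones(2, 4, 4, 1)}

# Slow-but-safe workaround: keep image and mask together on the CPU.
safe = get_dict_to_torch(batch, device="cpu", exclude=["image", "mask"])

# Faster variant (needs enough GPU memory): move everything, mask included.
target = "cuda" if torch.cuda.is_available() else "cpu"
fast = get_dict_to_torch(batch, device=target)
```

Either way, the point is that `image` and `mask` end up on the same device, so indices derived from one can index the other.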
Confirmed, thanks for the suggestion @nepfaff - with semantic-nerfw, looks like changes from #1467 are also needed, but having everything on GPU is indeed both working and fast.
Does latest https://github.com/nerfstudio-project/nerfstudio/commit/a91b92f89d401115b42e6ad295a8b850e02ee86f solve this for others as well? If so, we can close.
I am having the same issue trying to train nerfacto with masks at version 0.19
Well, if you want to make it work, just add three lines, `c = c.cpu()`, `y = y.cpu()`, and `x = x.cpu()`, to nerfstudio/data/pixel_sampler.py after line 98.
The author seems to move the masks to CUDA to accelerate the pixel sampling, so the sampled indices end up on CUDA too; using indices on CUDA to index images on the CPU causes this issue.
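To see why those three lines help, here is a self-contained sketch of the device mismatch (the tensor shapes and sample count are made up for illustration):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

images = torch.rand(4, 8, 8, 3)               # image batch stays on the CPU
mask = torch.ones(4, 8, 8, 1, device=device)  # mask was moved for fast sampling

# Indices sampled from the mask inherit the mask's device.
nonzero = torch.nonzero(mask[..., 0], as_tuple=False)
choice = torch.randint(0, nonzero.shape[0], (16,), device=nonzero.device)
c, y, x = nonzero[choice].unbind(-1)

# The three-line fix: move the indices back to the CPU before indexing
# the CPU-resident images.
c, y, x = c.cpu(), y.cpu(), x.cpu()
pixels = images[c, y, x]  # no device mismatch now
```

On a CUDA machine, removing the `.cpu()` calls reproduces the original RuntimeError, because PyTorch requires index tensors to live on the same device as the tensor being indexed.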
**Describe the bug**
It is impossible to train datasets with masks on Nerfacto due to a runtime error.

**To Reproduce**
Steps to reproduce the behavior:
mask_path

**Expected behavior**
After the dataset loads, the following error appears:
**Potential solution**
The error happens because the pixel sampler uses data from the mask image to get the indices to sample the RGB image itself, but the RGB image is on the CPU, whereas the mask is on the GPU. This is caused by the `_get_collated_batch()` method in `CacheDataloader`, which moves all batch data (including the mask) to the GPU, except the image. This could be fixed by changing `dataloaders.py:110` to:
(and maybe change line 186 as well, if masking is used in evaluation)
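The shape of the proposed change can be sketched as follows. `CacheDataloaderSketch` and its `get_dict_to_torch` are hypothetical reductions written only for this example, not nerfstudio's actual classes:

```python
import torch

def get_dict_to_torch(batch, device, exclude=()):
    # Simplified stand-in for nerfstudio's helper (assumption).
    return {
        k: v.to(device) if isinstance(v, torch.Tensor) and k not in exclude else v
        for k, v in batch.items()
    }

class CacheDataloaderSketch:
    """Hypothetical reduction of CacheDataloader, showing only the change."""

    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"

    def _get_collated_batch(self, collated_batch):
        # Proposed change: also exclude "mask", so that indices derived
        # from the mask share the image's (CPU) device.
        return get_dict_to_torch(
            collated_batch, device=self.device, exclude=["image", "mask"]
        )

batch = {"image": torch.zeros(1, 4, 4, 3), "mask": torch.ones(1, 4, 4, 1)}
out = CacheDataloaderSketch()._get_collated_batch(batch)
```

This mirrors the workaround suggested earlier in the thread, applied at the dataloader level rather than patched at each call site.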