nerfstudio-project / nerfstudio

A collaboration friendly studio for NeRFs
https://docs.nerf.studio
Apache License 2.0

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory #1117

Open 0LEL0 opened 1 year ago

0LEL0 commented 1 year ago

I had already put the 'poster' dataset into /data/nerfstudio, but when I ran 'ns-train nerfacto --data data/nerfstudio/poster' I got this RuntimeError. How can I solve it? My environment matches the README, except that my CUDA version is 11.8. Could that cause the problem? If not, what should I do?

tancik commented 1 year ago

Can you post the entire stack trace? Also, is data/nerfstudio/poster a .zip file or a folder with transforms.json and other folders in it?

0LEL0 commented 1 year ago

Thank you for your reply. data/nerfstudio/poster is a folder with transforms.json and other folders in it. The zip file appears to have been deleted automatically.

tancik commented 1 year ago

Can you post the entire stack trace, so we can get a better idea of where the code is failing?

0LEL0 commented 1 year ago

Oh, I am sorry. I just sold my GPU and am replacing it with a new one, so I cannot post it now. Once the new one is set up, I will post it. I do remember that the error appeared after some files were loaded. Thanks for your help!

MoZhenJ commented 1 year ago

Hello, I have the same problem:

Traceback (most recent call last):
  File "/home/mitor/anaconda3/envs/nerfstudio/bin/ns-train", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/scripts/train.py", line 247, in entrypoint
    main(
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/scripts/train.py", line 233, in main
    launch(
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/scripts/train.py", line 172, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/scripts/train.py", line 86, in train_loop
    trainer.setup()
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfstudio/engine/trainer.py", line 145, in setup
    self.pipeline = self.config.pipeline.setup(
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfstudio/configs/base_config.py", line 57, in setup
    return self._target(self, **kwargs)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfstudio/pipelines/base_pipeline.py", line 229, in __init__
    self._model = config.model.setup(
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfstudio/configs/base_config.py", line 57, in setup
    return self._target(self, **kwargs)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfstudio/models/base_model.py", line 82, in __init__
    self.populate_modules()  # populate the modules
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/nerfstudio/models/nerfacto.py", line 196, in populate_modules
    self.lpips = LearnedPerceptualImagePatchSimilarity()
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchmetrics/image/lpip.py", line 125, in __init__
    self.net = NoTrainLpips(net=net_type, verbose=False)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/lpips/lpips.py", line 84, in __init__
    self.net = net_type(pretrained=not self.pnet_rand, requires_grad=self.pnet_tune)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/lpips/pretrained_networks.py", line 59, in __init__
    alexnet_pretrained_features = tv.alexnet(pretrained=pretrained).features
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchvision/models/_utils.py", line 142, in wrapper
    return fn(*args, **kwargs)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchvision/models/_utils.py", line 228, in inner_wrapper
    return builder(*args, **kwargs)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchvision/models/alexnet.py", line 114, in alexnet
    model.load_state_dict(weights.get_state_dict(progress=progress))
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torchvision/models/_api.py", line 63, in get_state_dict
    return load_state_dict_from_url(self.url, progress=progress)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/hub.py", line 731, in load_state_dict_from_url
    return torch.load(cached_file, map_location=map_location)
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/serialization.py", line 705, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/home/mitor/anaconda3/envs/nerfstudio/lib/python3.8/site-packages/torch/serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

tancik commented 1 year ago

It looks like the LPIPS model wasn't downloaded completely. We use the torchmetrics LPIPS implementation, which depends on the original lpips package. Maybe try pip uninstall lpips, followed by pip install lpips.
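
For context on the error message: checkpoints saved by recent PyTorch versions are zip archives, and a zip archive keeps its central directory at the end of the file, so a download that stops early loses it. The sketch below is only an illustration using Python's standard zipfile module, not PyTorch's own reader; the in-memory archive and the can_open helper are hypothetical stand-ins for a checkpoint file and its loader.

```python
import io
import zipfile

# Build a small, valid zip archive in memory (a stand-in for a checkpoint file).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data.bin", b"\x00" * 4096)
complete = buffer.getvalue()

# Simulate an interrupted download by keeping only the first half of the bytes.
truncated = complete[: len(complete) // 2]


def can_open(data: bytes) -> bool:
    """Return True if the bytes parse as a zip archive with a central directory."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            archive.namelist()
        return True
    except zipfile.BadZipFile:
        return False


print(can_open(complete))   # True: the central directory is present
print(can_open(truncated))  # False: the end of the archive is missing
```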

MoZhenJ commented 1 year ago

Thanks, but that doesn't solve the problem. When I first tried nerfstudio by running "ns-train nerfacto --data data/nerfstudio/poster", it began downloading the .ckpt weight file, and the download did not complete. I think this is what causes the problem. How can I download the file again?

MoZhenJ commented 1 year ago

Thanks, I solved the problem. The torch model hadn't downloaded completely. Delete ~/.cache/torch/hub/checkpoints/alexnet-owt-7be5be79.pth so that the weights are downloaded again, then rerun "ns-train nerfacto --data data/nerfstudio/poster".
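
If it is unclear which cached file is incomplete, a small script can look for candidates. This is a minimal sketch, assuming the default cache location mentioned above and assuming that an incomplete download still begins with the zip signature but has no readable central directory. The function names are hypothetical helpers, not part of nerfstudio or PyTorch, and the script only reports files unless dry_run is set to False.

```python
import zipfile
from pathlib import Path
from typing import List

ZIP_MAGIC = b"PK\x03\x04"  # signature found at the start of a zip archive


def is_truncated_zip(path: Path) -> bool:
    """True if the file starts like a zip archive but has no readable central directory."""
    with path.open("rb") as f:
        starts_like_zip = f.read(4) == ZIP_MAGIC
    return starts_like_zip and not zipfile.is_zipfile(path)


def remove_truncated_checkpoints(cache_dir: Path, dry_run: bool = True) -> List[Path]:
    """Find (and, unless dry_run is set, delete) incomplete .pth files in cache_dir."""
    found = []
    for path in sorted(cache_dir.glob("*.pth")):
        if is_truncated_zip(path):
            if not dry_run:
                path.unlink()  # the weights are fetched again on the next run
            found.append(path)
    return found


if __name__ == "__main__":
    # Assumed default location; it differs if TORCH_HOME is set.
    cache = Path.home() / ".cache" / "torch" / "hub" / "checkpoints"
    for checkpoint in remove_truncated_checkpoints(cache, dry_run=True):
        print("incomplete download:", checkpoint)
```

Files that do not start with the zip signature are left alone, since older checkpoints may use a different on-disk format.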