Closed lileaLab closed 4 months ago
In theory, our code should work on a Windows environment with PyTorch, provided that all dependencies are properly installed. The error you encountered seems to be due to a missing val_dataset. I have fixed the issue in the dataset, and you might need to download the dataset again. Please try this and let me know if you encounter any further issues!
I used the "feicuiwan_sample_folder_full_images.zip" one and the process went fine!
But this time I got an "AttributeError" and an "EOFError: Ran out of input".
[Counter] reset counter -> 157773
[Corrector] view correction optimizer setup 0.001
val 1: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:01<00:00, 5.25it/s]
>>> Validation: 1: 7 images
- l1: 0.1753
- psnr: 13.2273
- lpips: 0.9336
Gaussian 157773 points
radius [0.0035~0.0133~0.0556]
opacity: 0.10, 0 < 0.05, 0 < 0.1,
Traceback (most recent call last):
File "E:\Git\lileaLab\GaussianSplatting_LoG\LoG\apps\train.py", line 157, in <module>
main()
File "E:\Git\lileaLab\GaussianSplatting_LoG\LoG\apps\train.py", line 140, in main
trainer.fit(dataset)
File "e:\git\lilealab\gaussiansplatting_log\log\LoG\utils\trainer.py", line 486, in fit
for iteration, data in enumerate(trainloader):
File "C:\Users\ryo\anaconda3\envs\LoG\lib\site-packages\torch\utils\data\dataloader.py", line 441, in __iter__
return self._get_iterator()
File "C:\Users\ryo\anaconda3\envs\LoG\lib\site-packages\torch\utils\data\dataloader.py", line 388, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\ryo\anaconda3\envs\LoG\lib\site-packages\torch\utils\data\dataloader.py", line 1042, in __init__
w.start()
File "C:\Users\ryo\anaconda3\envs\LoG\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\ryo\anaconda3\envs\LoG\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\ryo\anaconda3\envs\LoG\lib\multiprocessing\context.py", line 336, in _Popen
return Popen(process_obj)
File "C:\Users\ryo\anaconda3\envs\LoG\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\ryo\anaconda3\envs\LoG\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'Trainer.train_loader.<locals>.worker_init_fn'
(LoG) E:\Git\lileaLab\GaussianSplatting_LoG\LoG>Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\ryo\anaconda3\envs\LoG\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\ryo\anaconda3\envs\LoG\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
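The AttributeError above ("Can't pickle local object") is Windows-specific: with the spawn start method, the DataLoader's worker_init_fn must be pickled to be sent to the worker processes, and a function defined inside another function cannot be. A minimal sketch of the same failure, independent of PyTorch (names here are illustrative):

```python
import pickle

def make_loader_init():
    # Locally defined function, analogous to
    # Trainer.train_loader.<locals>.worker_init_fn in the traceback.
    def worker_init_fn(worker_id):
        return worker_id
    return worker_init_fn

fn = make_loader_init()
try:
    pickle.dumps(fn)
    picklable = True
except (pickle.PicklingError, AttributeError):
    # CPython raises AttributeError: Can't pickle local object ...
    picklable = False
```

The subsequent "EOFError: Ran out of input" in the spawned child is just a consequence: the parent failed to serialize, so the child found nothing to read.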
By the way, before entering the training process, I pre-processed the data with the command python3 apps/test_pointcloud.py --cfg config/example/test/dataset.yml split dataset radius 0.01, but at that point I got the following error saying ffmpeg could not run. Is it possible that the training error is caused by processing stopping here? I assumed it only meant the video could not be generated, so I ignored it for the moment.
(LoG) E:\Git\lileaLab\GaussianSplatting_LoG\LoG>python apps/test_pointcloud.py --cfg config/example/test/dataset.yml split dataset radius 0.01
Key is not in the template: split
Key is not in the template: radius
[Config] replace key $root
[Config] replace key $scale3d
[Config] replace key $root
[Config] replace key $scale3d
[Config] replace key $PLYNAME
[Config] replace key $PLYNAME
[Config] replace key $PLYNAME
[ImageDataset] set scales: [1, 2, 4, 8], crop size: [-1, -1]
[ImageDataset] cache dir: E:\Git\lileaLab\GaussianSplatting_LoG\LoG\data\feicuiwan_sample_folder\cache
[ImageDataset] offset: [0.0284959 0.01992458 0.01470781], radius: 6.153295601864254
[ImageDataset] init dataset with 180 images
dataset: 180
[data/feicuiwan_sample_folder/sparse/0/sparse.npz] mean: -0.022, 2.020, 2.000
[data/feicuiwan_sample_folder/sparse/0/sparse.npz] std: 1.220, 1.035, 0.988
[data/feicuiwan_sample_folder/sparse/0/sparse.npz] sigma=1 60490/157773
bounds: [[-1.243, 0.985, 1.012], [1.198, 3.055, 2.988]]
[data/feicuiwan_sample_folder/sparse/0/sparse.npz] sigma=2 146807/157773
bounds: [[-2.463, -0.050, 0.024], [2.418, 4.090, 3.976]]
[data/feicuiwan_sample_folder/sparse/0/sparse.npz] sigma=3 157746/157773
bounds: [[-3.683, -1.085, -0.963], [3.638, 5.125, 4.964]]
[data/feicuiwan_sample_folder/sparse/0/sparse.npz] z_min: -1.674, z_max: 25.249
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 180/180 [00:05<00:00, 35.81it/s]
/usr/bin/ffmpeg -y -r 30 -i debug/%06d.jpg -vf scale="2*ceil(iw/2):2*ceil(ih/2)" -vcodec libx264 -r 30 debug.mp4 -loglevel quiet
The system cannot find the path specified.
It looks like the second issue is related to the path of ffmpeg. My code specifies /usr/bin/ffmpeg, which is a path typically found on Linux systems. On Windows, you will need to change this to wherever ffmpeg is installed on your machine.
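For illustration, one portable alternative to hardcoding /usr/bin/ffmpeg is to resolve the binary from the PATH with shutil.which. This is only a sketch; build_ffmpeg_cmd is a hypothetical helper, not LoG's actual code, and the arguments mirror the command shown in the log above:

```python
import shutil

def build_ffmpeg_cmd(pattern='debug/%06d.jpg', out='debug.mp4', fps=30):
    # Look up ffmpeg on the PATH; works on both Linux and Windows
    # as long as ffmpeg is installed and on the PATH.
    ffmpeg = shutil.which('ffmpeg') or 'ffmpeg'  # fall back to bare name
    return [ffmpeg, '-y', '-r', str(fps), '-i', pattern,
            '-vf', 'scale=2*ceil(iw/2):2*ceil(ih/2)',
            '-vcodec', 'libx264', '-r', str(fps), out,
            '-loglevel', 'quiet']

cmd = build_ffmpeg_cmd()
# subprocess.run(cmd, check=True)  # run only where ffmpeg is installed
```

Passing the command as a list to subprocess also sidesteps the shell-quoting differences between cmd.exe and POSIX shells.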
Regarding the first issue, it appears to be related to the PyTorch DataLoader's use of multiprocessing during training. On Windows, worker processes are started with the spawn method, which requires everything passed to a worker, including worker_init_fn, to be picklable, and a locally defined function is not. Could you please check your PyTorch and system configuration to confirm that DataLoader multiprocessing works in your Windows environment?
Additionally, you can try adding num_workers 0 to the training command to disable multiprocessing, which might help.
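A longer-term alternative to disabling workers is to make the init function picklable: move it to module level and bind any per-run arguments with functools.partial. This is a sketch under assumed names, not LoG's actual code:

```python
import pickle
from functools import partial

# Module-level function: pickled by reference, so Windows' spawn
# start method can ship it to DataLoader worker processes.
# (Hypothetical signature; the real worker_init_fn may differ.)
def worker_init_fn(worker_id, base_seed=0):
    # e.g. seed per-worker RNGs here
    return base_seed + worker_id

# partial() of a module-level function is also picklable, which is
# how per-run arguments (like a seed) can still be bound to it.
init = partial(worker_init_fn, base_seed=123)
restored = pickle.loads(pickle.dumps(init))
result = restored(2)  # 125
```

The same picklable callable can then be passed as worker_init_fn to torch.utils.data.DataLoader with num_workers > 0.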
I also received similar errors on Windows. As far as I can remember, these are the changes:
For the multiprocessing issue, set num_workers to 0 in this file: LoG\config\example\test\stage_8_4.yml (number of workers).
For ffmpeg, just remove the '/usr/bin/' prefix from the lines starting with cmd = f'/usr/bin/ffmpeg... (assuming ffmpeg is already added to the environment PATH). Do this in both files: LoG\dataset\image_base.py and LoG\render\renderer.py.
I used the full_images dataset and can confirm that it is working fine on Windows. Your results are very impressive and I am looking forward to your real-time rendering tool!
Does it support VR / OpenXR?
Currently, it generates images and an interpolated video. I am not sure whether it will support VR. I am hoping that the authors will soon provide a GUI, possibly implemented in EasyVolcap.
I couldn't find it in the README, so please let me know. Does this code work on Windows? Or does it need to be done on Linux?
When I run
python3 apps/train.py --cfg config/example/test/train.yml split train
I get the following error. I am investigating whether it is caused by the operating system, but I don't have a Linux machine at hand right now, so I wanted to check this prerequisite first.