shankar-anantak opened this issue 5 months ago · Open
This issue arises from the validation settings. You can resolve it by deleting the `val:` key and its related properties in the `config/example/test/train.yml` file.
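For clarity, a hypothetical sketch of what `train.yml` looks like once the `val:` block is removed (key names are taken from the config quoted later in this thread; the exact layout of your file may differ):

```yml
# config/example/test/train.yml (sketch)
# The val: key and everything nested under it has been deleted;
# the train: section stays as-is.
train:
  dataset: $dataset
  render: $RGB_RENDER_L1_SSIM
  stages: $NAIVE_STAGE
```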
Thank you for the response. I followed your recommendation and deleted `val:` and its properties from the `train.yml` config:
```yml
parents:
  - config/example/test/dataset.yml
  - config/example/test/level_of_gaussian.yml
  - config/example/test/stage_8_4.yml
exp: output/example/test/log
gpus: [0]
log_interval: 1000
save_interval: 10_000
max_steps: 750
RGB_RENDER_L1_SSIM:
  module: LoG.render.renderer.NaiveRendererAndLoss
  args:
    use_origin_render: False
    use_randback: True
train:
  dataset: $dataset
  render: $RGB_RENDER_L1_SSIM
  stages: $NAIVE_STAGE
  init:
    method: scale_min
    dataset_state:
      scale: 4
```
However, I still get the same error:
```
(LoG) dev@instance-20240430-202938:~/dev/LoG$ python3 apps/train.py --cfg config/example/test/train.yml split train
[Config] merge from parent file: config/example/test/dataset.yml
[Config] merge from parent file: config/example/test/level_of_gaussian.yml
[Config] merge from parent file: config/example/test/stage_8_4.yml
Key is not in the template: split
[Config] replace key $root
[Config] replace key $scale3d
[Config] replace key $PLYNAME
[Config] replace key $PLYNAME
[Config] replace key $PLYNAME
[Config] replace key $xyz_scale
[Config] replace key $scale3d
[Config] replace key $PLYNAME
[Config] replace key $xyz_scale
[Config] replace key $max_steps
[Config] replace key $dataset
[Config] replace key $RGB_RENDER_L1_SSIM
[Config] replace key $NAIVE_STAGE
Using GPUs: 0
Write to output/example/test/log
[/home/dev/dev/data/2023-12-18_15.06.36/sparse/0/sparse.npz] mean: -0.276, 0.500, -1.018
[/home/dev/dev/data/2023-12-18_15.06.36/sparse/0/sparse.npz] std: 1.480, 1.085, 3.215
[/home/dev/dev/data/2023-12-18_15.06.36/sparse/0/sparse.npz] sigma=1 49864/125443
bounds: [[-1.756, -0.584, -4.233], [1.204, 1.585, 2.197]]
[/home/dev/dev/data/2023-12-18_15.06.36/sparse/0/sparse.npz] sigma=2 114402/125443
bounds: [[-3.236, -1.669, -7.449], [2.683, 2.670, 5.412]]
[/home/dev/dev/data/2023-12-18_15.06.36/sparse/0/sparse.npz] sigma=3 122545/125443
bounds: [[-4.715, -2.754, -10.664], [4.163, 3.755, 8.627]]
[/home/dev/dev/data/2023-12-18_15.06.36/sparse/0/sparse.npz] z_min: -53.713, z_max: 23.659
[Load PLY] load from ply: /home/dev/dev/data/2023-12-18_15.06.36/sparse/0/sparse.npz
[Load PLY] min: [-10.02181557 -10.31714249 -53.71318683], max: [18.95321076 24.50246681 23.65868778]
[Load PLY] scale: 0.0003, 22.6722, mean = 0.0294
[GaussianPoint] scales: [0.0003~0.0294~22.6722]
[GaussianPoint] -> scales: [0.0074~0.0255~0.1176]
>>> Code 3062 files has been copied to output/example/test/log/code_backup_20240503-161612
[ImageDataset] set scales: [1, 2, 4, 8], crop size: [-1, -1]
[ImageDataset] cache dir: /home/dev/dev/data/2023-12-18_15.06.36/cache
Traceback (most recent call last):
  File "/home/dev/dev/LoG/apps/train.py", line 157, in <module>
    main()
  File "/home/dev/dev/LoG/apps/train.py", line 130, in main
    dataset = load_object(cfg.train.dataset.module, cfg.train.dataset.args)
  File "/home/dev/dev/LoG/LoG/utils/config.py", line 61, in load_object
    obj = getattr(module, name)(**extra_args, **module_args)
  File "/home/dev/dev/LoG/LoG/dataset/colmap.py", line 159, in __init__
    centers = np.stack([-info['camera']['R'].T @ info['camera']['T'] for info in infos], axis=0)
  File "/home/dev/miniconda3/envs/LoG/lib/python3.10/site-packages/numpy/core/shape_base.py", line 445, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack
```
It seems that the dataset can't find any camera parameters. Can you show your folder structure?
Sure,
```
(LoG) dev@instance-20240430-202938:~/dev/LoG$ ls
LoG  LoG.egg-info  README.md  apps  assets  config  docs  output  requirements.txt  setup.py  submodules
(LoG) dev@instance-20240430-202938:~/dev/LoG$ ls ../data/2023-12-18_15.06.36/sparse/0/
cameras.bin  extri.yml  images.bin  intri.yml  points3D.bin  project.ini  sparse.npz  sparse.ply
(LoG) dev@instance-20240430-202938:~/dev/LoG$
```
My LoG root: `/home/dev/dev/LoG`
My COLMAP output: `/home/dev/dev/data/2023-12-18_15.06.36/sparse/0`

Paths in the dataset.yml:

```yml
root: /home/dev/dev/data/2023-12-18_15.06.36
PLYNAME: /home/dev/dev/data/2023-12-18_15.06.36/sparse/0/sparse.npz
```
Your help is greatly appreciated
You can try removing `data/2023-12-18_15.06.36/cache` and `data/2023-12-18_15.06.36/cache.pkl`, then retry.
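For reference, a minimal Python sketch of clearing the cache; the `clear_dataset_cache` helper is hypothetical (not part of the LoG codebase), and the path is the one used in this thread:

```python
import os
import shutil

def clear_dataset_cache(root):
    """Remove the dataset cache directory and cache pickle, if present."""
    cache_dir = os.path.join(root, "cache")
    cache_pkl = os.path.join(root, "cache.pkl")
    if os.path.isdir(cache_dir):
        shutil.rmtree(cache_dir)   # delete the per-image cache folder
    if os.path.isfile(cache_pkl):
        os.remove(cache_pkl)       # delete the pickled camera/image index

clear_dataset_cache("/home/dev/dev/data/2023-12-18_15.06.36")
```

Equivalently, `rm -rf data/2023-12-18_15.06.36/cache data/2023-12-18_15.06.36/cache.pkl` from the shell.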
After removing the cache, here is the output:
```
python3 apps/train.py --cfg config/example/test/train.yml split train
[Config] merge from parent file: config/example/test/dataset.yml
[Config] merge from parent file: config/example/test/level_of_gaussian.yml
[Config] merge from parent file: config/example/test/stage_8_4.yml
Key is not in the template: split
[Config] replace key $root
[Config] replace key $scale3d
[Config] replace key $PLYNAME
[Config] replace key $PLYNAME
[Config] replace key $PLYNAME
[Config] replace key $xyz_scale
[Config] replace key $scale3d
[Config] replace key $PLYNAME
[Config] replace key $xyz_scale
[Config] replace key $max_steps
[Config] replace key $dataset
[Config] replace key $RGB_RENDER_L1_SSIM
[Config] replace key $NAIVE_STAGE
Using GPUs: 0
Write to output/example/test/log
[/home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0/sparse.npz] mean: -0.276, 0.500, -1.018
[/home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0/sparse.npz] std: 1.480, 1.085, 3.215
[/home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0/sparse.npz] sigma=1 49864/125443
bounds: [[-1.756, -0.584, -4.233], [1.204, 1.585, 2.197]]
[/home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0/sparse.npz] sigma=2 114402/125443
bounds: [[-3.236, -1.669, -7.449], [2.683, 2.670, 5.412]]
[/home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0/sparse.npz] sigma=3 122545/125443
bounds: [[-4.715, -2.754, -10.664], [4.163, 3.755, 8.627]]
[/home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0/sparse.npz] z_min: -53.713, z_max: 23.659
[Load PLY] load from ply: /home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0/sparse.npz
[Load PLY] min: [-10.02181557 -10.31714249 -53.71318683], max: [18.95321076 24.50246681 23.65868778]
[Load PLY] scale: 0.0003, 22.6722, mean = 0.0294
[GaussianPoint] scales: [0.0003~0.0294~22.6722]
[GaussianPoint] -> scales: [0.0074~0.0255~0.1176]
>>> Code 3062 files has been copied to output/example/test/log/code_backup_20240503-165112
[ImageDataset] set scales: [1, 2, 4, 8], crop size: [-1, -1]
[ImageDataset] cache dir: /home/dev/dev/LoG/data/2023-12-18_15.06.36/cache
Loaded 677 cameras from /home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0
scale3d = 1.0
[ImageDataset] init camera out-1
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-1.JPG
[ImageDataset] init camera out-10
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-10.JPG
[ImageDataset] init camera out-100
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-100.JPG
[ImageDataset] init camera out-101
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-101.JPG
[ImageDataset] init camera out-102
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-102.JPG
[ImageDataset] init camera out-103
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-103.JPG
[ImageDataset] init camera out-104
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-104.JPG
[ImageDataset] init camera out-105
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-105.JPG
[ImageDataset] init camera out-106
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-106.JPG
[ImageDataset] init camera out-107
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-107.JPG
[ImageDataset] init camera out-108
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-108.JPG
[ImageDataset] init camera out-109
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-109.JPG
[ImageDataset] init camera out-11
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-11.JPG
...
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-228.JPG
[ImageDataset] init camera out-229
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-229.JPG
[ImageDataset] init camera out-23
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-23.JPG
[ImageDataset] init camera out-230
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-230.JPG
[ImageDataset] init camera out-231
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-231.JPG
[ImageDataset] init camera out-232
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-232.JPG
[ImageDataset] init camera out-233
Not exists: /home/dev/dev/LoG/data/2023-12-18_15.06.36/images/out-233.JPG
[ImageDataset] init camera out-234
^CTraceback (most recent call last):
  File "/home/dev/dev/LoG/apps/train.py", line 157, in <module>
    main()
  File "/home/dev/dev/LoG/apps/train.py", line 130, in main
    dataset = load_object(cfg.train.dataset.module, cfg.train.dataset.args)
  File "/home/dev/dev/LoG/LoG/utils/config.py", line 61, in load_object
    obj = getattr(module, name)(**extra_args, **module_args)
  File "/home/dev/dev/LoG/LoG/dataset/colmap.py", line 135, in __init__
    camera = self.check_undis_camera(camname, cameras_cache, camera_dis, share_camera)
  File "/home/dev/dev/LoG/LoG/dataset/colmap.py", line 88, in check_undis_camera
    cameras_cache[cache_camname] = self.init_camera(camera_undis)
  File "/home/dev/dev/LoG/LoG/dataset/colmap.py", line 74, in init_camera
    mapx, mapy = cv2.initUndistortRectifyMap(camera['K'], camera['dist'], None, newK, (width, height), 5)
KeyboardInterrupt
```
I'm unsure how the camera init is failing here:

```
[Load PLY] load from ply: /home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0/sparse.npz
[Load PLY] min: [-10.02181557 -10.31714249 -53.71318683], max: [18.95321076 24.50246681 23.65868778]
[Load PLY] scale: 0.0003, 22.6722, mean = 0.0294
[GaussianPoint] scales: [0.0003~0.0294~22.6722]
[GaussianPoint] -> scales: [0.0074~0.0255~0.1176]
Code 3062 files has been copied to output/example/test/log/code_backup_20240503-165112
[ImageDataset] set scales: [1, 2, 4, 8], crop size: [-1, -1]
[ImageDataset] cache dir: /home/dev/dev/LoG/data/2023-12-18_15.06.36/cache
Loaded 677 cameras from /home/dev/dev/LoG/data/2023-12-18_15.06.36/sparse/0
```
It appears we are able to find the other paths.
I confirmed all images are in `/home/dev/dev/LoG/data/2023-12-18_15.06.36/images/`:
```
(LoG) dev@instance-20240430-202938:~/dev/LoG$ ls data/2023-12-18_15.06.36/images
out-1.jpg    out-140.jpg  out-182.jpg  out-223.jpg  out-265.jpg  out-306.jpg  out-348.jpg  out-39.jpg   out-430.jpg  out-472.jpg  out-513.jpg  out-555.jpg  out-597.jpg  out-638.jpg  out-68.jpg
out-10.jpg   out-141.jpg  out-183.jpg  out-224.jpg  out-266.jpg  out-307.jpg  out-349.jpg  out-390.jpg  out-431.jpg  out-473.jpg  out-514.jpg  out-556.jpg  out-598.jpg  out-639.jpg  out-680.jpg
out-100.jpg  out-142.jpg  out-184.jpg  out-225.jpg  out-267.jpg  out-308.jpg  out-35.jpg   out-391.jpg  out-432.jpg  out-474.jpg  out-515.jpg  out-557.jpg  out-599.jpg  out-64.jpg   out-681.jpg
out-101.jpg  out-143.jpg  out-185.jpg  out-226.jpg  out-268.jpg  out-309.jpg  out-350.jpg  out-392.jpg  out-433.jpg  out-475.jpg  out-516.jpg  out-558.jpg  out-6.jpg    out-640.jpg  out-69.jpg
out-102.jpg  out-144.jpg  out-186.jpg  out-227.jpg  out-269.jpg  out-31.jpg   out-351.jpg  out-393.jpg  out-434.jpg  out-476.jpg  out-517.jpg  out-559.jpg  out-60.jpg   out-641.jpg  out-7.jpg
```
Again, your assistance is greatly appreciated!
Hello, your image extension is `.jpg` instead of `.JPG`. You should modify this in `dataset.yml`.
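A quick way to check which extensions the image files actually use, so the config can be made to match (`image_extensions` is a hypothetical helper, not part of LoG):

```python
import os
from collections import Counter

def image_extensions(image_dir):
    """Count the file extensions present in a directory (case-sensitive)."""
    return dict(Counter(os.path.splitext(name)[1] for name in os.listdir(image_dir)))

# e.g. image_extensions("data/2023-12-18_15.06.36/images") might return
# something like {'.jpg': 677} -- so the extension configured in dataset.yml
# should be .jpg (the exact key name depends on your config).
```

Note that the count is case-sensitive on Linux filesystems, which is exactly why `.JPG` in the config missed the `.jpg` files on disk.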
Thank you for helping me resolve that silly mistake. Unfortunately, I have a new issue:
```
write cache to /home/dev/dev/LoG/data/2023-12-18_15.06.36/cache.pkl
[ImageDataset] offset: [ 0.00241288  0.02408803 -0.06257018], radius: 6.018952590965555
[ImageDataset] init dataset with 677 images
[ImageDataset] set scale 4, crop_size: [-1, -1], downsample_scale: 1
initialize the model: 100%|██████████████████████████████████████████████████████████████████████████████| 677/677 [00:01<00:00, 557.18it/s]
[LoG] minimum scales: [0.0004~0.0027~0.0818]
Traceback (most recent call last):
  File "/home/dev/dev/LoG/apps/train.py", line 157, in <module>
    main()
  File "/home/dev/dev/LoG/apps/train.py", line 139, in main
    trainer.init(dataset)
  File "/home/dev/dev/LoG/LoG/utils/trainer.py", line 177, in init
    self.model.at_init_final()
  File "/home/dev/dev/LoG/LoG/model/level_of_gaussian.py", line 326, in at_init_final
    self.counter.radius3d_max.fill_(self.gaussian.xyz_scale * 0.2)
TypeError: can't multiply sequence by non-int of type 'float'
```
Again, your assistance is truly appreciated
You can check `self.gaussian.xyz_scale` at this line.
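For context, `TypeError: can't multiply sequence by non-int of type 'float'` means `xyz_scale` ended up as a Python sequence (e.g. a list parsed from YAML) rather than a number, so `xyz_scale * 0.2` fails. A minimal illustration; the names mirror the traceback, but the coercion shown is a sketch, not the official fix:

```python
# A list multiplied by a float raises TypeError; a scalar works.
xyz_scale = [1.0]          # e.g. mis-parsed from the config as a one-element list
try:
    xyz_scale * 0.2        # list * float -> TypeError
except TypeError as e:
    print(e)

# One possible workaround: coerce to a scalar before multiplying.
scale = float(xyz_scale[0] if isinstance(xyz_scale, (list, tuple)) else xyz_scale)
print(scale * 0.2)         # 0.2
```

In practice this usually points at a config value: check how `xyz_scale` is written in your YAML (a bare number vs. a list).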
@chingswy I use a dataset from the internet: https://pan.baidu.com/s/1PMnA-ibSCqNEmvuCb05cPA (extraction code: 2ttz, shared via Baidu Netdisk). I changed the dataset dir, deleted the `val` args in train.yml, and deleted the pkl file and cache folder before running. Then I get a training error:
```
[ImageDataset] undistort and scale 107 images
100%|████████████████████████████████████████████████████████████████████████████████| 107/107 [00:35<00:00,  2.99it/s]
write cache to D:\GS_Pro\LoG\data\boli\cache.pkl
[ImageDataset] offset: [3.07092454e-02 2.35425679e-05 2.92656670e-02], radius: 5.4546816443016715
[ImageDataset] init dataset with 107 images
Base iteration: 200
[ImageDataset] set scale 1, crop_size: [-1, -1], downsample_scale: 1
initialize the model: 100%|██████████████████████████████████████████████████████████| 107/107 [00:17<00:00,  6.20it/s]
[LoG] minimum scales: [0.0002~0.0006~0.0044]
[Corrector] init view correction: 107
[ImageDataset] set partial indices 107
quick view:  10%|███████  | 11/107 [00:06<00:59,  1.63it/s]
[ImageDataset] set partial indices 107
> Run stage: init. 30000 iterations
[ImageDataset] set scale 8, crop_size: [-1, -1], downsample_scale: 1
[SparseOptimizer] xyz_scale: 1.0, steps: 150000, lr 0.00016->1.6e-06
[SparseOptimizer] scaling: 0.005 -> 0.005
[LoG] optimizer setup: max steps = 150000
[Counter] reset counter -> 96007
[Corrector] view correction optimizer setup 0.001
Traceback (most recent call last):
  File "D:\GS_Pro\LoG\apps\train.py", line 169, in <module>
    main()
  File "D:\GS_Pro\LoG\apps\train.py", line 166, in main
    trainer.fit(dataset)
  File "d:\gs_pro\log\LoG\utils\trainer.py", line 487, in fit
    for iteration, data in enumerate(trainloader):
  File "G:\miniconda3\envs\log\lib\site-packages\torch\utils\data\dataloader.py", line 439, in __iter__
    return self._get_iterator()
  File "G:\miniconda3\envs\log\lib\site-packages\torch\utils\data\dataloader.py", line 387, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "G:\miniconda3\envs\log\lib\site-packages\torch\utils\data\dataloader.py", line 1040, in __init__
    w.start()
  File "G:\miniconda3\envs\log\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "G:\miniconda3\envs\log\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "G:\miniconda3\envs\log\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "G:\miniconda3\envs\log\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "G:\miniconda3\envs\log\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'Trainer.train_loader.<locals>.worker_init_fn'
(log) D:\GS_Pro\LoG>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "G:\miniconda3\envs\log\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "G:\miniconda3\envs\log\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
```
It looks like the dataset is loading incorrectly.
Hello, I am able to load your data and train normally on my computer, but I only have a Linux testing environment. Your issue might be related to Windows or a specific problem with PyTorch.
There are several places in the code that specify the number of CPU workers (`num_workers`). If you set all of them to 0 (so data loading runs in the main process and no workers are spawned), it should work on Windows.
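The root cause is that Windows uses the `spawn` start method, which must pickle the DataLoader's `worker_init_fn` to send it to each worker process; a function defined inside another function (like `Trainer.train_loader.<locals>.worker_init_fn` in the traceback) cannot be pickled. A stdlib-only illustration, no PyTorch required (`module_level_init` and `make_closure` are made-up names for the demo):

```python
import pickle

def module_level_init(worker_id):
    """Defined at module scope, so pickle can reference it by name."""
    return worker_id

def make_closure():
    def worker_init_fn(worker_id):  # local object -> unpicklable
        return worker_id
    return worker_init_fn

pickle.dumps(module_level_init)     # works fine
try:
    pickle.dumps(make_closure())
except AttributeError as e:
    print(e)                        # Can't pickle local object 'make_closure.<locals>.worker_init_fn'
```

So the two usual fixes are: move `worker_init_fn` to module scope (picklable by name), or set `num_workers=0` so no worker process needs to be spawned at all.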
Hello,
I am trying to train the model on my custom data, captured with a phone camera. Pre-processing with COLMAP went fine; however, training fails with the error above.
I modified the dataset path in config/example/test/dataset.yml to match my input data path.
Are there further steps required for custom data? Your advice would be appreciated.
Thanks