Chrixtar / latentsplat

[ECCV 2024] Implementation of latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction
https://geometric-rl.mpi-inf.mpg.de/latentsplat/
MIT License

how can I train locally #2

Closed BroenLin closed 5 months ago

BroenLin commented 5 months ago

I am trying to run the code without an internet connection because I always fail to reach wandb. I already added

    os.environ["WANDB_API_KEY"] = YOUR_KEY_HERE
    os.environ["WANDB_MODE"] = "offline"

to src/main.py, but it did not work: the run still seems to try to reach the internet. My command is:

    python3 -m src.main +experiment=co3d_hydrant mode=test dataset/view_sampler=evaluation dataset.view_sampler.index_path=assets/evaluation_index/co3d_hydrant_extra.json checkpointing.load=checkpoints/co3d_hydrant.ckpt

My error is as follows:

    wandb: Network error (ConnectTimeout), entering retry loop.
    Problem at: /home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/pytorch_lightning/loggers/wandb.py 400 experiment
    Error executing job with overrides: ['+experiment=co3d_hydrant', 'mode=test', 'dataset/view_sampler=evaluation', 'dataset.view_sampler.index_path=assets/evaluation_index/co3d_hydrant_extra.json', 'checkpointing.load=checkpoints/co3d_hydrant_kl.ckpt']
    wandb: ERROR Run initialization has timed out after 90.0 sec.
    wandb: ERROR Please refer to the documentation for additional information: https://docs.wandb.ai/guides/track/tracking-faq#initstarterror-error-communicating-with-wandb-process-
    Traceback (most recent call last):
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/runpy.py", line 196, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/runpy.py", line 86, in _run_code
        exec(code, run_globals)
      File "/hdd2/llq202312/latentsplat-main/src/main.py", line 159, in <module>
        train()
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/main.py", line 94, in decorated_main
        _run_hydra(
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
        _run_app(
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/_internal/utils.py", line 457, in _run_app
        run_and_report(
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/_internal/utils.py", line 223, in run_and_report
        raise ex
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
        return func()
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
        lambda: hydra.run(
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/_internal/hydra.py", line 132, in run
        _ = ret.return_value
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/core/utils.py", line 260, in return_value
        raise self._return_value
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/hydra/core/utils.py", line 186, in run_job
        ret.return_value = task_function(task_cfg)
      File "/hdd2/llq202312/latentsplat-main/src/main.py", line 153, in train
        trainer.test(model_wrapper, datamodule=data_module, ckpt_path=checkpoint_path)
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 753, in test
        return call._call_and_handle_interrupt(
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
        return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 105, in launch
        return function(*args, **kwargs)
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 793, in _test_impl
        results = self._run(model, ckpt_path=ckpt_path)
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 948, in _run
        call._call_setup_hook(self)  # allow user to set up LightningModule in accelerator environment
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 86, in _call_setup_hook
        if hasattr(logger, "experiment"):
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/lightning_fabric/loggers/logger.py", line 118, in experiment
        return fn(self)
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/pytorch_lightning/loggers/wandb.py", line 400, in experiment
        self._experiment = wandb.init(**self._wandb_init)
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1195, in init
        raise e
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1176, in init
        run = wi.init()
      File "/home/llq/.conda/envs/latentsplat/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 785, in init
        raise error
    wandb.errors.CommError: Run initialization has timed out after 90.0 sec. Please refer to the documentation for additional information: https://docs.wandb.ai/guides/track/tracking-faq#initstarterror-error-communicating-with-wandb-process-

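For reference, the environment-variable approach described in this comment might be sketched as below. This is a minimal illustration, not the repository's code: the helper name `force_wandb_offline` is hypothetical, and the sketch assumes the variable is set at the very top of the entry point, before any logger is constructed. `WANDB_MODE=offline` is a documented wandb setting, but the thread does not establish why it had no effect in this run. Note that values assigned to `os.environ` must be strings, so an unquoted placeholder such as `YOUR_KEY_HERE` would raise a `NameError`.

```python
import os
from typing import MutableMapping


def force_wandb_offline(
    env: MutableMapping[str, str] = os.environ,
) -> MutableMapping[str, str]:
    """Hypothetical helper: request wandb offline mode via the environment."""
    # Environment values must be strings, hence the quotes around "offline".
    env["WANDB_MODE"] = "offline"
    return env


# Assumption: this runs before anything creates a wandb run or logger.
force_wandb_offline()
print(os.environ["WANDB_MODE"])  # offline
```
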
Chrixtar commented 5 months ago

Hi BroenLin,

thank you very much for your interest in our work. Could you try setting

    wandb:
      ...
      activated: false

in config/main.yaml?

Please let me know whether this resolves the problem.

Best Chris
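
To illustrate why a flag like `wandb.activated` can avoid the timeout, here is a rough sketch of how such a flag might gate logger creation. It is an assumption about the general pattern, not the repository's actual code: `WandbCfg`, its `project` field, and `build_logger` are hypothetical names. `WandbLogger` and `Trainer(logger=False)` are existing PyTorch Lightning APIs; the import is deferred so that the deactivated path never touches wandb.

```python
from dataclasses import dataclass


@dataclass
class WandbCfg:
    # `activated` mirrors the key named in the reply above.
    activated: bool = True
    # Hypothetical field, included only for illustration.
    project: str = "my_project"


def build_logger(cfg: WandbCfg):
    """Return a value suitable for pytorch_lightning.Trainer(logger=...)."""
    if not cfg.activated:
        # No WandbLogger is created, so wandb.init() is never reached and
        # no network connection is attempted. Trainer(logger=False) disables logging.
        return False
    # Imported lazily so the deactivated path has no wandb dependency.
    from pytorch_lightning.loggers import WandbLogger

    return WandbLogger(project=cfg.project)


logger = build_logger(WandbCfg(activated=False))
print(logger)  # False
```

Since the command in this thread already uses Hydra-style overrides, the same key could presumably also be set on the command line as `wandb.activated=false`, assuming the key is defined in config/main.yaml as described.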

BroenLin commented 5 months ago

Thanks a lot, Chrixtar! It worked.