River-Zhang / SIFU

[CVPR 2024 Highlight] Official repository for paper "SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction"
https://river-zhang.github.io/SIFU-projectpage/
MIT License

About train #31

Open Zhangpei226 opened 4 months ago

Zhangpei226 commented 4 months ago

```
(sifu3) pp@ys:~/SIFU$ python -m apps.train -cfg ./configs/train/sifu.yaml
load from ./data/cape/train.txt    total: 152
load from ./data/cape/val.txt    total: 36
ICON:
w/ Global Image Encoder: True
Image Features used by MLP: ['normal_F', 'normal_B']
Geometry Features used by MLP: ['sdf', 'cmap', 'norm', 'vis', 'sample_id']
Dim of Image Features (local): 6
Dim of Geometry Features (ICON): 7
Dim of MLP's first layer: 78

GPU available: True, used: True
TPU available: None, using: 0 TPU cores
Resume MLP weights from ./data/ckpt/sifu.ckpt
Resume normal model from ./data/ckpt/normal.ckpt
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name        | Type          | Params
-----------------------------------------
0 | netG        | HGPIFuNet     | 413 M
1 | reconEngine | Seg3dLossless | 0
-----------------------------------------
411 M     Trainable params
1.3 M     Non-trainable params
413 M     Total params
1,652.498 Total estimated model params size (MB)

Validation sanity check:   0%|          | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/pp/SIFU/apps/train.py", line 154, in <module>
    trainer.fit(model=model, datamodule=datamodule)
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 499, in fit
    self.dispatch()
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 546, in dispatch
    self.accelerator.start_training(self)
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 73, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 114, in start_training
    self._results = trainer.run_train()
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 607, in run_train
    self.run_sanity_check(self.lightning_module)
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 856, in run_sanity_check
    _, eval_results = self.run_evaluation(max_batches=self.num_sanity_val_batches)
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 712, in run_evaluation
    for batch_idx, batch in enumerate(dataloader):
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1333, in _next_data
    return self._process_data(data)
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1359, in _process_data
    data.reraise()
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/_utils.py", line 543, in reraise
    raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/pp/miniconda3/envs/sifu3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/pp/SIFU/lib/dataset/PIFuDataset.py", line 217, in __getitem__
    subject = self.subject_list[mid].split("/")[1]
IndexError: list index out of range
```
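The worker error comes from the last frame, `subject = self.subject_list[mid].split("/")[1]` in PIFuDataset.py. A minimal sketch of the failure mode, assuming a split-file entry that lacks the `dataset/subject` separator (the subject name below is made up for illustration):

```python
# Illustration only: why split("/")[1] raises IndexError for a malformed entry.
entry_ok = "cape/00032-shortlong-hips"  # hypothetical well-formed entry
entry_bad = "00032-shortlong-hips"      # same entry without the dataset prefix

print(entry_ok.split("/"))   # ['cape', '00032-shortlong-hips'] -> index 1 is the subject
print(entry_bad.split("/"))  # ['00032-shortlong-hips'] -> index 1 does not exist

try:
    entry_bad.split("/")[1]
except IndexError as e:
    print("IndexError:", e)  # "list index out of range", as in the traceback above
```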

Zhangpei226 commented 4 months ago

I found that my train.txt has only 150 entries and val.txt has only one entry, but the log reports 152 and 36. Could this be the problem?

zjh21 commented 1 month ago

I ran into a similar problem before. I think the issue is your ./data/cape/train.txt and ./data/cape/val.txt files: each line should look like "cape/xxxx", not "xxxx" alone. Otherwise the entry is just "xxxx", and the dataset and subject names cannot be split on "/" as line 217 of PIFuDataset.py expects.
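A minimal sketch of a check/fix along those lines, assuming the split files live at ./data/cape/train.txt and ./data/cape/val.txt and that the expected prefix is "cape/" (paths and prefix are taken from this thread, not verified against the repo's data-preparation scripts):

```python
# Sketch: prepend the dataset prefix to any split-file entry that lacks it,
# so each line can be parsed as "dataset/subject" by PIFuDataset.
from pathlib import Path

def fix_split_file(path, prefix="cape"):
    lines = [l.strip() for l in Path(path).read_text().splitlines() if l.strip()]
    missing = sum("/" not in l for l in lines)
    fixed = [l if "/" in l else f"{prefix}/{l}" for l in lines]
    Path(path).write_text("\n".join(fixed) + "\n")
    print(f"{path}: {len(lines)} entries, {missing} were missing the '{prefix}/' prefix")

for split in ("./data/cape/train.txt", "./data/cape/val.txt"):
    fix_split_file(split)
```

Running it also prints the entry count per file, which may help explain the 152/36 vs. 150/1 mismatch mentioned above.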