Zhangjzh opened this issue 1 year ago
Please first visualize the input data (scans, SMPL, images, etc.) to check that they are well aligned; see training.md.
# visualization for SMPL-X mesh
python -m lib.dataloader_demo -v -c ./configs/train/icon-filter.yaml
# visualization for voxelized SMPL
python -m lib.dataloader_demo -v -c ./configs/train/pamir.yaml
Ok, I will have a try. Thank you very much!
I think there may be some problems in the preprocessing scripts. I used them to preprocess the THuman2.0 dataset, but the preprocessed data was not aligned with the SMPL-X mesh. I had to download SMPL+X.zip again to align the SMPL-X mesh with the preprocessed data; I checked, and the SMPL-X mesh had been changed by the preprocessing. Others have hit this problem too, and it leads to training failure. That's so strange! Can you share some suggestions about this? Many thanks!
Hi @Zhangjzh and @Yuhuoo
I have corrected some bugs and updated the scripts for training data generation; see dataset.md.
Please re-download SMPL-X.zip and re-run the data generation scripts:
conda activate icon
python -m scripts.render_batch -headless -out_dir data/
Now data/thuman2/smplx/xxxx.obj
and data/thuman2/scans/xxxx.obj
are aligned perfectly
Thank you very much for your reply and fixes. But I ran into the following problem when using the scripts; I tried many ways to solve it and failed. Have you seen this error?
The complete error message is as follows:
Start Rendering thuman2 with 36 views, 512x512 size.
Output dir: ./debug/thuman2_36views
Rendering types: ['light', 'normal', 'depth']
0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/miniconda3/envs/icon/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/local/miniconda3/envs/icon/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/hy-tmp/ICON/scripts/render_batch.py", line 224, in <module>
for _ in tqdm(
File "/usr/local/miniconda3/envs/icon/lib/python3.8/site-packages/tqdm/std.py", line 1180, in __iter__
for obj in iterable:
File "/usr/local/miniconda3/envs/icon/lib/python3.8/multiprocessing/pool.py", line 868, in next
raise value
multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x7f89f2e9f5b0>'. Reason: 'ValueError('ctypes objects containing pointers cannot be pickled')'
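For context on this failure: `multiprocessing` pickles everything sent between processes, and any ctypes object containing a pointer (for example a GPU/EGL context handle held by the renderer) cannot be pickled, which is exactly the `ValueError` in the traceback above. A minimal standalone reproduction (none of this is ICON code):

```python
import ctypes
import pickle


def can_pickle(obj):
    """Return True if obj survives pickling, False if ctypes refuses."""
    try:
        pickle.dumps(obj)
        return True
    except ValueError:
        return False


# A plain int is fine; a ctypes pointer is not -- multiprocessing hits
# the same ValueError when a worker result carries such a handle.
print(can_pickle(123))                               # True
print(can_pickle(ctypes.pointer(ctypes.c_int(42))))  # False
```

The usual fix is to create GPU/renderer state inside each worker rather than returning it (or anything holding it) through the pool.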
Very strange, this works well for me.
I updated it just now to resolve "numba" warnings, but I don't think it will solve your problem.
Hi, I ran into the same problem as you did. Did you solve it?
You can try this: change
for gpu_ids in range(NUM_GPUS):
    for _ in range(PROC_PER_GPU):
        queue.put(gpu_ids)
to queue.put(0)
My workstation contains two GPUs; if you are running on a single-GPU machine, you could remove all these queue lines.
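For reference, the multi-GPU dispatch pattern being discussed can be sketched as follows. `NUM_GPUS` and `PROC_PER_GPU` follow the snippet above; in the real script the queue is a multiprocessing one, while a plain `queue.Queue` is used here only to keep the sketch self-contained:

```python
from queue import Queue  # stand-in for the script's multiprocessing queue

NUM_GPUS = 2       # assumed: a two-GPU workstation
PROC_PER_GPU = 2   # assumed: worker processes per GPU

gpu_queue = Queue()
for gpu_id in range(NUM_GPUS):
    for _ in range(PROC_PER_GPU):
        gpu_queue.put(gpu_id)  # each worker later pops its assigned GPU id

# Single-GPU workaround from the comment above: enqueue only zeros,
# i.e. replace gpu_queue.put(gpu_id) with gpu_queue.put(0).
```

Each worker then calls `gpu_queue.get()` once at startup and binds to that GPU, which is why on a single-GPU machine the queue bookkeeping can be dropped entirely.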
Hello, I encountered the same problem as the OP (NC=0.2, P2S=4.). After re-downloading SMPL-X.zip and re-running render_batch, I checked that the two .obj files are aligned, but training with the latest ICON code from a fresh git clone still has problems; the metrics are even further off than before the change (P2S 40). During training, the model's performance on cape gets worse and worse: although the metrics look normal, when I swapped cape for 5 samples from the thuman2 validation set at test time, the results were also poor. Since I work on a server, I cannot use python -m lib.dataloader_demo -v -c ./configs/train/icon-filter.yaml to verify the training data. Is there another way to check whether the training data is correct? Why does the same model give such different results on cape and thuman2? How can I fix this to reproduce the training accuracy reported in the paper? Looking forward to your advice, thank you.
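On the headless-server question above: the dataloader demo needs a display, but a rough alignment check can be done purely numerically. This is a sketch, not part of the ICON repo; it reads only the `v` lines of each OBJ and compares vertex centroids, so an aligned smplx/scan pair should give a small offset:

```python
import numpy as np


def load_obj_vertices(path):
    """Minimal OBJ reader: collect only 'v x y z' vertex lines."""
    verts = []
    with open(path) as f:
        for line in f:
            if line.startswith("v "):
                verts.append([float(x) for x in line.split()[1:4]])
    return np.asarray(verts)


def centroid_offset(path_a, path_b):
    """Distance between the vertex centroids of two meshes."""
    a = load_obj_vertices(path_a).mean(axis=0)
    b = load_obj_vertices(path_b).mean(axis=0)
    return float(np.linalg.norm(a - b))


# Example usage (paths as produced by the scripts above):
# centroid_offset("data/thuman2/smplx/0000.obj",
#                 "data/thuman2/scans/0000.obj")
```

Comparing `load_obj_vertices(...).min(0)` / `.max(0)` bounding boxes the same way gives a second cheap check; neither replaces the visual demo, but both run on a machine without a display.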
After re-downloading the SMPL-X.zip, I got good results. I didn't re-run the data generation scripts, because I ran into some trouble and couldn't figure out how to solve it. But when I ran python -m lib.dataloader_demo -v -c ./configs/train/icon-filter.yaml, the results were well aligned. Hope this helps you.
Hi! Thank you so much! Do you use the latest code in the repo? I didn't change the code and got wrong results. Could you please share your training loss? Is it the same as this in the first epoch?
SAME QUESTION
After changing the trimesh version to 3.17.1, the .obj files under /thuman2/smplx/ are correct.
Hello, excuse me @YuliangXiu @yxt7979. I downloaded SMPL-X again and changed the trimesh version to 3.17.1, but I still meet the same problem: data/thuman2/smplx/xxxx.obj and data/thuman2/scans/xxxx.obj are unaligned.
I found the result is wrong when I run the previous code, but correct when I run the latest code. However, when I run the latest code, I meet this problem:
(icon) shengbo@user-SYS:~/ICON-master$ python -m scripts.render_batch -debug -headless
Start Rendering thuman2 with 36 views, 512x512 size.
Output dir: ./debug/thuman2_36views
Rendering types: ['light', 'normal', 'depth']
0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/shengbo/anaconda3/envs/icon/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/shengbo/ICON-master/scripts/render_batch.py", line 254, in <module>
@Zhangjzh @Yuhuoo hello, have you solved this problem? Thank you!
@gushengbo did you find any solution?
@Zhangjzh @Yuhuoo @AndrewMorgan2 Hello, how did you download the SMPL-X.zip? The link now returns 404 Not Found.
Please make a new issue if you have a new question
Hi, I'd like to ask you some questions about this. You just recommended re-running the data generation script:
conda activate icon
python -m scripts.render_batch -headless -out_dir data/
Now data/thuman2/smplx/xxxx.obj and data/thuman2/scans/xxxx.obj are aligned perfectly. But I wonder if I also need to re-run the following data generation script:
python -m scripts.visibility_batch -out_dir data/
Because I tried to debug a model and found that data/thuman2_36views/scans/xxxx/vis/xxxx.obj changed. Although the difference did not significantly affect the visualization results in vedo, I wonder whether not re-running this second script will have any effect on training?
Yes, you need to re-run the visibility computation, since the visibility is computed on the SMPL-X .obj files.
After training the implicit MLP, I got quite weird results. The reconstructed meshes are poor. The evaluation results show that NC is very low, but Chamfer and P2S are very high. Do you know where the problem is? I would appreciate it a lot if you could give me some suggestions!