yuedajiong / super-ai-vision-stereo-world-generate-triposr


Fine tune TSR #1

Open jd-rodriguezp1234 opened 6 months ago

jd-rodriguezp1234 commented 6 months ago

Hi, I've been testing your code to fine-tune my own version of TripoSR on a custom dataset. However, when I try to load the model weights in superv.py by uncommenting some lines in this code segment (lines 195 to 199):

    model = TSR(img_size=image_size, depth=16//2, embed_dim=768, num_channels=1024, num_layers=16//2, cross_attention_dim=768, radius=99, valid_thresh=0.00001, num_samples_per_ray=128, n_hidden_layers=9, official=False)
    #model.load_state_dict(torch.load('./ckpt/TripoSR/model.ckpt', map_location='cpu'))
    model.to(device)
    model.train()

I get the following error:

/data/2024-I/MACHINE_LEARNING_TECHNIQUES/project_v2/triposr/tripovenv/lib/python3.11/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/data/2024-I/MACHINE_LEARNING_TECHNIQUES/project_v2/triposr/tripovenv/lib/python3.11/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
parameters: 399 M
1
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/data/2024-I/MACHINE_LEARNING_TECHNIQUES/project_v2/triposr/super-ai-vision-stereo-world-generate-triposr/superv.py", line 275, in <module>
    main()
  File "/data/2024-I/MACHINE_LEARNING_TECHNIQUES/project_v2/triposr/super-ai-vision-stereo-world-generate-triposr/superv.py", line 272, in main
    train(image_size=[512,64][1], batch_size=1, epochs=1, checkpoint_path='./outs/ckpt/', best_checkpoint_file='./outs/ckpt/checkpoint.pth', device=device)
  File "/data/2024-I/MACHINE_LEARNING_TECHNIQUES/project_v2/triposr/super-ai-vision-stereo-world-generate-triposr/superv.py", line 243, in train
    image, alpha = model.renderer(model.decoder, scene_code, rays_o, rays_d)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/2024-I/MACHINE_LEARNING_TECHNIQUES/project_v2/triposr/tripovenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/2024-I/MACHINE_LEARNING_TECHNIQUES/project_v2/triposr/tripovenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/2024-I/MACHINE_LEARNING_TECHNIQUES/project_v2/triposr/super-ai-vision-stereo-world-generate-triposr/network_nerf_renderer.py", line 190, in forward
    xyz = (rays_o[:, None, :] + z_vals[..., None] * rays_d[..., None, :])  # (N_rays, N_sample, 3)
                                ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (0) must match the size of tensor b (4096) at non-singleton dimension 0
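
A note for readers hitting the same error: image_size=[512,64][1] selects 64, and 64 x 64 = 4096 rays, which matches the size of tensor b in the message. Tensor a having size 0 suggests that z_vals was built for zero rays, though the traceback alone does not show why. The sketch below is plain Python, not code from the repository: `broadcast_shape` is a hypothetical helper that mimics the NumPy/PyTorch broadcasting rule, and the shape used for the empty `z_vals` is an assumption made for illustration.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Return the broadcast result of the given shapes, or raise ValueError.

    Mimics the NumPy/PyTorch rule: align shapes from the right; two sizes
    are compatible when they are equal or when one of them is 1.
    """
    ndim = max(len(s) for s in shapes)
    result = []
    # Walk the dimensions from the last one to the first one.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            axis = ndim - 1 - len(result)  # index counted from the left
            raise ValueError(
                f"sizes {sorted(sizes)} do not match at dimension {axis}"
            )
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


image_size = 64                   # from train(image_size=[512,64][1], ...)
num_samples_per_ray = 128         # from the TSR(...) call
n_rays = image_size * image_size  # 4096, the size of "tensor b"

# Shapes after the indexing on line 190 of network_nerf_renderer.py.
rays_d = (n_rays, 1, 3)                       # rays_d[..., None, :]
z_vals_ok = (n_rays, num_samples_per_ray, 1)  # z_vals[..., None], expected
z_vals_empty = (0, num_samples_per_ray, 1)    # ASSUMPTION: zero rays kept

print(broadcast_shape(z_vals_ok, rays_d))  # (4096, 128, 3)
try:
    broadcast_shape(z_vals_empty, rays_d)
except ValueError as err:
    print(err)  # same kind of mismatch as in the traceback
```

Printing `z_vals.shape` and `rays_d.shape` just before line 190 would confirm or rule out this reading.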

Training seems to work well on a 2080 Ti with the lines left commented, as in the original file in the repository, and this is the only training code I've been able to run for generating 3D models from images. I would like to know if you have run into a similar problem. Also, I had to download the file directly from Hugging Face, since the wget command did not work.

Thanks in advance, Johan

yuedajiong commented 6 months ago

@jd-rodriguezp1234 Hi, bro:

  1. I am focusing on my own 'single image + camera-pose free' work, so this code is just a demo of the 'train' logic.
    I am also using an older GPU, so I often shrink the network directly with depth//N. This mini-train code runs on my GPU server. It is complete and correct, just small.
  2. I checked out this code and tried it. (I don't have any plan to maintain this research project.) I remembered that this code runs directly, and I tested it again:
    [image]

    PLEASE run mesh->image first to generate the training dataset. You can refer to step 3 of my readme. All the scripts are very simple and powerful. My code quality is higher than other researchers'. Trust me.

My env:
0000-cuda11.7_pytorch1.13_tinycudann_colmap.txt
0000-cuda12.1_pytorch2.2.0.txt