lucidrains / meshgpt-pytorch

Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch
MIT License
700 stars · 57 forks

TypeError: MessagePassing.__init__() got an unexpected keyword argument 'sageconv_dropout' #65

Closed StephenYangjz closed 5 months ago

StephenYangjz commented 6 months ago

Hi, thanks for the work! It would be helpful if you could offer some pointers on this error. Below is the full error message, @lucidrains:

```
TypeError                                 Traceback (most recent call last)
Cell In[5], line 10
      3 from meshgpt_pytorch import (
      4     MeshAutoencoder,
      5     MeshTransformer
      6 )
      8 # autoencoder
---> 10 autoencoder = MeshAutoencoder(
     11     num_discrete_coors = 64
     12 )
     14 # mock inputs
     16 vertices = torch.randn((2, 121, 3))            # (batch, num vertices, coor (3))

File ~/anaconda3/envs/meshgpt/lib/python3.9/site-packages/pytorch_custom_utils/save_load.py:35, in save_load.<locals>._save_load.<locals>.__init__(self, *args, **kwargs)
     32 _config = pickle.dumps((args, kwargs))
     34 setattr(self, config_instance_var_name, _config)
---> 35 _orig_init(self, *args, **kwargs)

File <@beartype(meshgpt_pytorch.meshgpt_pytorch.MeshAutoencoder.__init__) at 0x7f3c27c8b550>:212, in __init__(__beartype_get_violation, __beartype_conf, __beartype_getrandbits, __beartype_func, *args, **kwargs)

File ~/anaconda3/envs/meshgpt/lib/python3.9/site-packages/meshgpt_pytorch/meshgpt_pytorch.py:492, in MeshAutoencoder.__init__(self, num_discrete_coors, coor_continuous_range, dim_coor_embed, num_discrete_area, dim_area_embed, num_discrete_normals, dim_normal_embed, num_discrete_angle, dim_angle_embed, encoder_dims_through_depth, init_decoder_conv_kernel, decoder_dims_through_depth, dim_codebook, num_quantizers, codebook_size, use_residual_lfq, rq_kwargs, rvq_kwargs, rlfq_kwargs, rvq_stochastic_sample_codes, sageconv_kwargs, commit_loss_weight, bin_smooth_blur_sigma, attn_encoder_depth, attn_decoder_depth, local_attn_kwargs, local_attn_window_size, linear_attn_kwargs, use_linear_attn, pad_id, flash_attn, sageconv_dropout, attn_dropout, ff_dropout, resnet_dropout, checkpoint_quantizer, quads)
    489 init_encoder_dim, *encoder_dims_through_depth = encoder_dims_through_depth
    490 curr_dim = init_encoder_dim
--> 492 self.init_sage_conv = SAGEConv(dim_codebook, init_encoder_dim, **sageconv_kwargs)
...
File ~/anaconda3/envs/meshgpt/lib/python3.9/site-packages/torch_geometric/nn/conv/sage_conv.py:91
---> 91 super().__init__(aggr, **kwargs)
     93 if self.project:
     94     if in_channels[0] <= 0:

TypeError: __init__() got an unexpected keyword argument 'sageconv_dropout'
```
DreamAddiction commented 6 months ago

I also have the same issue... Googling doesn't seem to turn up anything on this error.

MarcusLoppe commented 6 months ago

I think they dropped support for dropout in SAGEConv. I just removed the argument and that resolved it.

@lucidrains

e.g. from `sageconv_kwargs = {**sageconv_kwargs, 'sageconv_dropout' : sageconv_dropout}` to `sageconv_kwargs = {**sageconv_kwargs}`
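For reference, a minimal sketch of the workaround: recent torch_geometric versions take no dropout argument on `SAGEConv`, so if dropout is still wanted it can be applied outside the layer. Channel sizes and wiring below are placeholders, not the repo's exact code:

```python
import torch.nn as nn
from torch_geometric.nn import SAGEConv

conv    = SAGEConv(in_channels = 192, out_channels = 64)  # placeholder dims (dim_codebook -> init_encoder_dim)
dropout = nn.Dropout(p = 0.0)                              # sageconv_dropout defaulted to 0 anyway

def sage_block(x, edge_index):
    # apply dropout after the graph conv instead of passing it into SAGEConv
    return dropout(conv(x, edge_index))
```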

StephenYangjz commented 6 months ago

Hey @MarcusLoppe, thanks -- the default is 0 anyway, so I guess we can do that.

StephenYangjz commented 5 months ago

Another error I'm getting is:

```
File ~/anaconda3/envs/meshgpt/lib/python3.9/site-packages/meshgpt_pytorch/mesh_dataset.py:106, in <listcomp>(.0)
     98     batch_codes = autoencoder.tokenize(
     99         vertices=padded_batch_vertices,
    100         faces=padded_batch_faces,
    101         face_edges=padded_batch_face_edges
    102     )
...
--> 106     item['codes'] = [code for code in codes if code != autoencoder.pad_id and code != -1]
    108 self.sort_dataset_keys()
    109 print(f"[MeshDataset] Generated codes for {len(self.data)} entrys")
```
Is this still a specific issue on my end? :(

Another idea I am thinking of is to generate meshes with bounding boxes as input text-token parameters. Do you think this is possible, given the setup we have, if we insert that info into the training loop? @MarcusLoppe Thank you so much for all the help!

MarcusLoppe commented 5 months ago

> Another error I'm getting is: ...
>
> Is this still a specific issue on my end? :(

Ah no, sorry! I always used to generate tokens using 1 item per batch to avoid VRAM issues, but since doing that for 218k items takes a while, I implemented batch processing. I made a few mistakes, but the latest commit should have resolved that issue. Sorry about that, I was doing the changes while half asleep :)
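For reference, the older one-item-per-batch tokenization described above might look roughly like the sketch below. The `vertices`/`faces`/`face_edges` field names come from the traceback above, but the exact dataset layout is an assumption:

```python
import torch

# assumed: each entry holds 'vertices', 'faces' and 'face_edges' tensors for a single mesh
for item in dataset.data:
    with torch.no_grad():
        codes = autoencoder.tokenize(
            vertices   = item['vertices'].unsqueeze(0),    # batch dimension of 1 keeps VRAM low
            faces      = item['faces'].unsqueeze(0),
            face_edges = item['face_edges'].unsqueeze(0),
        )
    codes = codes.flatten()
    # drop padding, mirroring the filtering done in mesh_dataset.py
    item['codes'] = codes[(codes != autoencoder.pad_id) & (codes != -1)]
```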

> Another idea I am thinking of is to generate meshes with bounding boxes as input text-token parameters. Do you think this is possible, given the setup we have, if we insert that info into the training loop?

You mean providing it with more details about the desired mesh size?

The autoencoder converts the mesh and 'simplifies'/discretizes it so the lowest point on an axis is 0 and the max is 127. This way all the inputs are the same size since they share the same dimensions, and it lets the autoencoder learn in a more uniform way. You can increase this value (num_discrete_coors) if you are dealing with very big meshes; otherwise it seems to work fine with the meshes I'm currently dealing with.

So it wouldn't quite matter, I think. It might be more helpful for the transformer if you provide descriptive words about the shape, e.g. "very big chair" or something like that. Then it will create a relationship between that mesh model and the text 'very big'; doing this over all the big meshes will create a correlation for the transformer and help it understand what 'big' means.
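To illustrate the discretization described above (not the repo's exact implementation), a rough sketch of mapping each axis into `num_discrete_coors` integer bins could look like this; the per-axis min/max normalization here is an assumption for illustration:

```python
import torch

def discretize_vertices(vertices: torch.Tensor, num_discrete_coors: int = 128) -> torch.Tensor:
    # vertices: (num_vertices, 3) in arbitrary world units
    lo = vertices.amin(dim = 0, keepdim = True)
    hi = vertices.amax(dim = 0, keepdim = True)
    normalized = (vertices - lo) / (hi - lo).clamp(min = 1e-8)      # each axis mapped to [0, 1]
    return (normalized * (num_discrete_coors - 1)).round().long()   # integer bins, e.g. [0, 127]
```

Very large meshes with fine detail would need a larger `num_discrete_coors`, otherwise nearby vertices collapse into the same bin.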

StephenYangjz commented 5 months ago

Thank you so much @MarcusLoppe -- that makes a lot of sense! May I also ask why you have two trainers: is it just because you want to do learning rate scheduling at different losses?

```python
trainer = MeshTransformerTrainer(model = transformer, warmup_steps = 10, grad_accum_every = 4,
                                 num_train_steps = 100, dataset = dataset,
                                 learning_rate = 1e-3, batch_size = 8)
loss = trainer.train(100, stop_at_loss = 0.009)

trainer = MeshTransformerTrainer(model = transformer, warmup_steps = 10, grad_accum_every = 4,
                                 num_train_steps = 100, dataset = dataset,
                                 learning_rate = 5e-4, batch_size = 8)
loss = trainer.train(200, stop_at_loss = 0.00001)
```
MarcusLoppe commented 5 months ago

> Thank you so much @MarcusLoppe -- that makes a lot of sense! May I also ask why you have two trainers: is it just because you want to do learning rate scheduling at different losses?

Correct. At the start I thought the autoencoder could benefit from a higher learning rate early in training, but I discovered that it didn't really matter. I tried to implement the existing LRScheduler class, but the scheduler that does the learning-rate stepping isn't a subclass of _LRScheduler. Accelerate requires a scheduler of type _LRScheduler, so that one isn't compatible with the accelerator.

I also tried to recreate it by hand, but there were so many issues that I didn't bother finishing it; I just stopped at a specific loss and set up the training again with a lower learning rate :)

Depending on hardware resources, it might be worth using a higher learning rate on the transformer, but I'm not 100% sure.
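For what it's worth, the same two-stage schedule could in principle be expressed with a standard `_LRScheduler` subclass such as `LambdaLR`, which Accelerate can prepare. The sketch below only shows the schedule itself and assumes direct access to an optimizer; `MeshTransformerTrainer` builds its own optimizer internally, so wiring this in is a separate question:

```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

optimizer = AdamW(transformer.parameters(), lr = 1e-3)  # same base rate as the first trainer above

def lr_lambda(step, warmup_steps = 10, decay_step = 100):
    # linear warmup, then halve the base rate after `decay_step` steps,
    # roughly mimicking the manual 1e-3 -> 5e-4 restart
    if step < warmup_steps:
        return (step + 1) / warmup_steps
    return 1.0 if step < decay_step else 0.5

# LambdaLR subclasses _LRScheduler, so accelerator.prepare() accepts it
scheduler = LambdaLR(optimizer, lr_lambda)

# per training step: optimizer.step(); scheduler.step()
```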

StephenYangjz commented 5 months ago

Got it, thank you! I am thinking of getting more resources from my lab and training it over Objaverse (on the order of thousands of objects at least). Do you have a rough estimate of the compute needed and whether it's easy to parallelize over multiple GPUs? Happy to share a pretrained model afterward and would love to hear what you think! @MarcusLoppe

MarcusLoppe commented 5 months ago

> Got it, thank you! I am thinking of getting more resources from my lab and training it over Objaverse (on the order of thousands of objects at least). Do you have a rough estimate of the compute needed and whether it's easy to parallelize over multiple GPUs?

I've only trained using a max of 250 faces; that was over 14k 3D mesh models (x15 augments). I used 4 encoder and 8 decoder attention layers, which resulted in a total parameter count of 75M. The result was very good at 0.36 MSE loss and took only about 20 hrs using a single P100. Check out the discussion I posted, Pre-trained autoencoder & data sourcing #66; you can see the results and a link to the Google Drive with the rendered output & model.

Each epoch took 2.5 hrs, so it was about 8-9 epochs for the 218k dataset. It seems like the attention layers are a must if you don't want to train much longer; without attention I got to 0.48 loss after 30+ hrs of training. With 1 encoder and 1 decoder layer it got down to 0.42 loss, so more attention layers definitely help.

However, the transformer takes much longer: using the model below, it takes about 4.5 hrs per epoch with batch size 8 (grad_accum_every = 8). If you have enough VRAM, use a dim of 1024 and either 12 or 24 attn_depth; for context, GPT-2 uses dim 1024 and 24 attention layers, I believe. The autoencoder seems to scale with model size, so it's probably faster/better to train the transformer as big as possible.

Depending on the GPUs it will probably take a day or two; in the paper they did 4 days with 4 A100s, I believe. However, this repo has massive amounts of upgrades compared to the paper. I managed to train the autoencoder on 14k models in only 1 day using a single P100, while it took them 2 days using 4x A100s. So hopefully something similar might be the case for the transformer.

```python
transformer = MeshTransformer(
    autoencoder,
    dim = 512,
    attn_depth = 24,
    attn_heads = 8,
    coarse_pre_gateloop_depth = 6,
    fine_pre_gateloop_depth = 4,
    max_seq_len = max_seq,
    condition_on_text = True,
    text_condition_model_types = "bge",
    text_condition_cond_drop_prob = 0.01,
)
```
StephenYangjz commented 5 months ago

@MarcusLoppe Thank you so much for the response! That's super helpful :)

For now I've been training with the demo mesh, with autoencoder loss 0.277, and the transformer loss plateaued at around 0.005. I tried to keep training with a 3090 for a couple more hours and the results don't seem to get much better than what I have below (when using only text prompts). Do you think this is expected, or do you have any insights? Thanks in advance!

Screenshot 2024-03-14 at 1 39 28 AM
MarcusLoppe commented 5 months ago

> @MarcusLoppe Thank you so much for the response! That's super helpful :)
>
> For now I've been training with the demo mesh, with autoencoder loss 0.277, and the transformer loss plateaued at around 0.005. I tried to keep training with a 3090 for a couple more hours and the results don't seem to get much better than what I have below (when using only text prompts). Do you think this is expected, or do you have any insights?

Try providing it with 10-50% of the tokens for a model and see. The original paper never used text as a guide, only the tokens it was prompted with. The text guidance seems to be pretty weak at the start of generation, but if it's given around 10% of the tokens, that will jump-start the generation and output a very good mesh.

I've had some difficulty when training on small datasets. If it's just one shape it's fine and the transformer can overfit the model, but when dealing with multiple shapes it gets harder. I'll probably exchange the demo meshes for the old poly mesh. Without providing any tokens I've gotten decent results using the chair, table and sofa dataset, but the only meshes I've output flawlessly using text-to-3D are the 4 basic shapes (cube, cone, etc.). When providing about 10% of the tokens to jump-start it, though, the transformer does very well.

The issue when dealing with more complex shapes is that it needs to be more generalized. Hence why I stopped dealing with small datasets of 400-800 models and moved on to the larger 14k-model dataset.
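For illustration, jump-starting generation with roughly 10% of a reference mesh's tokens might look like the sketch below. `autoencoder.tokenize` and `transformer.generate(texts = ...)` appear elsewhere in the repo, but the exact keyword for passing partial codes as a prompt is an assumption and may differ between versions:

```python
# tokenize a reference mesh and keep the first ~10% of its codes
codes = autoencoder.tokenize(
    vertices   = vertices.unsqueeze(0),
    faces      = faces.unsqueeze(0),
    face_edges = face_edges.unsqueeze(0),
).flatten()

prompt = codes[: int(0.10 * codes.numel())].unsqueeze(0)

# hand the partial codes to the transformer alongside the text condition
faces_coordinates, face_mask = transformer.generate(
    prompt      = prompt,          # assumed keyword for code prompting; check your version
    texts       = ['chair'],
    temperature = 0.0,
)
```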

StephenYangjz commented 5 months ago

That makes a lot of sense, thank you :)