algvr / transfusion

Official implementation for the CVPR 2024 paper "Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation"
https://eth-ait.github.io/transfusion-proj/

Missing EgoNaoDataset class #3

Open ChuhuaW opened 1 week ago

ChuhuaW commented 1 week ago

Hi,

Thanks for sharing the work. Would you mind also releasing the EgoNaoDataset class? The data_preproceessing/datasets/ folder is missing. It would also be helpful if you could release the code for modeling/narration_embeds/datasets.

Thank you!

Best,

algvr commented 4 days ago

Hi ChuhuaW,

thank you for your interest in our work. The missing files have been added. Please let me know whether the code works now.

ChuhuaW commented 4 days ago

@algvr Thanks! I’m able to get the model running. However, when I load the checkpoint "translated_ego4dv2.pth", I encounter a KeyError:

File "/nfs/tynamo/home/data/cw234/transfusion/runner/nao/egov2_naf_trainer.py", line 65, in load_state_dict
    if ckpt_pos_embedding.shape[1] < self.state_dict()[k].shape[1]:
KeyError: 'model.cross_fusion_layers.0.pos_embedding_layer.pos_embedding'

Upon further investigation, it seems that several keys in the state_dict are unexpected, missing, or mismatched in size. I’ve uploaded the full log here. Could you please take a look? Thank you.

output.log
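
For anyone debugging a similar error, the three kinds of mismatch described above (unexpected, missing, and size-mismatched keys) can be listed with a small helper. This is only a sketch: it works on plain name-to-shape dictionaries rather than on the repository's trainer, the function name diff_state_dicts is hypothetical, and every key below except the one from the traceback is made up for illustration.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_state_dicts(ckpt: Dict[str, Shape], model: Dict[str, Shape]) -> Dict[str, List[str]]:
    """Compare parameter names and shapes of a checkpoint against a model."""
    ckpt_keys, model_keys = set(ckpt), set(model)
    return {
        # Present in the checkpoint but not in the model.
        "unexpected": sorted(ckpt_keys - model_keys),
        # Expected by the model but absent from the checkpoint.
        "missing": sorted(model_keys - ckpt_keys),
        # Present in both, but with different shapes.
        "size_mismatch": sorted(k for k in ckpt_keys & model_keys if ckpt[k] != model[k]),
    }


# Toy shapes; only the first key is taken from the traceback, the rest are hypothetical.
ckpt_shapes = {
    "model.cross_fusion_layers.0.pos_embedding_layer.pos_embedding": (1, 16, 256),
    "model.backbone.conv1.weight": (64, 3, 7, 7),
}
model_shapes = {
    "model.backbone.conv1.weight": (64, 3, 3, 3),
    "model.head.fc.weight": (10, 256),
}

report = diff_state_dicts(ckpt_shapes, model_shapes)
for kind, keys in report.items():
    print(kind, keys)
```

With PyTorch, the shape dictionaries can be built from a real state_dict via {k: tuple(v.shape) for k, v in state_dict.items()}. Checking that a key exists before indexing into self.state_dict() avoids the KeyError itself, though it does not explain why the key is absent.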

algvr commented 2 days ago

Hi ChuhuaW,

thank you for providing the log file. I cannot reproduce the issue using our provided codebase. Are you trying to load translated_ego4dv2.pth using the Trainer.fit function of PyTorch Lightning, or perhaps using the --resume_from argument? This file does not contain a checkpoint of our entire model, but rather the visual encoder weights from the original Ego4D baseline, which is the initialization point for our visual encoder. These weights are loaded automatically in the get_rcnn_model function of modeling/obj_detection/rcnn_factory.py when using the default configuration (pretrained set to ${CODE}/checkpoints/translated_ego4dv2.pth in the configuration YAML), so no further action is needed to load them.
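
One quick way to check what a checkpoint file covers is to group its parameter names by prefix. The sketch below does this on a plain list of names; summarize_prefixes is a hypothetical helper, and the example keys are invented, so the real names stored in translated_ego4dv2.pth may look different.

```python
from collections import Counter
from typing import Iterable


def summarize_prefixes(keys: Iterable[str], depth: int = 2) -> Counter:
    """Count parameter names by their first `depth` dot-separated components."""
    return Counter(".".join(k.split(".")[:depth]) for k in keys)


# Invented parameter names, for illustration only.
example_keys = [
    "backbone.body.conv1.weight",
    "backbone.body.layer1.0.conv1.weight",
    "backbone.fpn.inner_blocks.0.weight",
    "roi_heads.box_head.fc6.weight",
]

summary = summarize_prefixes(example_keys)
for prefix, count in sorted(summary.items()):
    print(f"{prefix}: {count} tensors")
```

To apply this to a real file, the names would come from the object returned by torch.load(path, map_location="cpu"). PyTorch Lightning checkpoints usually nest the weights under a "state_dict" entry; whether this particular file does is an assumption to verify. If only visual encoder prefixes show up, the file is an initialization for that component rather than a full model checkpoint.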

ChuhuaW commented 2 days ago

Hi @algvr,

Thanks for the clarification! I’ll start training on my end and get back to you once I have some results. By any chance, are you going to release the checkpoint of your entire model?