jy0205 / LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

RuntimeError: expected scalar type Float but found BFloat16 #24

Closed patrick-tssn closed 2 months ago

patrick-tssn commented 2 months ago

I set up the environment following https://github.com/jy0205/LaVIT/tree/main/VideoLaVIT#requirements. When I run this notebook, https://github.com/jy0205/LaVIT/blob/main/VideoLaVIT/understanding.ipynb, I encounter the error in the title.

Could you please double-check the environment requirements?

patrick-tssn commented 2 months ago

I've implemented a temporary solution: in the files modeling_visual_encoder.py, modeling_visual_tokenizer.py, and modeling_motion_tokenizer.py, I modified the LayerNorm function to use torch.bfloat16 instead of torch.float32. This adjustment is effective for inference; however, I am uncertain about its compatibility with the training pipeline. I look forward to your feedback on this matter.
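
For readers hitting the same error, here is a minimal sketch of the kind of change described above. It is not the repository's code: the thread does not show the original LayerNorm, so the class name `DtypeSafeLayerNorm` is hypothetical, and the sketch assumes the original wrapper cast its input to torch.float32 while the layer's weights had been loaded in torch.bfloat16. Instead of hard-coding torch.bfloat16, it normalizes in whatever dtype the layer's own parameters use, then casts the result back to the input dtype.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DtypeSafeLayerNorm(nn.LayerNorm):
    """Hypothetical LayerNorm wrapper that avoids input/weight dtype mismatches."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        orig_dtype = x.dtype
        # Compute in the dtype of the layer's parameters (e.g. bfloat16 after
        # model.to(torch.bfloat16)); fall back to the input dtype if the layer
        # was built without affine parameters.
        compute_dtype = self.weight.dtype if self.weight is not None else orig_dtype
        out = F.layer_norm(
            x.to(compute_dtype),
            self.normalized_shape,
            self.weight,
            self.bias,
            self.eps,
        )
        # Hand the result back in the caller's dtype.
        return out.to(orig_dtype)


if __name__ == "__main__":
    norm = DtypeSafeLayerNorm(8).to(torch.bfloat16)  # weights in bfloat16
    x = torch.randn(2, 4, 8)                         # float32 activations
    print(norm(x).dtype)
```

Because the compute dtype follows the weights, the same class should behave as before when the model is kept in float32, which may matter for the training question raised above.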

jy0205 commented 2 months ago

Thank you for identifying this issue, which can occur when apex is not installed. We use the FusedLayerNorm from apex during training. Your temporary solution is correct.
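
A common way to make a codebase work with or without apex is an optional import, sketched below. This is an illustration only, not the LaVIT source: the alias name `LayerNorm` and the fallback to `torch.nn.LayerNorm` are assumptions about how such a switch could be written.

```python
import torch

try:
    # Fused kernel from NVIDIA apex, used when the package is installed.
    from apex.normalization import FusedLayerNorm as LayerNorm
except ImportError:
    # Plain PyTorch implementation as a fallback when apex is missing.
    from torch.nn import LayerNorm

norm = LayerNorm(16)
out = norm(torch.randn(2, 16))
print(type(norm).__name__, tuple(out.shape))
```

With a switch like this, the two code paths can handle dtypes differently, which is one way an error may show up only on machines without apex.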