GAIR-NLP / anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation
https://huggingface.co/spaces/ethanchern/Anole

Inf Loss Problem When Training #30

Closed · nreHieW closed this issue 1 month ago

nreHieW commented 1 month ago

In lines 1628-1629 of `transformers/src/transformers/models/chameleon/modeling_chameleon.py`:

```python
image_tokens = self.model.vocabulary_mapping.image_tokens
logits[:, :, image_tokens] = torch.finfo(logits.dtype).min
```

My understanding is that this mask exists because the original Chameleon release intentionally suppressed image-token generation. But keeping it during training leads to an inf loss: any position whose label is an image token gets a log-probability of roughly the dtype minimum, and the cross-entropy reduction overflows.
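To make the failure concrete, here is a minimal, self-contained sketch (the vocabulary size and image-token ids are made up for illustration): once the image-token logits are set to the dtype minimum, each image-token label costs about 3.4e38 nats, and summing two or more such positions inside the mean reduction overflows to inf.

```python
import torch
import torch.nn.functional as F

vocab_size = 16
image_tokens = [10, 11, 12]          # hypothetical image-token ids
logits = torch.randn(2, vocab_size)  # two positions in a batch
labels = torch.tensor([10, 11])      # both labels are image tokens

# The masking from modeling_chameleon.py, applied to 2D logits here:
logits[:, image_tokens] = torch.finfo(logits.dtype).min

# Each position contributes a loss of roughly 3.4e38 (near float32 max),
# so the sum inside the 'mean' reduction overflows to inf.
print(F.cross_entropy(logits, labels))  # tensor(inf)
```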

Is there an updated version of the code?

leloykun commented 1 month ago

Hi! You can either use the version in this repo's `transformers` folder or this PR of mine to the main Transformers library: https://github.com/huggingface/transformers/pull/32013
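For reference, the essential change is to skip the masking when a training loss is being computed. Here is a hedged sketch of that guard (the helper name and the `labels is None` gate are my own for illustration; see the PR for the actual diff):

```python
import torch

def mask_image_token_logits(logits, image_tokens, labels=None):
    """Hypothetical helper: suppress image tokens only at inference.

    When labels are provided (training), leave the logits untouched so
    image-token targets keep finite log-probabilities and the loss stays
    finite; when generating without labels, mask as Chameleon does.
    """
    if labels is None:
        logits[:, :, image_tokens] = torch.finfo(logits.dtype).min
    return logits
```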