tsw123678 opened 3 weeks ago
Are there any differences in the _make_masks function across different LLM models? Don't they all compute loss only for the response part? What causes the variations among them?
Different models use different tokenizers (and different prompt/chat templates), so the same text is split into different tokens and the prompt/response boundary lands at different positions. The loss is still computed only on the response part, but where that response starts in the token sequence, and therefore where the label mask begins, varies from model to model.
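A minimal sketch of the idea (not the repository's actual `_make_masks` implementation; the helper name `make_labels` and the model names are just examples): prompt tokens are masked with `-100` so the loss only covers the response, and the masked length depends on how each tokenizer splits the prompt.

```python
import torch
from transformers import AutoTokenizer

def make_labels(prompt: str, response: str, tokenizer):
    """Build input_ids and labels where prompt tokens are masked with -100."""
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    response_ids = tokenizer(response, add_special_tokens=False)["input_ids"]
    input_ids = prompt_ids + response_ids
    # Loss is ignored (-100) on the prompt, computed only on the response.
    labels = [-100] * len(prompt_ids) + response_ids
    return torch.tensor(input_ids), torch.tensor(labels)

# Different tokenizers produce different prompt lengths, so the point where
# masking stops differs per model even for identical text.
prompt, response = "User: Hello\nAssistant: ", "Hi there!"
for name in ("meta-llama/Llama-2-7b-hf", "Qwen/Qwen2-7B"):  # example models
    tok = AutoTokenizer.from_pretrained(name)
    ids, labels = make_labels(prompt, response, tok)
    print(name, "masked prompt tokens:", int((labels == -100).sum()))
```

So the variation across models is not in the intent (response-only loss) but in how many tokens get masked and where the response span begins.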