Alpha-VLLM / Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation
MIT License

Which LLaMA 7B version is used? #12

Open trouble-maker007 opened 1 month ago

trouble-maker007 commented 1 month ago

Did you use LLaMA 7B in training together with InternViT-6B? And is there any plan to release a technical report?

Artanic30 commented 1 month ago

We use the original Llama 2 7B from Meta without any modification. InternViT-6B is not used. More details can be found in our technical report, which will be released soon.

trouble-maker007 commented 1 month ago

@Artanic30 Thanks for the quick response. Did you also compare against flant5-xxl as the text encoder?

gaopengpjlab commented 1 month ago

We have released Flag-DiT-5B with LLaMa-7B as the text encoder, along with Next-DiT-2B with gemma-2B. Compared with flant5-xxl, gemma-2B demonstrates superior multilingual ability. Best wishes. More details are coming soon.

Artanic30 commented 1 month ago

To clarify, the specific version of the Llama 2 text encoder is meta-llama/Llama-2-7b-hf.
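
For readers who want to try that checkpoint as a text encoder, here is a minimal sketch using Hugging Face `transformers`. It is not taken from the Lumina-T2X code base: the helper name `encode_prompts`, the `max_length` of 128, the choice of the last hidden state as the text feature, the bfloat16 dtype, and reusing EOS as the pad token are all assumptions made for illustration. The meta-llama/Llama-2-7b-hf repository is gated on Hugging Face, so access has to be granted before the download works.

```python
MODEL_ID = "meta-llama/Llama-2-7b-hf"  # checkpoint named in this thread


def encode_prompts(prompts, model_id=MODEL_ID, max_length=128):
    """Return (hidden_states, attention_mask) for a list of prompt strings.

    Hypothetical helper for illustration only; the actual Lumina-T2X
    pipeline may extract text features differently.
    """
    # Imported lazily so the heavy dependencies are only needed when called.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # The LLaMA tokenizer has no pad token by default; reusing EOS is an assumption.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    # AutoModel loads the bare transformer (no language-modeling head).
    model = AutoModel.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()

    batch = tokenizer(
        prompts,
        padding="max_length",
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        outputs = model(**batch)

    # Assumption: the last hidden state serves as the per-token text embedding.
    return outputs.last_hidden_state, batch["attention_mask"]
```

Calling `encode_prompts(["a photo of a cat"])` should return a tensor of shape `(1, 128, hidden_size)` together with the attention mask that marks the padded positions, so a downstream model can ignore them.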