ShihaoZhaoZSH / LaVi-Bridge

Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
MIT License

Training data preprocessing #14

Open lshymfl opened 2 months ago

lshymfl commented 2 months ago

"To prepare the training data, the caption file should be organized in the following format, where each line contains the image path and its corresponding caption separated by a tab (\t):

image_path1	caption1
image_path2	caption2
..."

Could you give a concrete example so that we can understand the format and preprocess our data?

ShihaoZhaoZSH commented 1 month ago

This is an example where each line represents a training sample. In each line, for example, /data/img1.png is the path of the image, and 'cat' is its corresponding caption. The two are separated by a tab.

/data/img1.png	cat
/data/img2.png	forest
/data/img3.png	hamburger
...
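For anyone preprocessing their own data, the format described above could be produced and checked with a few lines of Python. This is only a minimal sketch: the function names `write_caption_file` and `read_caption_file` are hypothetical and not part of LaVi-Bridge, and collapsing whitespace inside captions is an assumption made here so that a stray tab or newline cannot break the one-sample-per-line layout. How the repository's training script parses the file is not shown in this thread.

```python
def write_caption_file(samples, out_path):
    """Write (image_path, caption) pairs as one tab-separated pair per line."""
    with open(out_path, "w", encoding="utf-8", newline="") as f:
        for image_path, caption in samples:
            # Assumption: captions are single-line text. Collapse any tabs or
            # newlines inside a caption so each sample stays on one line.
            clean_caption = " ".join(str(caption).split())
            f.write(f"{image_path}\t{clean_caption}\n")


def read_caption_file(path):
    """Read the file back as a list of (image_path, caption) tuples."""
    samples = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            # Split on the first tab only: left is the path, right is the caption.
            image_path, sep, caption = line.partition("\t")
            if not sep:
                raise ValueError(f"line {line_no}: missing tab separator")
            samples.append((image_path, caption))
    return samples
```

Calling `write_caption_file([("/data/img1.png", "cat"), ("/data/img2.png", "forest")], "captions.txt")` would write the first two lines of the example above, and `read_caption_file("captions.txt")` can serve as a quick sanity check that every line contains a tab before starting training. The output file name `captions.txt` is just a placeholder.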