Open lshymfl opened 2 months ago
This is an example where each line represents one training sample. In each line, for example, /data/img1.png is the path of the image and 'cat' is its corresponding caption. The two are separated by a tab.

/data/img1.png	cat
/data/img2.png	forest
/data/img3.png	hamburger
...
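If it helps with pre-processing, here is a minimal Python sketch for writing and re-reading a caption file in this tab-separated layout. The function names `write_caption_file` and `read_caption_file` are hypothetical helpers made up for illustration, not part of the repository, and the sketch assumes UTF-8 text and image paths that contain no tab characters.

```python
def write_caption_file(samples, out_path):
    """Write (image_path, caption) pairs, one tab-separated pair per line."""
    with open(out_path, "w", encoding="utf-8", newline="") as f:
        for image_path, caption in samples:
            # A tab or newline inside a caption would break the format,
            # so collapse all whitespace runs into single spaces.
            clean_caption = " ".join(str(caption).split())
            f.write(f"{image_path}\t{clean_caption}\n")


def read_caption_file(path):
    """Read the file back as a list of (image_path, caption) tuples."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\r\n")
            if not line.strip():
                continue  # skip blank lines
            # Split on the first tab only; the rest is the caption.
            image_path, sep, caption = line.partition("\t")
            if not sep:
                raise ValueError(f"line {line_no}: missing tab separator")
            pairs.append((image_path, caption))
    return pairs
```

Calling `write_caption_file([("/data/img1.png", "cat")], "captions.txt")` would produce one line with the path, a tab, and the caption. Reading the file back with `read_caption_file` is a quick way to check that every line has a tab before starting training.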
"To prepare the training data, the caption file should be organized in the following format, where each line contains the image path and its corresponding caption separated by a tab (\t): image_path1 caption1 image_path2 caption2 ... " Can you list examples so that we can understand and pre-process the data?