long8v / PTIR

Paper Today I Read

[135] Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved with Text #147

Open long8v opened 7 months ago

long8v commented 7 months ago
image

paper, code

TL;DR

Details

mmc4

image

Data Curation Process

Doing it this way gives higher coverage than simply assigning each image to its maximum-similarity sentence.

After assignment, each image is placed either before or after its sentence, following Flamingo's approach.
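The coverage point can be checked on a tiny similarity matrix. The sketch below assumes that "this way" means a one-to-one (bipartite) image-to-sentence assignment that maximizes total similarity. It uses a brute-force search over permutations, not whatever solver the authors used. The function names are hypothetical, and the values are rounded from the sample record later in this thread.

```python
from itertools import permutations

def argmax_assign(sim):
    """Assign each image (row) to its single most similar sentence (column)."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]

def bipartite_assign(sim):
    """One-to-one assignment maximizing total similarity (brute force).

    Assumes no more images than sentences; only practical for tiny inputs.
    """
    n_images, n_sentences = len(sim), len(sim[0])
    best = max(
        permutations(range(n_sentences), n_images),
        key=lambda cols: sum(sim[i][c] for i, c in enumerate(cols)),
    )
    return list(best)

# Rows are images, columns are sentences (illustrative, rounded values).
sim = [
    [0.244, 0.318, 0.277],
    [0.223, 0.323, 0.261],
]

greedy = argmax_assign(sim)
matched = bipartite_assign(sim)
print(greedy, len(set(greedy)))    # distinct sentences covered by per-image argmax
print(matched, len(set(matched)))  # distinct sentences covered by one-to-one matching
```

Per-image argmax sends both images to sentence 1, while the one-to-one matching spreads them over sentences 2 and 1, so more sentences receive an image. For real documents a dedicated solver such as `scipy.optimize.linear_sum_assignment` would replace the brute-force search.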

image

Actual example

image

Exploring mmc4

Result

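Trained with OpenFlamingo and compared against a model trained on LAION-2B.

The comparison is against a model trained on 15M samples from LAION-2B, on MSCOCO captioning in zero-shot, 4-shot, and 8-shot settings. Red is the zero-shot performance. The authors suggest the gap relative to the 4- and 8-shot results arises because the LAION-2B model was trained only on short texts and so struggles when longer text appears. Shouldn't the comparison be against the full 2B, though?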

(from Flamingo, COCO dev set, 4-shot)

image
long8v commented 6 months ago

Opening the parquet file shows records like the following:

{'image_info': [{'face_detections': None,
                 'image_name': 'b9040a0dbb22.jpg',
                 'matched_sim': 0.27694183588027954,
                 'matched_text_index': 2,
                 'raw_url': 'http://www.hfitinfo.com/honda_fit_pics/3/2/index.90.jpg'},
                {'face_detections': None,
                 'image_name': 'db1c21bc8474.jpg',
                 'matched_sim': 0.3234919607639313,
                 'matched_text_index': 1,
                 'raw_url': 'http://www.hfitinfo.com/honda_fit_pics/3/2/index.91.jpg'}],
 'similarity_matrix': [[0.24363446235656738,
                        0.31758785247802734,
                        0.27694183588027954],
                       [0.2233106791973114,
                        0.3234919607639313,
                        0.26118797063827515]],
 'text_list': ["When you lock the door using the lock tab on the driver's "
               'door, all of the other doors and tailgate lock at the same '
               'time.',
               'Press the master door lock switch in as shown to lock or '
               'unlock all doors and the tailgate.',
               "When you lock/unlock the driver's door and tailgate using the "
               'master lock switch, all the other doors lock/ unlock at the '
               'same time.'],
 'url': 'http://www.hfitinfo.com/hofi-48.html',
 'could_have_url_duplicate': 0 }
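
To make the fields concrete, here is a minimal sketch that checks `matched_sim` against `similarity_matrix` and builds an interleaved sequence from a record. It assumes that rows of `similarity_matrix` correspond to images and columns to `text_list` entries, which the values above are consistent with. It also assumes, purely for illustration, that each image goes immediately before its matched sentence. The `<image:NAME>` token and the function names are made up, and the sentence texts are shortened to stand-ins.

```python
# Trimmed copy of the sample record; sentence texts replaced by stand-ins.
record = {
    "image_info": [
        {"image_name": "b9040a0dbb22.jpg",
         "matched_sim": 0.27694183588027954,
         "matched_text_index": 2},
        {"image_name": "db1c21bc8474.jpg",
         "matched_sim": 0.3234919607639313,
         "matched_text_index": 1},
    ],
    "similarity_matrix": [
        [0.24363446235656738, 0.31758785247802734, 0.27694183588027954],
        [0.2233106791973114, 0.3234919607639313, 0.26118797063827515],
    ],
    "text_list": ["sentence 0", "sentence 1", "sentence 2"],
}

def check_matches(rec):
    """True if every matched_sim equals its similarity_matrix entry."""
    return all(
        abs(info["matched_sim"]
            - rec["similarity_matrix"][i][info["matched_text_index"]]) < 1e-9
        for i, info in enumerate(rec["image_info"])
    )

def interleave(rec):
    """Return sentences with an image placeholder before each matched sentence."""
    by_sentence = {}
    for info in rec["image_info"]:
        by_sentence.setdefault(info["matched_text_index"], []).append(info["image_name"])
    out = []
    for idx, sentence in enumerate(rec["text_list"]):
        # Hypothetical placeholder token; placement before the sentence is an assumption.
        out.extend(f"<image:{name}>" for name in by_sentence.get(idx, []))
        out.append(sentence)
    return out

print(check_matches(record))
print(interleave(record))
```

In this record the two images land on different sentences (indices 2 and 1), even though sentence 1 has the highest similarity for both rows.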