I would like to perform an object detection task with a VQA model through the AutoTrain API. I followed this guide and prepared the metadata.json accordingly. It has three columns: "file_name", "question", and "multiple_choice_answer".
Sample format from the dataset:
{"file_name": "1.mrxs__12214_50922_512_512.png", "question": "This image is from 3DHistech Scanner. Where is the mitosis location(four properties of the bounding box: top left x coordinate, top left y coordinate, width, height) in this image?", "multiple_choice_answer": [[181, 199, 43, 42]]}
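For reference, here is a minimal sketch of how each record in the metadata file could be sanity-checked before training. The helper name `check_record` is hypothetical and not part of AutoTrain. The check that "multiple_choice_answer" should be a plain string is only an assumption about what a text-generating VQA trainer might expect; it is something worth ruling out, not a confirmed cause of any error.

```python
import json

# The three columns described in the post.
REQUIRED_COLUMNS = ("file_name", "question", "multiple_choice_answer")


def check_record(line: str) -> list[str]:
    """Return a list of problems found in one JSON record (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    for column in REQUIRED_COLUMNS:
        if column not in record:
            problems.append(f"missing column: {column}")

    # Assumption (not confirmed by AutoTrain docs): the answer is tokenized
    # as text, so a nested list such as [[181, 199, 43, 42]] may need to be
    # serialized to a string first.
    answer = record.get("multiple_choice_answer")
    if answer is not None and not isinstance(answer, str):
        problems.append(
            f"multiple_choice_answer is {type(answer).__name__}, not str"
        )
    return problems


if __name__ == "__main__":
    sample = (
        '{"file_name": "1.mrxs__12214_50922_512_512.png", '
        '"question": "Where is the mitosis location in this image?", '
        '"multiple_choice_answer": [[181, 199, 43, 42]]}'
    )
    for problem in check_record(sample):
        print(problem)
```

Running this over every line of the metadata file would at least confirm that all three columns are present in each record and show which type the answer column actually has.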
I tried both the google/paligemma-3b-ft-coco35l-448 and google/paligemma-3b-mix-448 models for this purpose. I start the process with this command:
autotrain --config config.yml
The dataset loads properly, and everything seems fine until training starts.
Here is the error: