mczhuge / Kaleido-BERT

💐Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
MIT License
263 stars 19 forks

How to generate input_schema format data? #14

Closed: Nidhi-kumari closed this issue 2 years ago

Nidhi-kumari commented 2 years ago

Hi, I find your work very interesting, and it aligns with my project requirements. I want to fine-tune it on a custom dataset where I have raw images and text with labels. The task is similar to "Category/SubCategory Recognition". How do I get the data into the input_schema format? Please share the code if you have any.

mczhuge commented 2 years ago

Sorry for the late reply. The preprocessing was run on Alibaba ODPS (a private SQL tool), so that code would not be useful for others' implementations.

But you can refer to #3. Actually, writing multiprocess Python code is also fine. All the best!
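For the multiprocess Python route, a minimal sketch might look like the following. It is not the authors' ODPS pipeline, and the record layout is purely an assumption: the actual input_schema fields are not given in this thread, so `encode_row`, `preprocess`, and the three-column output (base64 image bytes, text, label) are hypothetical placeholders to be replaced with whatever #3 and the repository config specify.

```python
import base64
import csv
from multiprocessing import Pool


def encode_row(row):
    """Turn one (image_path, text, label) tuple into a flat record.

    Hypothetical layout: the real input_schema columns are not stated
    in this thread, so adapt the returned fields to match them.
    """
    image_path, text, label = row
    with open(image_path, "rb") as f:
        # Store the raw image bytes as a base64 string (an assumption).
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return [image_b64, text.strip(), str(label)]


def preprocess(rows, out_path, workers=4):
    """Encode rows in parallel and write them as tab-separated lines."""
    with Pool(processes=workers) as pool, \
            open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t")
        # imap preserves input order while spreading work across processes.
        for record in pool.imap(encode_row, rows, chunksize=64):
            writer.writerow(record)
```

To use it, build a list of `(image_path, text, label)` tuples from your dataset and call `preprocess(rows, "train.tsv")` from under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn worker processes.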