Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License

Data issues #172

Open zcczhang opened 1 year ago

zcczhang commented 1 year ago

Hi, thanks for the amazing work and for releasing MIMIC-IT! It seems there are a few issues:

For other datasets, it would be great to release the processed x.json file (I noticed the egg version would be coming soon) as some datasets are too old to acquire/process and some video datasets are large. Thank you!

Luodian commented 1 year ago

Thanks for bringing up these issues.

This seems related to the convert-it process, right? The current convert-it cannot generate image_ids that match the ids in `xx_instructions.json` and `xx_train.json`.

We first converted our xx.json for internal use, and then went back and wrote "convert-it" to help users obtain xx.json from public datasets. However, there appear to be issues with the IDs in this conversion process. We are currently investigating and appreciate your patience while we address the problem.
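
One way to spot the kind of ID mismatch described above is to compare the ids referenced by an instruction file against the keys of the image file. The following is a minimal sketch, not the project's own tooling. It assumes the instruction JSON keeps its records under a top-level `"data"` key, each with an `"image_ids"` list, and that the image JSON maps image id to encoded image; adjust the keys if the actual files differ. The function names and the toy ids are hypothetical.

```python
import json
from pathlib import Path


def find_missing_image_ids(instructions: dict, images: dict) -> dict:
    """Return {instruction_id: [image ids not present in `images`]}.

    Assumed layout (an assumption, not confirmed by this thread):
      instructions = {"data": {ins_id: {"image_ids": [...], ...}, ...}}
      images       = {image_id: <encoded image>, ...}
    """
    missing = {}
    for ins_id, record in instructions.get("data", {}).items():
        absent = [i for i in record.get("image_ids", []) if i not in images]
        if absent:
            missing[ins_id] = absent
    return missing


def check_files(instructions_path: str, images_path: str) -> dict:
    """Load both JSON files from disk and report unmatched image ids."""
    instructions = json.loads(Path(instructions_path).read_text(encoding="utf-8"))
    images = json.loads(Path(images_path).read_text(encoding="utf-8"))
    return find_missing_image_ids(instructions, images)


if __name__ == "__main__":
    # Toy data with made-up ids, only to show the shape of the check.
    toy_instructions = {
        "data": {
            "ins_1": {"image_ids": ["img_1", "img_2"]},
            "ins_2": {"image_ids": ["img_1"]},
        }
    }
    toy_images = {"img_1": "<base64 placeholder>"}
    print(find_missing_image_ids(toy_instructions, toy_images))
```

Running `check_files("xx_instructions.json", "xx.json")` on a converted dataset would then list every instruction whose images cannot be found, which makes it easier to report exactly which ids are affected.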

updates:

  1. The meta link of LLaVA-In-Context is being updated: meta

zcczhang commented 1 year ago

Saw the PR above. Just wondering whether the COCO general difference train and instruction JSON files are available. Thanks!

zcczhang commented 1 year ago

Hi @Luodian, just wondering when the SD (COCO general difference version) instructions and train configs will be ready in the OneDrive folder?

Luodian commented 1 year ago

> Hi @Luodian, just wondering when the SD (COCO general difference version) instructions and train configs will be ready in the OneDrive folder?

Hi, sorry, I didn't see the message last week. The files are already on our side. We are waiting for @king159 to do a final check and then expect to release them today.

zcczhang commented 1 year ago

Thanks for the quick response!

Luodian commented 1 year ago

@pufanyi @king159

zcczhang commented 1 year ago

Please let me know when it's ready (and maybe also the E4D egg) for my download!

Luodian commented 1 year ago

> Please let me know when it's ready (and maybe also the E4D egg) for my download!

Hi, the COCO Difference instructions/train JSON files have been uploaded, and the raw image JSON is uploading now~

zcczhang commented 1 year ago

That sounds great, thanks! I think I already have the image JSON file processed. Btw, will the egg for E4D be available, or is it too large to upload? (Another minor btw: I'm not super familiar with OneDrive, but is there a better way to download directly from the link to a headless server?)