facebookresearch / mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
https://mmf.sh/

Failed conversion from .npy to .lmdb #613

Closed aggiejiang closed 1 year ago

aggiejiang commented 4 years ago

❓ Questions and Help

I plan to use a new dataset for visual-bert in mmf. After running extract_features_vmb.py, I got a folder with .npy files and _info.npy files extracted from my image dataset.

However, when I run lmdb_conversion.py, the output shows no error, but the two mdb files it produces (data.mdb and lock.mdb) seem to be empty.

The output:

Variable OMP_NUM_THREADS has been set to 16
  0%|          | 0/17175 [00:00<?, ?it/s]
  0%|          | 2/17175 [00:00<17:54, 15.98it/s]
  0%|          | 5/17175 [00:00<15:44, 18.17it/s]
  0%|          | 7/17175 [00:00<16:03, 17.81it/s]
  0%|          | 11/17175 [00:00<14:10, 20.19it/s]
  0%|          | 13/17175 [00:00<14:22, 19.91it/s]
  0%|          | 16/17175 [00:00<14:33, 19.65it/s]
  0%|          | 19/17175 [00:00<14:39, 19.50it/s]
  0%|          | 22/17175 [00:01<14:10, 20.17it/s]
  0%|          | 24/17175 [00:01<14:31, 19.67it/s]
  0%|          | 26/17175 [00:01<15:07, 18.89it/s]
  0%|          | 28/17175 [00:01<16:29, 17.33it/s]
  0%|          | 31/17175 [00:01<16:01, 17.82it/s]
  0%|          | 34/17175 [00:01<16:16, 17.55it/s]
  0%|          | 36/17175 [00:01<15:55, 17.93it/s]
  0%|          | 38/17175 [00:01<16:07, 17.72it/s]

My command for this file is

python /datasets/features/lmdb_conversion.py --mode convert --lmdb_path /datasets/hateful_memes/defaults/features/ --features_folder /data/home/feature

I am not sure whether I set something wrong in the code or on the command line. Could you please suggest what to check? Thanks a lot.

vedanuj commented 4 years ago

Did the conversion complete? From the result it seems only 36 files were converted.

aggiejiang commented 4 years ago

@vedanuj Not yet. It stopped after that, and I have no idea what happened. The output above, ending after about 36 files, is everything I got from running the script.

vedanuj commented 4 years ago

Did you try it again, and did it fail at the same point? If so, maybe there is an issue with a particular file. You can try adding a try/except block to check which file is causing the issue.
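One way to act on this suggestion without editing lmdb_conversion.py is to scan the features folder separately and record every file that fails to load. The following is only a minimal sketch: the function name `find_unreadable_features` is hypothetical, and it assumes the extracted features are ordinary .npy files in a single flat folder, with the _info.npy files possibly holding pickled objects.

```python
from pathlib import Path

import numpy as np


def find_unreadable_features(features_folder):
    """Return (path, error) pairs for every .npy file that fails to load.

    Hypothetical helper, not part of MMF.
    """
    failures = []
    for path in sorted(Path(features_folder).glob("*.npy")):
        try:
            # allow_pickle=True is an assumption: the *_info.npy files may
            # contain pickled Python objects rather than plain arrays.
            np.load(path, allow_pickle=True)
        except Exception as exc:
            # Keep going so that every bad file is reported, not just the first.
            failures.append((path, repr(exc)))
    return failures
```

Calling `find_unreadable_features("/data/home/feature")` would return an empty list if every file loads. Note that a try/except block only catches Python exceptions; it cannot report anything if the operating system kills the process, for example when it runs out of memory.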

aggiejiang commented 4 years ago

Okay, thanks. I will try that as suggested and see what happens. I will keep you updated.

aggiejiang commented 4 years ago

Is it expected that the script prints nothing at all (no error and no progress bar)? I allocated more memory to the job and got this result, but I am not sure whether the whole conversion finished. So I tried using the output in an MMF model as follows:

mmf_run config=projects/hateful_memes/configs/visual_bert/from_coco.yaml \
    model=visual_bert \
    dataset=hateful_memes \
    run_type=train_val  

I simply pointed visual_bert at the new data files and changed nothing else.

I got this error: KeyError: b'63859'

Does this mean some images are missing?

vedanuj commented 4 years ago

Yes, it seems those files are missing from the LMDB, which means the conversion did not complete properly. You can also use the .npy files directly: just point to the folder that contains them in your config. LMDB is not a requirement.
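To confirm which entries are missing, the IDs implied by the feature files can be compared with the keys actually stored in the LMDB. This is a rough sketch under stated assumptions: the helper names are hypothetical, it requires the third-party `lmdb` package, and it assumes each key is the feature file name without its extension, encoded as bytes (which is what the KeyError above, b'63859', suggests). The database may also contain bookkeeping entries that do not correspond to any image.

```python
from pathlib import Path


def expected_ids(features_folder):
    """IDs implied by the feature files, e.g. '63859.npy' -> b'63859'.

    The *_info.npy companions are skipped (assumed naming convention).
    """
    return {
        p.stem.encode()
        for p in Path(features_folder).glob("*.npy")
        if not p.stem.endswith("_info")
    }


def lmdb_keys(lmdb_path):
    """Return the set of all keys stored in the LMDB at lmdb_path."""
    import lmdb  # third-party package: pip install lmdb

    env = lmdb.open(str(lmdb_path), readonly=True, lock=False)
    try:
        with env.begin() as txn:
            # Iterate over keys only, so feature values are never loaded.
            return set(txn.cursor().iternext(keys=True, values=False))
    finally:
        env.close()


def missing_ids(features_folder, stored_keys):
    """Sorted IDs that have a feature file but no matching stored key."""
    return sorted(expected_ids(features_folder) - set(stored_keys))
```

For example, `missing_ids(features_folder, lmdb_keys(lmdb_path))` would list every ID that training could fail on. An empty list would mean that, under the key assumption above, every feature file has an entry.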

aggiejiang commented 4 years ago

Alright. I will try that. Thanks for your help!