brando90 / pytorch-meta-dataset

A non-official, 100% PyTorch implementation of the META-DATASET benchmark for few-shot classification

mscoco missing img #20

Open brando90 opened 1 year ago

brando90 commented 1 year ago
(mds_env_gpu) brando9~/data/mds/mscoco $ python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
>   --dataset=mscoco \
>   --mscoco_data_root=$MDS_DATA_PATH/mscoco \
>   --splits_root=$SPLITS \
>   --records_root=$RECORDS
2023-01-05 16:41:14.949530: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64:/usr/local/cuda-11.7/lib64:
2023-01-05 16:41:14.949583: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Traceback (most recent call last):
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 157, in <module>
    tf.app.run(main)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 147, in main
    converter = converter_class(
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 1357, in __init__
    raise ValueError('Annotation file %s does not exist' % annotation_path)
ValueError: Annotation file /lfs/ampere4/0/brando9/data/mds/mscoco/instances_train2017.json does not exist

related: https://github.com/google-research/meta-dataset/issues/106
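A quick way to check where the annotations actually ended up (a minimal sketch, assuming the same $MDS_DATA_PATH layout as above):

# the converter looks for this exact file (from the ValueError above)
ls $MDS_DATA_PATH/mscoco/instances_train2017.json
# if that's missing, check whether unzip left the JSONs in an annotations/ subfolder instead
ls $MDS_DATA_PATH/mscoco/annotations/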

brando90 commented 1 year ago

@patricks-lab did this ever happen to you?

https://github.com/google-research/meta-dataset/issues/106

patricks-lab commented 1 year ago

Hmmm... I guess you just need to make sure that your folder structure looks something like this (according to the GitHub instructions from the original mds repo):

$MDS_DATA_PATH/mscoco contents:
|--- train2017/ folder (this is from doing "gsutil -m rsync gs://images.cocodataset.org/train2017 train2017" or from downloading http://images.cocodataset.org/zips/train2017.zip)
|--- captions_train2017.json, captions_val2017.json, instances_train2017.json, instances_val2017.json, person_keypoints_train2017.json, person_keypoints_val2017.json
(the error above complains that you don't have instances_train2017.json)

My guess is that after you ran "unzip annotations_trainval2017.zip" there was actually an extra step, which I missed, of moving all the annotations up to the parent directory.

Try something like cd $MDS_DATA_PATH/mscoco/annotations and then mv * ../.
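Roughly, something like this (untested; this assumes annotations_trainval2017.zip was extracted into $MDS_DATA_PATH/mscoco/ so that an annotations/ subfolder exists):

cd $MDS_DATA_PATH/mscoco/annotations
mv * ../
# sanity check: the six annotation JSONs (including instances_train2017.json) should now sit next to train2017/
ls $MDS_DATA_PATH/mscoco/ | grep -c .json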

brando90 commented 1 year ago

@patricks-lab what is your output of ls $RECORDS/mscoco/ | grep -c .tfrecords?

brando90 commented 1 year ago

My guess is that after you ran "unzip annotations_trainval2017.zip" there was actually an extra step, which I missed, of moving all the annotations up to the parent directory.

Try something like cd $MDS_DATA_PATH/mscoco/annotations and then mv * ../.

This is too vague. Can you give all the instructions you'd run from the beginning to extract mscoco? Look at the ones I have and correct them, or add commands wherever you feel it's necessary:

#-- mscoco
ssh brando9@ampere4.stanford.edu
tmux new -s mscoco
tmux new -s mscoco2
reauth
source $AFS/.bashrc.lfs
conda activate mds_env_gpu

# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
#You can use gsutil to download them to mscoco/:
#cd $DATASRC/mscoco/
#mkdir -p train2017
#gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
#gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip
#unzip annotations_trainval2017.zip

# Otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/.
mkdir -p $MDS_DATA_PATH/mscoco
wget http://images.cocodataset.org/zips/train2017.zip -O $MDS_DATA_PATH/mscoco/train2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip
# both zips should be there, note: downloading zip takes some time
ls $MDS_DATA_PATH/mscoco/

# extract, takes some time, but good progress display
unzip $MDS_DATA_PATH/mscoco/train2017.zip -d $MDS_DATA_PATH/mscoco
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco

# two folders should be there, annotations and train2017 stuff
ls $MDS_DATA_PATH/mscoco/

# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS

# 3. Expect the conversion to take about 4 hours.

# 4. Find the following outputs in $RECORDS/mscoco/:
#80 tfrecords files named [0-79].tfrecords
ls $RECORDS/mscoco/ | grep -c .tfrecords
#dataset_spec.json (see note 1)
ls $RECORDS/mscoco/dataset_spec.json
brando90 commented 1 year ago

Try something like cd $MDS_DATA_PATH/mscoco/annotations and then mv * ../.

also the instructions say:

Otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/.

so I'm confused why you're suggesting moving the annotations but not the train2017 folder contents, @patricks-lab.

brando90 commented 1 year ago

is this what I should be doing?

mv $MDS_DATA_PATH/mscoco/train2017/* $MDS_DATA_PATH/mscoco/
mv $MDS_DATA_PATH/mscoco/annotations/* $MDS_DATA_PATH/mscoco/

full new script for mscoco


#-- mscoco
ssh brando9@ampere4.stanford.edu
tmux new -s mscoco
tmux new -s mscoco2
reauth
source $AFS/.bashrc.lfs
conda activate mds_env_gpu

# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
#You can use gsutil to download them to mscoco/:
#cd $DATASRC/mscoco/
#mkdir -p train2017
#gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
#gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip
#unzip annotations_trainval2017.zip

# Otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/.
mkdir -p $MDS_DATA_PATH/mscoco
wget http://images.cocodataset.org/zips/train2017.zip -O $MDS_DATA_PATH/mscoco/train2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip
# both zips should be there, note: downloading zip takes some time
ls $MDS_DATA_PATH/mscoco/

# extract them into mscoco/ (interpreting that as extracting both there, matching what the gsutil commands above do)
# takes some time, but good progress display
unzip $MDS_DATA_PATH/mscoco/train2017.zip -d $MDS_DATA_PATH/mscoco
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco
# check jpg imgs are there
ls $MDS_DATA_PATH/mscoco/train2017
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
ls $MDS_DATA_PATH/mscoco/annotations
ls $MDS_DATA_PATH/mscoco/annotations | grep -c .json
# move them since it says so in the natural language instructions
mv $MDS_DATA_PATH/mscoco/train2017/* $MDS_DATA_PATH/mscoco/
mv $MDS_DATA_PATH/mscoco/annotations/* $MDS_DATA_PATH/mscoco/
# two folders should be there, annotations and train2017 stuff
ls $MDS_DATA_PATH/mscoco/

# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS

# 3. Expect the conversion to take about 4 hours.

# 4. Find the following outputs in $RECORDS/mscoco/:
#80 tfrecords files named [0-79].tfrecords
ls $RECORDS/mscoco/ | grep -c .tfrecords
#dataset_spec.json (see note 1)
ls $RECORDS/mscoco/dataset_spec.json
brando90 commented 1 year ago

new script:

#-- mscoco
ssh brando9@ampere4.stanford.edu
tmux new -s mscoco
tmux new -s mscoco2
reauth
source $AFS/.bashrc.lfs
conda activate mds_env_gpu

# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
#You can use gsutil to download them to mscoco/:
#cd $DATASRC/mscoco/
#mkdir -p train2017
#gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
#gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip
#unzip annotations_trainval2017.zip

# Otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/.
mkdir -p $MDS_DATA_PATH/mscoco
wget http://images.cocodataset.org/zips/train2017.zip -O $MDS_DATA_PATH/mscoco/train2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip
# both zips should be there, note: downloading zip takes some time
ls $MDS_DATA_PATH/mscoco/

# extract them into mscoco/ (interpreting that as extracting both there, matching what the gsutil commands above do)
# takes some time, but good progress display
unzip $MDS_DATA_PATH/mscoco/train2017.zip -d $MDS_DATA_PATH/mscoco
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco
# two folders should be there, annotations and train2017 stuff
ls $MDS_DATA_PATH/mscoco/
# check jpg imgs are there
ls $MDS_DATA_PATH/mscoco/train2017
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
ls $MDS_DATA_PATH/mscoco/annotations
ls $MDS_DATA_PATH/mscoco/annotations | grep -c .json
# move them since it says so in the natural language instructions ref: https://stackoverflow.com/a/75034830/1601580
find $MDS_DATA_PATH/mscoco/train2017 -type f -print0 | xargs -0 mv -t $MDS_DATA_PATH/mscoco
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
ls $MDS_DATA_PATH/mscoco | grep -c .jpg
mv $MDS_DATA_PATH/mscoco/annotations/* $MDS_DATA_PATH/mscoco/
ls $MDS_DATA_PATH/mscoco/ | grep -c .json

# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS

# 3. Expect the conversion to take about 4 hours.

# 4. Find the following outputs in $RECORDS/mscoco/:
#80 tfrecords files named [0-79].tfrecords
ls $RECORDS/mscoco/ | grep -c .tfrecords
#dataset_spec.json (see note 1)
ls $RECORDS/mscoco/dataset_spec.json
brando90 commented 1 year ago

New error:

(mds_env_gpu) brando9~/data/mds/mscoco $ python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
>   --dataset=mscoco \
>   --mscoco_data_root=$MDS_DATA_PATH/mscoco \
>   --splits_root=$SPLITS \
>   --records_root=$RECORDS

2023-01-06 10:30:10.139278: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64:/usr/local/cuda-11.7/lib64:
2023-01-06 10:30:10.139308: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

I0106 10:30:24.987533 140220462260864 convert_datasets_to_records.py:151] Creating MSCOCO specification and records in directory /lfs/ampere4/0/brando9/data/mds/records/mscoco...
I0106 10:30:24.987753 140220462260864 dataset_to_records.py:649] Attempting to read splits from /lfs/ampere4/0/brando9/data/mds/splits/mscoco_splits.json...
I0106 10:30:24.988260 140220462260864 dataset_to_records.py:661] Unsuccessful.
I0106 10:30:24.988714 140220462260864 dataset_to_records.py:233] Created splits with 0 train, 40 validation and 40 test classes.
I0106 10:30:24.988792 140220462260864 dataset_to_records.py:701] Saving new splits for dataset mscoco at /lfs/ampere4/0/brando9/data/mds/splits/mscoco_splits.json...
I0106 10:30:24.989097 140220462260864 dataset_to_records.py:705] Done.
Traceback (most recent call last):
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 157, in <module>
    tf.app.run(main)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 153, in main
    converter.convert_dataset()
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 610, in convert_dataset
    self.create_dataset_specification_and_records()
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 1463, in create_dataset_specification_and_records
    image_crop, class_id = get_image_crop_and_class_id(annotation)
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 1425, in get_image_crop_and_class_id
    image = Image.open(f)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/PIL/Image.py", line 2957, in open
    fp.seek(0)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/util/deprecation.py", line 548, in new_func
    return func(*args, **kwargs)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 137, in seek
    self._preread_check()
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 76, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.NotFoundError: /lfs/ampere4/0/brando9/data/mds/mscoco/train2017/000000558840.jpg; No such file or directory

It seems I will have to re-download everything, right?
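Before re-downloading everything, it might be worth checking whether that particular image still exists somewhere under mscoco/, e.g. whether it simply got moved out of train2017/ by the move step above (a quick check, untested):

# the exact path the converter complained about
ls $MDS_DATA_PATH/mscoco/train2017/000000558840.jpg
# maybe it ended up at the top level after the move
ls $MDS_DATA_PATH/mscoco/000000558840.jpg
# or search the whole tree for it
find $MDS_DATA_PATH/mscoco -name 000000558840.jpg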

brando90 commented 1 year ago

@patricks-lab I re-ran all my steps again from scratch and still get the same error. Below is the script I ran; all the sanity checks in between passed, and it still failed. I think I need you to write your own mscoco script from scratch, start to finish, and test it on the DGX machine, since I can't get this one to work. You can start from my script:

# Download: otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/. ETA ~36m.
mkdir -p $MDS_DATA_PATH/mscoco
wget http://images.cocodataset.org/zips/train2017.zip -O $MDS_DATA_PATH/mscoco/train2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip
# both zips should be there, note: downloading zip takes some time
ls $MDS_DATA_PATH/mscoco/
# Extract them into mscoco/ (interpreting that as extracting both there, matching what the gsutil commands above do)
# takes some time, but good progress display
unzip $MDS_DATA_PATH/mscoco/train2017.zip -d $MDS_DATA_PATH/mscoco
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco
# two folders should be there, annotations and train2017 stuff
ls $MDS_DATA_PATH/mscoco/
# check jpg imgs are there
ls $MDS_DATA_PATH/mscoco/train2017
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
# says: 118287 for a 2nd time
ls $MDS_DATA_PATH/mscoco/annotations
ls $MDS_DATA_PATH/mscoco/annotations | grep -c .json
# says: 6 for a 2nd time
# move them, since the Google natural-language instructions say so; ref for moving a large number of files: https://stackoverflow.com/a/75034830/1601580 (thanks ChatGPT!)
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
find $MDS_DATA_PATH/mscoco/train2017 -type f -print0 | xargs -0 mv -t $MDS_DATA_PATH/mscoco
ls $MDS_DATA_PATH/mscoco | grep -c .jpg
# says: 118287 for both
ls $MDS_DATA_PATH/mscoco/annotations/ | grep -c .json
mv $MDS_DATA_PATH/mscoco/annotations/* $MDS_DATA_PATH/mscoco/
ls $MDS_DATA_PATH/mscoco/ | grep -c .json
# says: 6 for both

# NEXT FAILS with missing image
# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS
brando90 commented 1 year ago

OK, I tried the gsutil approach and it fails: https://github.com/google-research/meta-dataset/issues/108. I think I really do need you to try your own version, @patricks-lab. Code I ran:

# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
#You can use gsutil to download them to mscoco/:
mkdir -p $MDS_DATA_PATH/mscoco/
cd $MDS_DATA_PATH/mscoco/
mkdir -p train2017
# seems to directly download all files, no zip file needed
gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
# todo: should have 118287 .jpg files? (note: no unzipping needed)
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
# download & extract annotations_trainval2017.zip
gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip $MDS_DATA_PATH/mscoco/  # note: a destination is needed here; the instructions quoted above omit it
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco
# todo says: 6?
ls $MDS_DATA_PATH/mscoco/annotations | grep -c .json

error:

(mds_env_gpu) brando9~/data/mds/mscoco $ cd ..
(mds_env_gpu) brando9~/data/mds $ rm -rf mscoco/
(mds_env_gpu) brando9~/data/mds $ mkdir -p $MDS_DATA_PATH/mscoco/
(mds_env_gpu) brando9~/data/mds $ cd $MDS_DATA_PATH/mscoco/
(mds_env_gpu) brando9~/data/mds/mscoco $ mkdir -p train2017
(mds_env_gpu) brando9~/data/mds/mscoco $ gsutil -m rsync gs://images.cocodataset.org/train2017 train2017

BucketNotFoundException: 404 gs://images.cocodataset.org bucket does not exist.
brando90 commented 1 year ago

I'd prefer to get the official instructions from Google to work, but in the very worst case you can do the following:

  1. First verify that your mscoco tfrecords data looks good: ls $RECORDS/mscoco/ | grep -c .tfrecords should display 80 tfrecords.
  2. Download your mscoco tfrecords data from the vision cluster to your local computer once you've verified the 80 tfrecords (or skip this step and go straight to 3, i.e. upload to Zenodo directly from the vision cluster somehow).
  3. Follow the Zenodo instructions to upload your data: https://zenodo.org/deposit?page=1&size=20. Make sure you're allowed to upload that amount of data; if I remember correctly the limit is 50GB. If not, we might have to find a different way to get it to me.

@patricks-lab
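For step 2, something along these lines might work (hostname and remote path below are placeholders; untested):

# on the cluster: count the shards first, it should print 80
ls $RECORDS/mscoco/ | grep -c .tfrecords
# from the local machine: pull the records down
rsync -avP you@vision-cluster:/path/to/records/mscoco/ ./mscoco_records/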

brando90 commented 1 year ago

I'd prefer to get the official instructions from Google to work, but in the very worst case you can do the following:

  1. First verify that your mscoco tfrecords data looks good: ls $RECORDS/mscoco/ | grep -c .tfrecords should display 80 tfrecords.
  2. Download your mscoco tfrecords data from the vision cluster to your local computer once you've verified the 80 tfrecords (or skip this step and go straight to 3, i.e. upload to Zenodo directly from the vision cluster somehow).
  3. Follow the Zenodo instructions to upload your data: https://zenodo.org/deposit?page=1&size=20. Make sure you're allowed to upload that amount of data; if I remember correctly the limit is 50GB. If not, we might have to find a different way to get it to me.

@patricks-lab

Perhaps this is better than uploading to Zenodo? https://github.com/google-research/meta-dataset/blob/main/meta_dataset/data/tfds/README.md