brando90 / pytorch-meta-dataset

A non-official, 100% PyTorch implementation of the META-DATASET benchmark for few-shot classification

mscoco missing img #20

Open brando90 opened 1 year ago

brando90 commented 1 year ago
(mds_env_gpu) brando9~/data/mds/mscoco $ python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
>   --dataset=mscoco \
>   --mscoco_data_root=$MDS_DATA_PATH/mscoco \
>   --splits_root=$SPLITS \
>   --records_root=$RECORDS
2023-01-05 16:41:14.949530: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64:/usr/local/cuda-11.7/lib64:
2023-01-05 16:41:14.949583: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
Traceback (most recent call last):
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 157, in <module>
    tf.app.run(main)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 147, in main
    converter = converter_class(
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 1357, in __init__
    raise ValueError('Annotation file %s does not exist' % annotation_path)
ValueError: Annotation file /lfs/ampere4/0/brando9/data/mds/mscoco/instances_train2017.json does not exist

related: https://github.com/google-research/meta-dataset/issues/106
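A quick way to check where the annotations actually ended up (a minimal sketch, assuming the same $MDS_DATA_PATH layout as above):

# the converter looks for this exact file (from the ValueError above)
ls $MDS_DATA_PATH/mscoco/instances_train2017.json
# if that's missing, check whether unzip left the JSONs in an annotations/ subfolder instead
ls $MDS_DATA_PATH/mscoco/annotations/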

brando90 commented 1 year ago

@patricks-lab did this ever happen to you?

https://github.com/google-research/meta-dataset/issues/106

patricks-lab commented 1 year ago

Hmmm... I guess you just need to make sure that your folder structure looks something like this (according to the GitHub instructions from the original mds repo):

$MDS_DATA_PATH/mscoco contents:
|--- train2017/ folder (this is from doing "gsutil -m rsync gs://images.cocodataset.org/train2017 train2017" or from downloading http://images.cocodataset.org/zips/train2017.zip)
|--- captions_train2017.json, captions_val2017.json, instances_train2017.json, instances_val2017.json, person_keypoints_train2017.json, person_keypoints_val2017.json
(the error above complains that you don't have instances_train2017.json)

My guess is that after you ran "unzip annotations_trainval2017.zip" there was actually an extra step, which I missed, of moving all the annotations up to the parent directory.

Try something like cd $MDS_DATA_PATH/mscoco/annotations and then mv * ../.
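Roughly, something like this (untested; this assumes annotations_trainval2017.zip was extracted into $MDS_DATA_PATH/mscoco/ so that an annotations/ subfolder exists):

cd $MDS_DATA_PATH/mscoco/annotations
mv * ../
# sanity check: the six annotation JSONs (including instances_train2017.json) should now sit next to train2017/
ls $MDS_DATA_PATH/mscoco/ | grep -c .json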

brando90 commented 1 year ago

@patricks-lab what is your output of ls $RECORDS/mscoco/ | grep -c .tfrecords?

brando90 commented 1 year ago

My guess is that after you ran "unzip annotations_trainval2017.zip" there was actually an extra step, which I missed, of moving all the annotations up to the parent directory.

Try something like cd $MDS_DATA_PATH/mscoco/annotations and then mv * ../.

This is too vague. Can you give all the instructions you'd run from the beginning to extract mscoco? Look at the ones I have and correct them, or add commands wherever you feel it's necessary:

#-- mscoco
ssh brando9@ampere4.stanford.edu
tmux new -s mscoco
tmux new -s mscoco2
reauth
source $AFS/.bashrc.lfs
conda activate mds_env_gpu

# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
#You can use gsutil to download them to mscoco/:
#cd $DATASRC/mscoco/
#mkdir -p train2017
#gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
#gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip
#unzip annotations_trainval2017.zip

# Otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/.
mkdir -p $MDS_DATA_PATH/mscoco
wget http://images.cocodataset.org/zips/train2017.zip -O $MDS_DATA_PATH/mscoco/train2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip
# both zips should be there, note: downloading zip takes some time
ls $MDS_DATA_PATH/mscoco/

# extract, takes some time, but good progress display
unzip $MDS_DATA_PATH/mscoco/train2017.zip -d $MDS_DATA_PATH/mscoco
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco

# two folders should be there, annotations and train2017 stuff
ls $MDS_DATA_PATH/mscoco/

# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS

# 3. Expect the conversion to take about 4 hours.

# 4. Find the following outputs in $RECORDS/mscoco/:
#80 tfrecords files named [0-79].tfrecords
ls $RECORDS/mscoco/ | grep -c .tfrecords
#dataset_spec.json (see note 1)
ls $RECORDS/mscoco/dataset_spec.json
brando90 commented 1 year ago

Try something like cd $MDS_DATA_PATH/mscoco/annotations and then mv * ../.

also the instructions say:

Otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/.

so I'm confused why you're suggesting moving the annotations but not the train2017 folder contents, @patricks-lab.

brando90 commented 1 year ago

is this what I should be doing?

mv $MDS_DATA_PATH/mscoco/train2017/* $MDS_DATA_PATH/mscoco/
mv $MDS_DATA_PATH/mscoco/annotations/* $MDS_DATA_PATH/mscoco/

full new script for mscoco


#-- mscoco
ssh brando9@ampere4.stanford.edu
tmux new -s mscoco
tmux new -s mscoco2
reauth
source $AFS/.bashrc.lfs
conda activate mds_env_gpu

# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
#You can use gsutil to download them to mscoco/:
#cd $DATASRC/mscoco/
#mkdir -p train2017
#gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
#gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip
#unzip annotations_trainval2017.zip

# Otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/.
mkdir -p $MDS_DATA_PATH/mscoco
wget http://images.cocodataset.org/zips/train2017.zip -O $MDS_DATA_PATH/mscoco/train2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip
# both zips should be there, note: downloading zip takes some time
ls $MDS_DATA_PATH/mscoco/

# extract them into mscoco/ (interpreting that as extracting both there, matching what the gsutil commands above do)
# takes some time, but good progress display
unzip $MDS_DATA_PATH/mscoco/train2017.zip -d $MDS_DATA_PATH/mscoco
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco
# check jpg imgs are there
ls $MDS_DATA_PATH/mscoco/train2017
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
ls $MDS_DATA_PATH/mscoco/annotations
ls $MDS_DATA_PATH/mscoco/annotations | grep -c .json
# move them since it says so in the natural language instructions
mv $MDS_DATA_PATH/mscoco/train2017/* $MDS_DATA_PATH/mscoco/
mv $MDS_DATA_PATH/mscoco/annotations/* $MDS_DATA_PATH/mscoco/
# two folders should be there, annotations and train2017 stuff
ls $MDS_DATA_PATH/mscoco/

# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS

# 3. Expect the conversion to take about 4 hours.

# 4. Find the following outputs in $RECORDS/mscoco/:
#80 tfrecords files named [0-79].tfrecords
ls $RECORDS/mscoco/ | grep -c .tfrecords
#dataset_spec.json (see note 1)
ls $RECORDS/mscoco/dataset_spec.json
brando90 commented 1 year ago

new script:

#-- mscoco
ssh brando9@ampere4.stanford.edu
tmux new -s mscoco
tmux new -s mscoco2
reauth
source $AFS/.bashrc.lfs
conda activate mds_env_gpu

# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
#You can use gsutil to download them to mscoco/:
#cd $DATASRC/mscoco/
#mkdir -p train2017
#gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
#gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip
#unzip annotations_trainval2017.zip

# Otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/.
mkdir -p $MDS_DATA_PATH/mscoco
wget http://images.cocodataset.org/zips/train2017.zip -O $MDS_DATA_PATH/mscoco/train2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip
# both zips should be there, note: downloading zip takes some time
ls $MDS_DATA_PATH/mscoco/

# extract them into mscoco/ (interpreting that as extracting both there, matching what the gsutil commands above do)
# takes some time, but good progress display
unzip $MDS_DATA_PATH/mscoco/train2017.zip -d $MDS_DATA_PATH/mscoco
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco
# two folders should be there, annotations and train2017 stuff
ls $MDS_DATA_PATH/mscoco/
# check jpg imgs are there
ls $MDS_DATA_PATH/mscoco/train2017
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
ls $MDS_DATA_PATH/mscoco/annotations
ls $MDS_DATA_PATH/mscoco/annotations | grep -c .json
# move them since it says so in the natural language instructions ref: https://stackoverflow.com/a/75034830/1601580
find $MDS_DATA_PATH/mscoco/train2017 -type f -print0 | xargs -0 mv -t $MDS_DATA_PATH/mscoco
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
ls $MDS_DATA_PATH/mscoco | grep -c .jpg
mv $MDS_DATA_PATH/mscoco/annotations/* $MDS_DATA_PATH/mscoco/
ls $MDS_DATA_PATH/mscoco/ | grep -c .json

# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS

# 3. Expect the conversion to take about 4 hours.

# 4. Find the following outputs in $RECORDS/mscoco/:
#80 tfrecords files named [0-79].tfrecords
ls $RECORDS/mscoco/ | grep -c .tfrecords
#dataset_spec.json (see note 1)
ls $RECORDS/mscoco/dataset_spec.json
brando90 commented 1 year ago

New error:

(mds_env_gpu) brando9~/data/mds/mscoco $ python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
>   --dataset=mscoco \
>   --mscoco_data_root=$MDS_DATA_PATH/mscoco \
>   --splits_root=$SPLITS \
>   --records_root=$RECORDS

2023-01-06 10:30:10.139278: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64:/usr/local/cuda-11.7/lib64:
2023-01-06 10:30:10.139308: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...

I0106 10:30:24.987533 140220462260864 convert_datasets_to_records.py:151] Creating MSCOCO specification and records in directory /lfs/ampere4/0/brando9/data/mds/records/mscoco...
I0106 10:30:24.987753 140220462260864 dataset_to_records.py:649] Attempting to read splits from /lfs/ampere4/0/brando9/data/mds/splits/mscoco_splits.json...
I0106 10:30:24.988260 140220462260864 dataset_to_records.py:661] Unsuccessful.
I0106 10:30:24.988714 140220462260864 dataset_to_records.py:233] Created splits with 0 train, 40 validation and 40 test classes.
I0106 10:30:24.988792 140220462260864 dataset_to_records.py:701] Saving new splits for dataset mscoco at /lfs/ampere4/0/brando9/data/mds/splits/mscoco_splits.json...
I0106 10:30:24.989097 140220462260864 dataset_to_records.py:705] Done.
Traceback (most recent call last):
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 157, in <module>
    tf.app.run(main)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/platform/app.py", line 36, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 153, in main
    converter.convert_dataset()
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 610, in convert_dataset
    self.create_dataset_specification_and_records()
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 1463, in create_dataset_specification_and_records
    image_crop, class_id = get_image_crop_and_class_id(annotation)
  File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 1425, in get_image_crop_and_class_id
    image = Image.open(f)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/PIL/Image.py", line 2957, in open
    fp.seek(0)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/util/deprecation.py", line 548, in new_func
    return func(*args, **kwargs)
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 137, in seek
    self._preread_check()
  File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/lib/io/file_io.py", line 76, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
tensorflow.python.framework.errors_impl.NotFoundError: /lfs/ampere4/0/brando9/data/mds/mscoco/train2017/000000558840.jpg; No such file or directory

It seems I will have to re-download everything, right?
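Before re-downloading everything, it might be worth checking whether that particular image still exists somewhere under mscoco/, e.g. whether it simply got moved out of train2017/ by the move step above (a quick check, untested):

# the exact path the converter complained about
ls $MDS_DATA_PATH/mscoco/train2017/000000558840.jpg
# maybe it ended up at the top level after the move
ls $MDS_DATA_PATH/mscoco/000000558840.jpg
# or search the whole tree for it
find $MDS_DATA_PATH/mscoco -name 000000558840.jpg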

brando90 commented 1 year ago

@patricks-lab I re-ran all my steps again from scratch and still get the same error. Below is the script I ran; all the sanity checks in between passed, and it still failed. I think I need you to write your own mscoco script from scratch, start to finish, and test it on the DGX machine, since I can't get this one to work. You can start from my script:

# Download: otherwise, you can download train2017.zip and annotations_trainval2017.zip and extract them into mscoco/. ETA ~36m.
mkdir -p $MDS_DATA_PATH/mscoco
wget http://images.cocodataset.org/zips/train2017.zip -O $MDS_DATA_PATH/mscoco/train2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip
# both zips should be there, note: downloading zip takes some time
ls $MDS_DATA_PATH/mscoco/
# Extract them into mscoco/ (interpreting that as extracting both there, matching what the gsutil commands above do)
# takes some time, but good progress display
unzip $MDS_DATA_PATH/mscoco/train2017.zip -d $MDS_DATA_PATH/mscoco
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco
# two folders should be there, annotations and train2017 stuff
ls $MDS_DATA_PATH/mscoco/
# check jpg imgs are there
ls $MDS_DATA_PATH/mscoco/train2017
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
# says: 118287 for a 2nd time
ls $MDS_DATA_PATH/mscoco/annotations
ls $MDS_DATA_PATH/mscoco/annotations | grep -c .json
# says: 6 for a 2nd time
# move them, since the Google natural-language instructions say so; ref for moving a large number of files: https://stackoverflow.com/a/75034830/1601580 (thanks ChatGPT!)
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
find $MDS_DATA_PATH/mscoco/train2017 -type f -print0 | xargs -0 mv -t $MDS_DATA_PATH/mscoco
ls $MDS_DATA_PATH/mscoco | grep -c .jpg
# says: 118287 for both
ls $MDS_DATA_PATH/mscoco/annotations/ | grep -c .json
mv $MDS_DATA_PATH/mscoco/annotations/* $MDS_DATA_PATH/mscoco/
ls $MDS_DATA_PATH/mscoco/ | grep -c .json
# says: 6 for both

# NEXT FAILS with missing image
# 2. Launch the conversion script:
python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
  --dataset=mscoco \
  --mscoco_data_root=$MDS_DATA_PATH/mscoco \
  --splits_root=$SPLITS \
  --records_root=$RECORDS
brando90 commented 1 year ago

OK, I tried the gsutil approach and it fails: https://github.com/google-research/meta-dataset/issues/108. I think I really do need you to try your own version, @patricks-lab. Code I ran:

# 1. Download the 2017 train images and annotations from http://cocodataset.org/:
#You can use gsutil to download them to mscoco/:
mkdir -p $MDS_DATA_PATH/mscoco/
cd $MDS_DATA_PATH/mscoco/
mkdir -p train2017
# seems to directly download all files, no zip file needed
gsutil -m rsync gs://images.cocodataset.org/train2017 train2017
# todo: should have 118287 .jpg files? (note: no unzipping needed)
ls $MDS_DATA_PATH/mscoco/train2017 | grep -c .jpg
# download & extract annotations_trainval2017.zip
gsutil -m cp gs://images.cocodataset.org/annotations/annotations_trainval2017.zip $MDS_DATA_PATH/mscoco/  # note: a destination is needed here; the instructions quoted above omit it
unzip $MDS_DATA_PATH/mscoco/annotations_trainval2017.zip -d $MDS_DATA_PATH/mscoco
# todo says: 6?
ls $MDS_DATA_PATH/mscoco/annotations | grep -c .json

error:

(mds_env_gpu) brando9~/data/mds/mscoco $ cd ..
(mds_env_gpu) brando9~/data/mds $ rm -rf mscoco/
(mds_env_gpu) brando9~/data/mds $ mkdir -p $MDS_DATA_PATH/mscoco/
(mds_env_gpu) brando9~/data/mds $ cd $MDS_DATA_PATH/mscoco/
(mds_env_gpu) brando9~/data/mds/mscoco $ mkdir -p train2017
(mds_env_gpu) brando9~/data/mds/mscoco $ gsutil -m rsync gs://images.cocodataset.org/train2017 train2017

BucketNotFoundException: 404 gs://images.cocodataset.org bucket does not exist.
brando90 commented 1 year ago

I'd prefer to get the official instructions from Google to work, but in the very worst case you can do the following:

  1. First verify that your mscoco tfrecords data looks good: ls $RECORDS/mscoco/ | grep -c .tfrecords should display 80 tfrecords.
  2. Download your mscoco tfrecords data from the vision cluster to your local computer once you've verified the 80 tfrecords (or skip this step and go straight to 3, i.e. upload to Zenodo directly from the vision cluster somehow).
  3. Follow the Zenodo instructions to upload your data: https://zenodo.org/deposit?page=1&size=20. Make sure you're allowed to upload that amount of data; if I remember correctly the limit is 50GB. If not, we might have to find a different way to get it to me.

@patricks-lab
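For step 2, something along these lines might work (hostname and remote path below are placeholders; untested):

# on the cluster: count the shards first, it should print 80
ls $RECORDS/mscoco/ | grep -c .tfrecords
# from the local machine: pull the records down
rsync -avP you@vision-cluster:/path/to/records/mscoco/ ./mscoco_records/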

brando90 commented 1 year ago

I'd prefer to get the official instructions from Google to work, but in the very worst case you can do the following:

  1. First verify that your mscoco tfrecords data looks good: ls $RECORDS/mscoco/ | grep -c .tfrecords should display 80 tfrecords.
  2. Download your mscoco tfrecords data from the vision cluster to your local computer once you've verified the 80 tfrecords (or skip this step and go straight to 3, i.e. upload to Zenodo directly from the vision cluster somehow).
  3. Follow the Zenodo instructions to upload your data: https://zenodo.org/deposit?page=1&size=20. Make sure you're allowed to upload that amount of data; if I remember correctly the limit is 50GB. If not, we might have to find a different way to get it to me.

@patricks-lab

Perhaps this is better than uploading to Zenodo? https://github.com/google-research/meta-dataset/blob/main/meta_dataset/data/tfds/README.md