brando90 opened 1 year ago
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14844693.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14867545.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14900342.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14908027.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14909584.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14914945.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14942411.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14975598.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14976759.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n14977188.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15005577.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15075141.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15089258.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15089803.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15090065.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15091473.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15092227.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15092751.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: ..//lfs/ampere4/0/brando9/data/mds/ILSVRC2012_img_train/n15102359.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
Were the n*.tar files extracted to $MDS_DATA_PATH/ILSVRC2012_img_train? If not, you may need to move them so that all the n*.tar files are in it.
(What's your directory structure in $MDS_DATA_PATH/ILSVRC2012_img_train now? E.g., what files are in $MDS_DATA_PATH/ILSVRC2012_img_train/, and does it have n*.tar files as well as wordnet.is_a.txt and words.txt?)
If the n*.tar files were indeed extracted into $MDS_DATA_PATH/ILSVRC2012_img_train and it's still complaining that only a few specific n*.tar files are missing (as above), I suspect that your tar operation was corrupted and needs to be re-run. Not sure.
(I will try sometime early to mid-day tomorrow to see if I can replicate the ILSVRC download operations at least partially.)
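To answer the directory-structure question quickly, something like the following could be run. It is only a sketch: `summarize_data_root` is a hypothetical helper name (not part of meta-dataset), and it assumes the n*.tar files, the extracted n* directories, and the two WordNet text files all sit directly under the data root.

```bash
# Hypothetical helper: summarize what sits directly under a data root.
summarize_data_root() {
  local root="$1"
  local n_tars n_dirs f
  # Count n*.tar archives and n* directories at the top level only.
  n_tars=$(find "$root" -mindepth 1 -maxdepth 1 -type f -name 'n*.tar' | wc -l)
  n_dirs=$(find "$root" -mindepth 1 -maxdepth 1 -type d -name 'n*' | wc -l)
  # $(( )) strips any padding that wc -l adds on some systems.
  echo "tar_files=$((n_tars)) synset_dirs=$((n_dirs))"
  for f in wordnet.is_a.txt words.txt; do
    if [ -f "$root/$f" ]; then echo "$f=present"; else echo "$f=missing"; fi
  done
}
```

Usage would be `summarize_data_root "$MDS_DATA_PATH/ILSVRC2012_img_train"`.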
related: https://github.com/brando90/pytorch-meta-dataset/issues/21 number of imgs
This is what I'm running:
# - 1. Download ilsvrc2012_img_train.tar, from the ILSVRC2012 website
# todo: https://gist.github.com/bonlime/4e0d236cf98cd5b15d977dfa03a63643
# todo: https://github.com/google-research/meta-dataset/blob/main/doc/dataset_conversion.md#ilsvrc_2012
# wget TODO -O $MDS_DATA_PATH/ilsvrc_2012
# for imagenet url: https://image-net.org/download-images.php
wget https://image-net.org/data/winter21_whole.tar.gz -O ~/data/winter21_whole.tar.gz
# there should be .gz file
ls ~/data/
# - 2. Extract it into ILSVRC2012_img_train/, which should contain 1000 files, named n????????.tar (expected time: ~30 minutes) ref: https://superuser.com/questions/348205/how-do-i-unzip-a-tar-gz-archive-to-a-specific-destination
mkdir -p ~/data/winter21_whole
tar xf ~/data/winter21_whole.tar.gz -C ~/data/
# (expected time: ~30 minutes)
ls ~/data/winter21_whole
# move the train part (mv src dest)
mkdir -p $MDS_DATA_PATH/ILSVRC2012_img_train/
mv ~/data/winter21_whole/* $MDS_DATA_PATH/ILSVRC2012_img_train/
# check files are there
ls $MDS_DATA_PATH/ILSVRC2012_img_train/
ls $MDS_DATA_PATH/ILSVRC2012_img_train/ | grep -c "\.tar$"
# - 3. Extract each of ILSVRC2012_img_train/n????????.tar in its own directory (expected time: ~30 minutes), for instance:
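for FILE in "$MDS_DATA_PATH"/ILSVRC2012_img_train/*.tar;
do
#echo $FILE
# $FILE is already a full path, so pass it to tar as-is; prefixing "../" produces the "..//lfs/..." paths that tar cannot open
mkdir -p "${FILE%.tar}";
tar xvf "$FILE" -C "${FILE%.tar}";
done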
# (expected time: ~30 minutes)
ls $MDS_DATA_PATH/ILSVRC2012_img_train/
ls $MDS_DATA_PATH/ILSVRC2012_img_train/ | grep -c "\.tar$"
# 5620
ls $MDS_DATA_PATH/ILSVRC2012_img_train/ -1 | grep -v "\.tar$" | wc -l
# 5622
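The counts alone do not show whether each directory was actually filled. Given the "Cannot open" errors at the top of the thread, a directory may have been created without its archive being unpacked into it. A minimal sketch for listing such directories, assuming the synset directories sit directly under the data root (`list_empty_synset_dirs` is a hypothetical name):

```bash
# Hypothetical helper: print n* directories that exist but contain nothing.
list_empty_synset_dirs() {
  local root="$1"
  # -empty matches directories with no entries; sort keeps the output stable.
  find "$root" -mindepth 1 -maxdepth 1 -type d -name 'n*' -empty | sort
}
```

Running `list_empty_synset_dirs "$MDS_DATA_PATH/ILSVRC2012_img_train" | wc -l` should print 0 if every archive was unpacked.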
can you at least visually verify the script looks right? @patricks-lab (and run the sanity checks it has e.g. greps etc)
sample ls
ls $MDS_DATA_PATH/ILSVRC2012_img_train/
...
n01821203.tar n02259377 n02708093.tar n03077616 n03433247.tar n03800772 n04173511.tar n04538878 n07756838.tar n10507482 n12122442.tar n12728864
n01822300 n02259377.tar n02708224 n03077616.tar n03433637 n03800772.tar n04174101 n04538878.tar n07756951 n10507482.tar n12124818 n12728864.tar
@patricks-lab what I need for this (deliverable) is the grep commands I ran in my above script :)
new error:
leafs: mandibular notch and angelfish. LCA: entity
I0106 10:08:41.794109 139919327179392 imagenet_stats.py:158] Finegrainedness analysis of test graph using longest paths in finding the lowest common ancestor.
I0106 10:08:42.020792 139919327179392 imagenet_stats.py:191] Stats on the height of the Lowest Common Ancestor of random leaf pairs of the test graph: mean: 3.8368, median: 4.0, max: 8, min: 1
I0106 10:08:42.020935 139919327179392 imagenet_stats.py:202] Proportion of example leaf pairs (out of num_leaf_pairs random pairs) for each height of the LCA of the leaves: {3: 0.3299, 2: 0.081, 4: 0.3554, 6: 0.0522, 5: 0.1395, 7: 0.0244, 8: 0.0092, 1: 0.0084}
I0106 10:08:42.020989 139919327179392 imagenet_stats.py:208] Proportion of example leaf pairs per height whose LCA is the root: {3: 0.8790542588663232, 2: 0.6061728395061728, 4: 0.9425998874507597, 6: 0.9252873563218391, 5: 0.9369175627240144, 7: 0.8811475409836066, 8: 1.0, 1: 0.16666666666666666}
I0106 10:08:42.021031 139919327179392 imagenet_stats.py:212] Examples with different fine-grainedness:
I0106 10:08:42.021075 139919327179392 imagenet_stats.py:218] Examples with height 3:
leafs: press and wiper motor. LCA: device
I0106 10:08:42.021108 139919327179392 imagenet_stats.py:218] Examples with height 3:
leafs: jigsaw, scroll saw, fretsaw and spinet. LCA: device
I0106 10:08:42.021146 139919327179392 imagenet_stats.py:218] Examples with height 2:
leafs: sonograph and convector. LCA: device
I0106 10:08:42.021184 139919327179392 imagenet_stats.py:218] Examples with height 2:
leafs: band and outrigger. LCA: device
I0106 10:08:42.021220 139919327179392 imagenet_stats.py:218] Examples with height 4:
leafs: tenpenny nail and three-dimensional radar, 3d radar. LCA: device
I0106 10:08:42.021264 139919327179392 imagenet_stats.py:218] Examples with height 4:
leafs: body plethysmograph and peg. LCA: device
I0106 10:08:42.021298 139919327179392 imagenet_stats.py:218] Examples with height 6:
leafs: pinion and digital-analog converter, digital-to-analog converter. LCA: device
I0106 10:08:42.021341 139919327179392 imagenet_stats.py:218] Examples with height 6:
leafs: harvester, reaper and hydroelectric turbine. LCA: machine
I0106 10:08:42.021379 139919327179392 imagenet_stats.py:218] Examples with height 5:
leafs: alarm clock, alarm and coelostat. LCA: device
I0106 10:08:42.021414 139919327179392 imagenet_stats.py:218] Examples with height 5:
leafs: drive line, drive line system and transit instrument. LCA: device
I0106 10:08:42.021447 139919327179392 imagenet_stats.py:218] Examples with height 7:
leafs: steam turbine and bicycle wheel. LCA: device
I0106 10:08:42.021480 139919327179392 imagenet_stats.py:218] Examples with height 7:
leafs: Reaumur thermometer and steam turbine. LCA: device
I0106 10:08:42.021514 139919327179392 imagenet_stats.py:218] Examples with height 8:
leafs: flugelhorn, fluegelhorn and Cassegrainian telescope, Gregorian telescope. LCA: device
I0106 10:08:42.021547 139919327179392 imagenet_stats.py:218] Examples with height 8:
leafs: bicycle wheel and Newtonian telescope, Newtonian reflector. LCA: device
I0106 10:08:42.021582 139919327179392 imagenet_stats.py:218] Examples with height 1:
leafs: thermopile and Beckman thermometer. LCA: thermometer
I0106 10:08:42.021614 139919327179392 imagenet_stats.py:218] Examples with height 1:
leafs: bicycle wheel and rowel. LCA: wheel
I0106 10:08:42.035706 139919327179392 convert_datasets_to_records.py:151] Creating ImageNet ILSVRC-2012 specification and records in directory /lfs/ampere4/0/brando9/data/mds/records/ilsvrc_2012...
Traceback (most recent call last):
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 157, in <module>
tf.app.run(main)
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/platform/app.py", line 36, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 153, in main
converter.convert_dataset()
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 610, in convert_dataset
self.create_dataset_specification_and_records()
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 1574, in create_dataset_specification_and_records
assert set_of_directories == set(all_synset_ids), (
AssertionError: self.data_root should contain a directory whose name is the WordNet id of each synset that is a leaf of any split's subgraph.
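The assertion compares two sets for equality, so extra directories under the data root would trip it just as missing ones would. A rough sketch for seeing which side differs, assuming you can produce a plain-text file with one expected WordNet id per line (that file and the name `diff_synset_dirs` are hypothetical, not something meta-dataset provides):

```bash
# Hypothetical helper: compare directory names under a data root
# with an expected list of WordNet ids (one id per line).
diff_synset_dirs() {
  local root="$1"
  local expected_ids_file="$2"
  local actual_sorted expected_sorted
  actual_sorted=$(mktemp)
  expected_sorted=$(mktemp)
  # Directory names directly under the root, sorted and de-duplicated.
  find "$root" -mindepth 1 -maxdepth 1 -type d -exec basename {} \; | sort -u > "$actual_sorted"
  sort -u "$expected_ids_file" > "$expected_sorted"
  # comm -23: only in the first file (on disk but not expected).
  comm -23 "$actual_sorted" "$expected_sorted" | sed 's/^/extra: /'
  # comm -13: only in the second file (expected but not on disk).
  comm -13 "$actual_sorted" "$expected_sorted" | sed 's/^/missing: /'
  rm -f "$actual_sorted" "$expected_sorted"
}
```

No output would mean the directory names match the expected ids exactly.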
thoughts? @patricks-lab
Here are their instructions: https://github.com/google-research/meta-dataset/blob/main/doc/dataset_conversion.md#ilsvrc_2012
The instructions say there should be 1000 files after extraction, but there are 19167. Am I downloading the wrong ImageNet, or do I have the wrong URL for ImageNet, @patricks-lab?
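As far as I can tell, winter21_whole.tar.gz is the full Winter 2021 ImageNet release rather than the ILSVRC2012 training archive (ilsvrc2012_img_train.tar) that step 1 of the instructions names, which would explain a count of 19167 instead of 1000. One way to check an archive before spending ~30 minutes extracting it is to count the inner .tar members directly. A minimal sketch (`count_inner_tars` is a hypothetical name):

```bash
# Hypothetical helper: count members ending in .tar inside an archive
# without extracting it. tar -t lists member names only.
count_inner_tars() {
  local archive="$1"
  tar -tf "$archive" | grep -c '\.tar$'
}
```

For example, `count_inner_tars ~/data/winter21_whole.tar.gz` should print 1000 only if the archive really is the 1000-class training set.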
My attempt:
# - 2. Extract it into ILSVRC2012_img_train/, which should contain 1000 files, named n????????.tar (expected time: ~30 minutes) ref: https://superuser.com/questions/348205/how-do-i-unzip-a-tar-gz-archive-to-a-specific-destination
mkdir -p $HOME/data/winter21_whole
tar xf $HOME/data/winter21_whole.tar.gz -C $HOME/data/
# expected time: ~30 minutes & should contain 1000 files, named n????????.tar
ls $HOME/data/winter21_whole | grep -c "\.tar$"
# 19167
# count the number of .tar files in the directory (does not work recursively; for that use find)
n_tars=$(ls $HOME/data/winter21_whole | grep -c "\.tar$")
if [ "$n_tars" -ne 1000 ]; then
echo "Error: expected 1000 .tar files, found $n_tars"
exit 1
fi
what do you have/suggest/did?
@patricks-lab can you show me which URL you used? Does the one I used look right to you?
My current attempt:
# - 2. Extract it into ILSVRC2012_img_train/, which should contain 1000 files, named n????????.tar (expected time: ~30 minutes) ref: https://superuser.com/questions/348205/how-do-i-unzip-a-tar-gz-archive-to-a-specific-destination
mkdir -p $HOME/data/winter21_whole
tar xf $HOME/data/winter21_whole.tar.gz -C $HOME/data/
# expected time: ~30 minutes & should contain 1000 files, named n????????.tar
ls $HOME/data/winter21_whole | grep -c "\.tar$"
# 19167
# count the number of .tar files in the directory (does not work recursively; for that use find)
n_tars=$(ls $HOME/data/winter21_whole | grep -c "\.tar$")
if [ "$n_tars" -ne 1000 ]; then
echo "Error: expected 1000 .tar files, found $n_tars"
exit 1
fi
# to finish extracting into ILSVRC2012_img_train/ you need to move the files
mkdir -p $MDS_DATA_PATH/ILSVRC2012_img_train/
mv $HOME/data/winter21_whole/* $MDS_DATA_PATH/ILSVRC2012_img_train/
# check files are there
ls $MDS_DATA_PATH/ILSVRC2012_img_train/
ls $MDS_DATA_PATH/ILSVRC2012_img_train/ | grep -c "\.tar$"
## should still be 1000
#if [ $(ls $MDS_DATA_PATH/ILSVRC2012_img_train | grep -c "\.tar$") -ne 1000 ]; then
# echo "Error: expected 1000 .tar files, found $(ls $MDS_DATA_PATH/ILSVRC2012_img_train | grep -c "\.tar$")"
# exit 1
#fi
Will try the tfds one too: https://github.com/google-research/meta-dataset/blob/main/meta_dataset/data/tfds/README.md
still doesn't work :(
(mds_env_gpu) brando9~/data/mds/ILSVRC2012_img_train $ python -m meta_dataset.dataset_conversion.convert_datasets_to_records \
> --dataset=ilsvrc_2012 \
> --ilsvrc_2012_data_root=$MDS_DATA_PATH/ILSVRC2012_img_train \
> --splits_root=$SPLITS \
> --records_root=$RECORDS
2023-01-06 22:29:33.957795: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64:/usr/local/cuda-11.7/lib64:
2023-01-06 22:29:33.958147: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
I0106 22:29:34.813540 139874520330880 imagenet_specification.py:807] Attempting to read number of leaf images from /lfs/ampere4/0/brando9/data/mds/records/ilsvrc_2012/num_leaf_images.json...
I0106 22:29:34.816260 139874520330880 imagenet_specification.py:811] Successful.
Traceback (most recent call last):
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 157, in <module>
tf.app.run(main)
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/tensorflow/python/platform/app.py", line 36, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 303, in run
_run_main(main, args)
File "/lfs/ampere4/0/brando9/miniconda/envs/mds_env_gpu/lib/python3.9/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/convert_datasets_to_records.py", line 147, in main
converter = converter_class(
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 527, in __init__
self._init_specification()
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 584, in _init_specification
self._init_data_specification()
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 538, in _init_data_specification
self._create_data_spec()
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/dataset_conversion/dataset_to_records.py", line 1536, in _create_data_spec
specification = imagenet_specification.create_imagenet_specification(
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/data/imagenet_specification.py", line 1002, in create_imagenet_specification
num_images = get_num_spanning_images(spanning_leaves, num_synset_2012_images)
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/data/imagenet_specification.py", line 264, in get_num_spanning_images
num_images[node] = sum([num_leaf_images[l.wn_id] for l in leaves])
File "/afs/cs.stanford.edu/u/brando9/diversity-for-predictive-success-of-meta-learning/meta-dataset/meta_dataset/data/imagenet_specification.py", line 264, in <listcomp>
num_images[node] = sum([num_leaf_images[l.wn_id] for l in leaves])
KeyError: 'n03045698'
@patricks-lab any progress?
I don't get it; the directory and the .tar file are both there:
(mds_env_gpu) brando9~/data/mds/ILSVRC2012_img_train $ ls | grep n03045698
n03045698
n03045698.tar
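One thing worth checking: the KeyError is raised on a lookup in `num_leaf_images`, and the log above shows that being read from .../records/ilsvrc_2012/num_leaf_images.json rather than recomputed. If that file was written during an earlier run, it may not contain this id even though the directory now exists on disk. This is a guess, not something confirmed in this thread. A minimal sketch for testing it (`cache_has_synset` is a hypothetical name, and it only looks for the quoted id as text in the file):

```bash
# Hypothetical helper: succeed if a WordNet id appears as a quoted
# string in the cached JSON file, fail otherwise.
cache_has_synset() {
  local cache_file="$1"
  local wn_id="$2"
  grep -q "\"$wn_id\"" "$cache_file"
}
```

For example: `cache_has_synset "$RECORDS/ilsvrc_2012/num_leaf_images.json" n03045698 || echo "not in cache"`. If the id is absent, moving the JSON file aside and re-running the conversion would presumably make it recount from the data root.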
where do I cd when doing: