Objectron is a dataset of short, object-centric video clips. The videos also contain AR session metadata, including camera poses, sparse point clouds, and planes. In each video, the camera moves around and above the object and captures it from different views. Each object is annotated with a 3D bounding box that describes its position, orientation, and dimensions. The dataset contains about 15K annotated video clips and 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes.
Inconsistency between records and records_shuffled #34
It seems that some tf.record files are present in the records_shuffled directory but not in records. I believe this is an unintended discrepancy. Of the 10532 tf.record files in records_shuffled, only 10448 have a counterpart in records. You can investigate the 84 missing records with the following excerpt:
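import tensorflow as tf


def fetchFileNames(dir_names):
    # Collect every file path found directly under each directory.
    filepaths = []
    for name in dir_names:
        filepaths += tf.io.gfile.glob(f"{name}/*")
    return filepaths


# List the per-category directories, then the files inside them.
record_dirs = tf.io.gfile.glob("gs://objectron/v1/records/*")
record_filepaths = fetchFileNames(record_dirs)
shuffled_dirs = tf.io.gfile.glob("gs://objectron/v1/records_shuffled/*")
shuffled_filepaths = fetchFileNames(shuffled_dirs)

# Currently passes: records holds fewer files than records_shuffled.
assert len(record_filepaths) < len(shuffled_filepaths)

# Map each shuffled path onto the path it would have under records.
shuffled_filepaths = [fp.replace("_shuffled", "") for fp in shuffled_filepaths]
record_filepaths = set(record_filepaths)
shuffled_filepaths = set(shuffled_filepaths)

# Paths that exist under records_shuffled but have no counterpart in records.
missing = shuffled_filepaths - record_filepaths
print(f"{len(missing)} files missing from records")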
These are the missing filepaths:
Ideally, the assertion in the excerpt above would fail, because the two directories in the bucket would contain the same number of records.
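To see which categories are affected, the missing paths could be grouped by their parent directory. This is only a minimal sketch: `group_missing_by_category` is a hypothetical helper, and it assumes the `<root>/<category>/<file>` layout implied by the glob patterns above. It works on plain lists of path strings, so it can be tried without access to the bucket.

```python
from collections import defaultdict
import posixpath


def group_missing_by_category(record_paths, shuffled_paths):
    """Group files found under records_shuffled but not under records.

    Hypothetical helper. Assumes every path contains "/records/" or
    "/records_shuffled/" followed by "<category>/<file>".
    """
    def relative(path, root):
        # Keep only the part after the root directory, e.g. "bike/file-a".
        return path.split(f"/{root}/", 1)[1]

    present = {relative(p, "records") for p in record_paths}

    grouped = defaultdict(list)
    for path in shuffled_paths:
        rel = relative(path, "records_shuffled")
        if rel not in present:
            category, name = posixpath.split(rel)
            grouped[category].append(name)

    # Sort file names so the report is stable between runs.
    return {category: sorted(names) for category, names in grouped.items()}


# Illustrative paths only; these are not real files from the bucket.
records = [
    "gs://example-bucket/v1/records/bike/file-a",
    "gs://example-bucket/v1/records/cup/file-b",
]
shuffled = [
    "gs://example-bucket/v1/records_shuffled/bike/file-a",
    "gs://example-bucket/v1/records_shuffled/bike/file-c",
    "gs://example-bucket/v1/records_shuffled/cup/file-b",
]
report = group_missing_by_category(records, shuffled)
for category, names in report.items():
    print(category, len(names), names)
```

With the real bucket, the unmodified path lists returned by `fetchFileNames` (before the `_shuffled` suffix is stripped) could be passed in directly.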