Open pwais opened 4 years ago
The data is in proto format. It is not recommended to consume it directly in your ML model. I think a common practice is to transform the data to the format your model works with, for example tensorflow.Example if TensorFlow is your ML framework.
Apart from frame_utils.py, which works directly with the proto, other TF code in the lib does not assume eager mode.
Yup, I imagine most users will have some sort of ETL pipeline, and that pipeline could very well involve TF Graphs (e.g. run deep net inference on Waymo Frames, then record the result). It's very well-documented within the Tensorflow Github issues that enabling Eager mode will very often cause TF Graph code to have unexpected errors (browse e.g. https://github.com/tensorflow/tensorflow/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+eager ). While Eager is enabled by default in Tensorflow 2.0, at the time of writing there are still a large number of outstanding bugs that prohibit projects from upgrading.
Either way, the issue remains that TF Eager is required to, for example, decode the Waymo lidar range images (e.g. to create BEV feature vectors). Since enabling TF Eager globally is by default irreversible, frame_utils.py leaks this quite intrusive constraint.
The purpose of this Github issue is to:
By the way, if Waymo did not intend for users to consume the released data directly in a training pipeline, it would behoove Waymo to use a distribution format other than TFRecords. While it's unfortunate that Google has not open sourced python/c++ RecordIO (despite Protobuf being open source for over a decade now), there are other options that facilitate significantly more efficient data access with lighter dependencies, including:
In particular, the latter tarball option would not only remove the dependency on tensorflow but also enable efficient random seeking over the Frame data (e.g. using the Python built-in `tarfile` library), which TFRecords notoriously prohibits; this option would be a strict improvement over the current release format.
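To make the random-seek claim concrete, here is a self-contained sketch using only the Python standard library (the member names and payloads are made up; a real release would store serialized Frame protos as the members):

```python
import io
import tarfile

# Build a small in-memory tar archive with two "frames" as separate members,
# standing in for the proposed tarball release format.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    for name, payload in [("frame_000.bin", b"serialized-frame-0"),
                          ("frame_001.bin", b"serialized-frame-1")]:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

# Seek directly to a single member by name, without decoding every record
# that precedes it (the access pattern TFRecords cannot offer):
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member = tar.getmember("frame_001.bin")
    data = tar.extractfile(member).read()

print(data)  # b'serialized-frame-1'
```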
I think it is not too much trouble to write a binary (say, an Apache Beam job) that transforms the proto to a format you prefer. Within that binary, you can enable TF eager.
We prefer to provide a generic format that can be easily transformed to other formats.
Other bugs observed using TF Eager (Tensorflow 1.14):
* Execution time (per `Session.run()` call) is 4-8 seconds. I also tried using a GPU (Nvidia 1070), and execution time is on average the same or 1 second faster.
* `os.environ['CUDA_VISIBLE_DEVICES'] = ''` must be set prior to even importing TF Eager. This procedure is essentially necessary for any sort of multi-threaded usage (e.g. Spark or Beam).

Any updates on this nearly a year later? I can't believe the challenge came and went and this still hasn't been fixed.
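A sketch of the required ordering (the tensorflow import is left commented out so the snippet stands alone):

```python
import os

# GPU hiding must happen before TensorFlow is imported anywhere in the
# process; once the CUDA runtime has been initialized, later changes to
# CUDA_VISIBLE_DEVICES are ignored.
os.environ['CUDA_VISIBLE_DEVICES'] = ''

# import tensorflow as tf   # imported only AFTER the variable is set,
#                           # so TF sees no GPUs in this worker process

print(repr(os.environ['CUDA_VISIBLE_DEVICES']))  # ''
```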
It would be tremendously helpful if any future data releases were not in TFRecords format because:
* A `tar` file is a concatenation of index-delimited files: it supports random access and compression, and almost every platform today can read `tar` files... TFRecords offers none of these features.
* `tf.data`, and the ops for reading protobuf data efficiently, are all tensorflow-specific.
* The recent release of KITTI-360 shows yet another example of an AV dataset that uses simple flat files (and zip files for distribution)... not TFRecords: http://www.cvlibs.net/datasets/kitti-360/documentation.php
@pwais Looking at this a year later: since this is an early preprocessing layer, couldn't you just run this part in eager and wrap all your other graph code in a tf.function?
@atyshka No! I'm not using Tensorflow as a framework in this context, and definitely not the version that waymo-open-dataset requires. I totally understand that some people are fine with Tensorflow eager and such, or are just running in a notebook or something. But a lot of people are not, and in those settings this project's dependencies are obnoxious.
Yeah, I agree that this is too rigidly dependent on tensorflow, but things have changed a little. A year or so into TF 2.0, most maintained projects such as the object detection API are fully 2.0- and eager-compatible. It's still not friendly for non-tensorflow users, though; I agree it would be nice to minimize the dependency.
@atyshka A key problem, which I believe is still here, is that the reference code in this repo for decoding point clouds from the TFRecords is complicated and thus essentially requires Tensorflow. If you look at NuScenes, Argoverse, KITTI, KITTI-360, Lyft Level 5, TRI-ML/DDAD, etc etc -- NONE of these datasets require a giant dependency like Tensorflow for simply reading point cloud data (into a canonical XYZ cloud from their RV encoding). Moreover, when I tried isolating the code in this repo into a python function, it took 7 seconds per cloud on a single CPU to decode. One of the authors gave the (rather snarky) comment that I should use a GPU to decode the lidar data.
While one could say all this is just unfortunate, I think it's important to raise the bar (or at least meet the bar) when introducing yet another AV dataset. Most of the aforementioned datasets were out before this one, and not a single one of them requires Tensorflow, nor patented code (in `third_party/camera` in this repo -- careful not to look at that directory at work).
Again, an ideal mitigation would simply be to remove the dependency on Tensorflow. The winner of the Open Dataset Detection challenge uses PyTorch: https://github.com/open-mmlab/OpenPCDet They basically had to rewrite most of the code in the waymo-od repo: https://github.com/open-mmlab/OpenPCDet/blob/594962844729353842207b9e3c50f8b2434484f9/pcdet/datasets/waymo/waymo_dataset.py#L19
Does Waymo actually care? The resounding answer here is a very Google-like "we're sorry you feel that way"
I think we have a solution! I've been able to parse the protos without eager mode by using tf.io.decode_proto. Please see this colab notebook for reference on how to do it: https://colab.research.google.com/drive/1ysVjnq0g5lsRMd2bCzB_nK3v3fF1S5g7?usp=sharing
@peisun1115 Would you accept this change to the repository as a PR? I can't think of any negative effects this method would bring, and I think it would make a lot of people happy. I'd like to check though before putting in the effort
Thanks for contributing. I do not have access to your colab. You are welcome to send a PR. We can accept it as long as it doesn't require much maintenance effort and benefits the community. Alternatively, if it is easier for you, you can create a separate repo to host it if you like. We are happy to mention it somewhere in this repo so that users can find it.
Apologies for the difficulty accessing. My modifications are as follows:
```shell
protoc waymo_open_dataset/dataset.proto --descriptor_set_out="dataset.desc" --include_imports
```

This command must be run, and could be added to the compilation docker process.
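As a hypothetical sanity check (assuming the protobuf Python package; `list_message_types` and the synthetic descriptor set below are illustrative, not part of the repo), the generated descriptor set can be loaded and inspected. The message names it contains are exactly what feeds `tf.io.decode_proto`'s `message_type` and `descriptor_source` arguments:

```python
from google.protobuf import descriptor_pb2

def list_message_types(desc_path):
    # Parse a FileDescriptorSet as written by protoc --descriptor_set_out.
    fds = descriptor_pb2.FileDescriptorSet()
    with open(desc_path, "rb") as f:
        fds.ParseFromString(f.read())
    return [f"{fd.package}.{m.name}"
            for fd in fds.file for m in fd.message_type]

# Demo with a tiny synthetic descriptor set standing in for dataset.desc:
fds = descriptor_pb2.FileDescriptorSet()
fd = fds.file.add(name="dataset.proto", package="waymo.open_dataset")
fd.message_type.add(name="Frame")
with open("dataset.desc", "wb") as f:
    f.write(fds.SerializeToString())

print(list_message_types("dataset.desc"))  # ['waymo.open_dataset.Frame']
```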
My simple example for parsing images:

```python
import matplotlib.pyplot as plt
import tensorflow as tf

FILENAME = '/content/waymo-od/tutorial/frames'
dataset = tf.data.TFRecordDataset(FILENAME, compression_type='')

@tf.function
def decode_images_from_frame(data):
  deserialized_cams = tf.io.decode_proto(
      data, "waymo.open_dataset.Frame", ["images"], [tf.string],
      descriptor_source='/content/waymo-od/dataset.desc')
  return tf.io.decode_proto(
      deserialized_cams[1][0], "waymo.open_dataset.CameraImage", ["image"],
      [tf.string],
      descriptor_source='/content/waymo-od/dataset.desc')[1][0]

for data in dataset:
  deserialized_images = decode_images_from_frame(data)
  decoded_images = [tf.io.decode_jpeg(tf.reshape(im, ()))
                    for im in tf.unstack(deserialized_images)]
  for im in decoded_images:
    plt.figure()
    plt.imshow(im.numpy())
```
This would need to be expanded to recursively cover all the fields in the protobuf message.
The only problem is that I believe it would have to be a breaking change, because some functions such as parse_range_image_and_camera_projection() expect Frame objects, whereas this approach directly converts serialized protos to tensors instead of hierarchical objects. A possible solution to avoid breaking changes would be providing alternate versions of the affected functions, i.e. one function accepting old Frame objects and another accepting new TensorFrame objects.
Issue: waymo-open-dataset requires Eager execution
The official tutorial uses eager mode: https://colab.research.google.com/github/waymo-research/waymo-open-dataset/blob/master/tutorial/tutorial.ipynb
But it doesn't actually make clear that eager mode is unfortunately required. Attempting to use the code in this python package without eager mode can result in crashes, e.g.:
This is very sad, because a ton of Tensorflow code (e.g. that in tensorflow/models, tensorflow/tpu, other 3rd party repos, etc etc) would need major changes in order to work with TF Eager execution. While Eager is the default for TF 2.0, public support for TPUs in TF 2.0 is not yet complete ( https://github.com/tensorflow/tensorflow/issues/24412 ).
In the interest of having a library that is as composable as those written for other datasets (e.g. NuScenes, Lyft Level 5, and Argoverse), this is a request for waymo-open-dataset to follow Tensorflow's own recommendation that code support both eager and graph execution. Moreover, there is a large set of functionality in this repo that does not require Tensorflow at all; this is a further request to isolate away the bits that actually require Tensorflow (e.g. the C++ metrics).
Workaround

In the interim, users can use a `py_func` workaround (see below). This is a demo of getting all lidar points, fused, using `waymo-open-dataset` code confined to its own `tf.Session`:
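A minimal sketch of the pattern (hypothetical: `decode_points` is a stand-in for the actual waymo-open-dataset decoding code, and the `tf.compat.v1` API is used so the snippet runs under TF 2.x):

```python
import numpy as np
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

def decode_points(serialized_frame):
    # Stand-in for the real decoding: parse the Frame proto, decode the
    # range images, and fuse the returns into an (N, 3) xyz cloud.
    return np.ones((4, 3), dtype=np.float32)  # placeholder cloud

# Confine the python-side decoding to a py_func node inside a graph, so the
# surrounding pipeline stays pure graph-mode code.
graph = tf.Graph()
with graph.as_default():
    frame_bytes = tf.compat.v1.placeholder(tf.string, shape=[])
    points = tf.compat.v1.py_func(decode_points, [frame_bytes], tf.float32)

with tf.compat.v1.Session(graph=graph) as sess:
    cloud = sess.run(points, feed_dict={frame_bytes: b'frame-proto-bytes'})

print(cloud.shape)  # (4, 3)
```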