The last few years have seen a rise in novel differentiable graphics layers which can be inserted in neural network architectures. From spatial transformers to differentiable graphics renderers, these new layers leverage the knowledge acquired over years of computer vision and graphics research to build new and more efficient network architectures. Explicitly modeling geometric priors and constraints into neural networks opens up the door to architectures that can be trained robustly, efficiently, and more importantly, in a self-supervised fashion.
At a high level, a computer graphics pipeline requires a representation of 3D objects and their absolute positioning in the scene, a description of the material they are made of, lights and a camera. This scene description is then interpreted by a renderer to generate a synthetic rendering.
In comparison, a computer vision system would start from an image and try to infer the parameters of the scene. This allows the prediction of which objects are in the scene, what materials they are made of, and their three-dimensional position and orientation.
Training machine learning systems capable of solving these complex 3D vision tasks most often requires large quantities of data. As labelling data is a costly and complex process, it is important to have mechanisms to design machine learning models that can comprehend the three dimensional world while being trained without much supervision. Combining computer vision and computer graphics techniques provides a unique opportunity to leverage the vast amounts of readily available unlabelled data. As illustrated in the image below, this can, for instance, be achieved using analysis by synthesis where the vision system extracts the scene parameters and the graphics system renders back an image based on them. If the rendering matches the original image, the vision system has accurately extracted the scene parameters. In this setup, computer vision and computer graphics go hand in hand, forming a single machine learning system similar to an autoencoder, which can be trained in a self-supervised manner.
Tensorflow Graphics is being developed to help tackle these types of challenges and to do so, it provides a set of differentiable graphics and geometry layers (e.g. cameras, reflectance models, spatial transformations, mesh convolutions) and 3D viewer functionalities (e.g. 3D TensorBoard) that can be used to train and debug your machine learning models of choice.
See the install documentation for instructions on how to install TensorFlow Graphics.
You can find the API documentation here.
TensorFlow Graphics is fully compatible with the latest stable release of TensorFlow, tf-nightly, and tf-nightly-2.0-preview. All the functions are compatible with graph and eager execution.
Tensorflow Graphics heavily relies on L2 normalized tensors, as well as having the inputs to specific function be in a pre-defined range. Checking for all of this takes cycles, and hence is not activated by default. It is recommended to turn these checks on during a couple epochs of training to make sure that everything behaves as expected. This page provides the instructions to enable these checks.
To help you get started with some of the functionalities provided by TF Graphics, some Colab notebooks are available below and roughly ordered by difficulty. These Colabs touch upon a large range of topics including, object pose estimation, interpolation, object materials, lighting, non-rigid surface deformation, spherical harmonics, and mesh convolutions.
NOTE: the tutorials are maintained carefully. However, they are not considered part of the API and they can change at any time without warning. It is not advised to write code that takes dependency on them.
Visual debugging is a great way to assess whether an experiment is going in the right direction. To this end, TensorFlow Graphics comes with a TensorBoard plugin to interactively visualize 3D meshes and point clouds. This demo shows how to use the plugin. Follow these instructions to install and configure TensorBoard 3D. Note that TensorBoard 3D is currently not compatible with eager execution nor TensorFlow 2.
Among many things, we are hoping to release resamplers, additional 3D convolution and pooling operators, and a differentiable rasterizer!
Follow us on Twitter to hear about the latest updates!
You may use this software under the Apache 2.0 License.
As part of TensorFlow, we're committed to fostering an open and welcoming environment.
If you use TensorFlow Graphics in your research, please reference it as:
@inproceedings{TensorflowGraphicsIO2019,
author = {Oztireli, Cengiz and Valentin, Julien and Keskin, Cem and Pidlypenskyi, Pavel and Makadia, Ameesh and Sud, Avneesh and Bouaziz, Sofien},
title = {TensorFlow Graphics: Computer Graphics Meets Deep Learning},
year = {2019}
}
Want to reach out? E-mail us at tf-graphics-contact@google.com!