losttech / Gradient

This repository serves as a public issue tracker and documentation host for Gradient, full TensorFlow binding for .NET
Other
86 stars 4 forks source link

:toc: macro :toc-title: :toclevels: 3 :language: csharp

LostTech.TensorFlow (formerly Gradient)

This repository serves as a public issue tracker and documentation for LostTech.TensorFlow, full TensorFlow binding for .NET

[link=https://www.nuget.org/packages/LostTech.TensorFlow/] image::https://img.shields.io/nuget/v/LostTech.TensorFlow.svg[NuGet]

LostTech.TensorFlow enables .NET developers to use the complete set of TensorFlow APIs from any .NET language.

You can find samples at https://github.com/losttech/Gradient-Samples

Full API documentation is available at https://gradient.docs.losttech.software/

[quote, Char-RNN trained on Shakespeare] He shall speak not reverbering injurance.

Contents

toc::[]

Getting started

Supported platforms

LostTech.TensorFlow is fully supported on Windows, MacOS, and Linux on AMD64/x64 architecture. It might work on other OS/CPU architecture combinations, but will only be supported on the best effort basis.

Installation

Install Python + Tensorflow

Before installing LostTech.TensorFlow, you should ensure, that you have TensorFlow installed and working (or use <<using_packaged_environment,packaged environment>> from NuGet):

. Install Python 3.7 64-bit. If you have Visual Studio 2017+, it is possible to install it as a component. Otherwise, get one from https://www.python.org/downloads/ or your system's package manager . Install TensorFlow 1.15.0 (1.15.1+ release has a https://github.com/tensorflow/tensorflow/issues/36417[critical bug]) using pip (https://www.tensorflow.org/install/[official instructions]): .. Find python executable in the installation directory (VS installs to: C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python.exe) .. Open command line to the directory, containing python .. Execute .\python -m pip install "tensorflow-gpu==1.15.0" or .\python -m pip install "tensorflow-cpu==1.15.0". GPU acceleration requires matching CuDNN and CUDA 10 installed, see https://www.tensorflow.org/install/gpu#older_versions_of_tensorflow[instructions]. . Check the installation by launching python, and running [source,python]import tensorflow. It should succeed.

Add Nuget package to your project

LostTech.TensorFlow packages are published on https://www.nuget.org/packages/LostTech.TensorFlow/[Nuget]. Nuget page lists the commands, necessary to install the package into your project. For dotnet-based projects CLI command is

[source,powershell]

dotnet add package LostTech.TensorFlow --pre

If using the new package management features of .csproj, this could also be achieved by adding the following line to it:

[source,xml]


See the example project file https://github.com/losttech/Gradient-Samples/blob/master/BasicMath/BasicMath.csproj[here].

First steps

Packages and Namespaces

In most cases, you will need to add using tensorflow; at the beginning of your file. In many cases you will also need using LostTech.TensorFlow; and using numpy;.

[source,csharp]

using numpy; using tensorflow; using LostTech.TensorFlow;

Commercial users will also need to add a https://www.nuget.org/packages/LostTech.Gradient.License.Azure/[licensing package for Pay-As-You-Go]: [link=https://www.nuget.org/packages/LostTech.Gradient.License.Azure/] image::https://img.shields.io/nuget/v/LostTech.Gradient.License.Azure.svg[NuGet]

And configure the key from the https://lt-tf-lic-portal.azurewebsites.net/Home/Subscriptions[Subscriptions Page].

[source,csharp]

using LostTech.TensorFlow; using LostTech.TensorFlow.Licensing; TensorFlowSetup.Instance.UseLicense(new AzureLicense("...Key Goes Here...")); TensorFlowSetup.Instance.EnsureInitialized();

Logging

https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/compat/v1/logging[TensorFlow logging] is separate from LostTech.TensorFlow logging. This section discusses the later.

LostTech.TensorFlow core runtime library is https://www.nuget.org/packages/Gradient.Runtime/[Gradient.Runtime]. To configure runtime logging, set appropriate properties of https://gradient.docs.losttech.software/Runtime/v0.4.2/LostTech.Gradient/GradientLog.htm[`LostTech.Gradient.GradientLog`] static class, e.g.: [source,csharp] GradientLog.OutputWriter = Console.Out;

Selecting Python environment

If you want to use TensorFlow with non-default configuration (e.g. different versions instead of Python 3.7 + TensorFlow 1.15), use one of LostTech.Gradient.GradientEngine.UseEnvironment* methods before accessing any TensorFlow methods to select the desired TensorFlow installation.

We also recommend to explicitly call TensorFlowSetup.Instance.EnsureInitialized() to be able to catch any problems with TensorFlow installation. This is especially important in production systems.

Using packaged environment

[link=https://www.nuget.org/packages/LostTech.TensorFlow.Python/] image::https://img.shields.io/nuget/v/LostTech.TensorFlow.Python.svg[NuGet]

  1. Install NuGet package https://www.nuget.org/packages/LostTech.TensorFlow.Python[LostTech.TensorFlow.Python]
  2. Deploy TensorFlow from the package and configure the engine to use it:

[source,csharp]

var pyEnv = LostTech.TensorFlow.PackagedTensorFlow.EnsureDeployed(DIRECTORY); GradientEngine.UseEnvironment(pyEnv);

Old style TF

Prior to the recent changes, the main way to use TensorFlow was to contstruct a computation graph, and then run it in a session. Most of the existing examples will use this mode.

Constructing compute graph

Graph creation methods are located in the tf class from tensorflow namespace. For example:

[source,csharp]

var a = tf.constant(5.0, name: "a"); var b = tf.constant(10.0, name: "b");

var sum = tf.add(a, b, name: "sum"); var div = tf.div(a, b, name: "div");

Running computation

Next, you need to create a Session to run your graph one or multiple times. Sessions allocate CPU, GPU and memory resources, and hold the states of variables.

NOTE: In GPU mode, TensorFlow will attempt to allocate all the GPU memory to itself at that stage, so ensure you don't have any other programs extensively using it, or https://stackoverflow.com/questions/34199233/how-to-prevent-tensorflow-from-allocating-the-totality-of-a-gpu-memory[turn down TensorFlow memory allocation]

Since TensorFlow sessions hold unmanaged resources, they have to be used with IDisposable pattern: [source,csharp]

var session = new Session(); using(session.StartUsing()) { ...do something with the session... });

Now that you have a Session to work with, you can actually compute the values in the graph:

[source,csharp]

var session = new Session(); using(session.StartUsing()) { Console.WriteLine($"a = {session.run(a)}"); Console.WriteLine($"b = {session.run(b)}"); Console.WriteLine($"a + b = {session.run(sum)}"); Console.WriteLine($"a / b = {session.run(div)}"); };

The full code for this example is available at our https://github.com/losttech/Gradient-Samples/tree/master/BasicMath[samples repository]

Porting Python code to LostTech.TensorFlow + C

In most cases converting Python code, that uses TensorFlow, should be as easy as using C# syntax instead of Python one:

  • add new to class constructor calls: Class() -> new Class().

Its easy to spot class construction vs simple function calls in Python: by convention function names there start with a lower case letter like min, while in class names the first letter is capitalized: Session

  • to pass named paramters, use : instead of =: make_layer(kernel_bias=2.0) -> make_layer(kernel_bias: 2.0)
  • to get a subrange of a Tensor , use <> syntax (if available): tensor[1..-2] -> tensor[1..^3] (when using C# 8 ranges, note, that the right side in C# is INCLUSIVE, while in Python it is EXCLUSIVE). A single element can be addressed as usual: tensor[1]

Names of classes and functions

Generally, LostTech.TensorFlow follows TensorFlow https://www.tensorflow.org/versions/r1.15/api_docs/python/tf[Python API] naming. There are, though, language-based differences:

  • in Python modules (roughly equivalent to namespaces) can directly contain functions. In .NET every function must be a part of some type. For that reason LostTech.TensorFlow exposes static classes, named after the innermost module name to contain module functions and properties (but not classes). For example, Python's tensorflow.contrib.data module has a correspoding C# class tensorflow.contrib.data.data. So an equivalent of Python's tensorflow.contrib.data.group_by_window would be tensorflow.contrib.data.data.group_by_window. This mostly applies to the unofficial APIs.
  • most of the official API's functions and properties (but not classes) are exposed via a special class tensorflow.tf. Combined with using tensorflow; this enables invoking TensorFlow functions as neatly as: tf.placeholder(...), tf.keras.activations.relu(...), etc

there is also a similar class numpy.np for NumPy functions

  • class names and namespaces are mostly the same as in Python API. E.g. https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/Session[`tf.Session] is intensorflownamespace, and can be instantiated vianew tensorflow.Session(...)or simplynew Session(...)withusing tensorflow;`

  • some APIs have multiple aliases, like https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/add[tf.add]. Only one of the aliases is exposed by LostTech.TensorFlow. Usually the shortest one.

  • in case of name conflicts (e.g. C# does not allow both shape property and set_shape method in the same class), one of the conflicting names is exposed with suffix $$_$$. For example: set_shape$$_$$, which should be easy to find in IDE autocomplete list.

  • (very rare) due to the way LostTech.TensorFlow works, non-official classes, functions and properties might be exposed via unexpected namespaces. IDE should be able to help find classes (by suggesting to add an appropriate using namespace;). For functions and properties, one might try to find the class, corresponding to their containing module (see the example with tensorflow.contrib.data above, you could search for the data class). Another less convenient alternative is to use Visual Studio's Object Explorer.

  • (rare) some classes and functions, exposed by TensorFlow might only be exposed as function-typed properties. For example, https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/ConfigProto[`ConfigProto], that is used to configuretf.Sessiondoes not have a correspoing class in LostTech.TensorFlow. To create an instance ofConfigProto, you must call its constructor viaConfigProtoproperty in [title="tensorflow.core.protobuf.config_pb2"]config_pb2class:config_pb2.ConfigProto.CreateInstance()`.

Parameter and return types

LostTech.TensorFlow tries hard to expose statically-typed API, but the underlying TensorFlow code is inherently dynamic. In many cases LostTech.TensorFlow over-generalizes or under-generalizes underlying parameter and return types.

When the parameter type is over-generalized, it simply means you loose a hint as to what can actually be passed. LostTech.TensorFlow's parameter may be IEnumerable<object>, but the function can reject anything except a PythonSet<int>. In these cases you can either refer to the https://www.tensorflow.org/versions/r1.15/api_docs/python/tf[official documentation], or quickly try it, and see if the error you get explains what the function actually expects.

For convenience, any 1D .NET arrays are passed as instances of PythonList<T> by default. This also applies to enumerables produced by System.Linq. This behavior can be turned off using IsEnabled properties in https://gradient.docs.losttech.software/Runtime/v0.4.2/LostTech.Gradient.Codecs/[LostTech.Gradient.Codecs].

Dynamic overloads

TL;DR; when you can't pass something or get InvalidCastException, replace tf.func_name(...) -> tf.func_name_dyn(...), and new Class(...) -> Class.NewDyn(...).

When the parameter or return type is under-generalized, you will not be able to use LostTech.TensorFlow's statically-typed API. A function parameter may say, that it only accepts int and bool, but you know from documentation/sample, that you have to pass a Tensor. Another common example is when LostTech.TensorFlow thinks the parameter must be of a derived class, when a base class would actually also be ok. For example, parameter cell might be of type LSTMCell, but actually you should be able to pass any RNNCell, where class LSTMCell: RNNCell. Do not try converting the value you want to pass to the expected type. It will not work. For these cases LostTech.TensorFlow provides dynamic API alongside statically-typed one.

Every function from original API will have an untyped overload, whose name ends with _dyn. All its parameters intentionally allow anything to be passed (type object). It also returns a dynamic type.

Same applies to properties. For each SomeType property{get;set;} there's a dynamic property_dyn{get;set;}.

Every class with constructors will have an untyped static factory method, named NewDyn, which allows you to call class constructor similar to untyped function overloads in the previos paragraph.

Please, report to this issue tracker, if you have to call dynamic overloads a lot to get your model running. We will try to fix that in the next version.

In some cases even that is not enough. If you need to call a method or access a property of an instance of some class, and that method/property is not exposed by LostTech.TensorFlow, convert the instance to dynamic, and try to call it that way. See https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/types/using-type-dynamic

Passing functions

Many TensorFlow APIs accept functions as parameters. If the parameter type is known to be a function, LostTech.TensorFlow will show it as PythonFunctionContainer[https://gradient.docs.losttech.software/Runtime/v0.4.2/LostTech.Gradient/PythonFunctionContainer.htm].

There are two ways to get an instance of it: pass TensorFlow functions back, or pass a .NET function.

Passing TensorFlow functions back to TensorFlow

TL;DR; suffix your function with _fn.

Most NN layers expect an activation argument, which specifies the neuron activation function. TensorFlow defines many activation functions one would want to use in both modern and old-style APIs. The "original" one is called https://en.wikipedia.org/wiki/Sigmoid_function[sigmoid] as is available as tf.sigmoid. Modern networks often use some variant of https://en.wikipedia.org/wiki/Rectifier_(neural_networks)[ReLU] (tf.nn.relu). You can call both directly like this: tf.sigmoid(tensor), but in most cases you need to pass them to activation parameter as PythonFunctionContainer.

To do that you can simply get a pre-wrapped instance by adding _fn suffix to the function name.

For example: tf.layer.dense(activation: *tf.sigmoid_fn*).

Passing C# functions to TensorFlow

To get an instance of PythonFunctionContainer from a C# function, use static method PythonFunctionContainer.Of<T1, ..., TResult>(func or lambda). You will have to specify function argument types in place of <T1, ..., TResult>.

Python with blocks, C#'s using

TL;DR; replace with new Session(...) as sess: sess.do_stuff() -> [source,csharp]

var session = new Session(...); using (session.StartUsing()) { session.do_stuff(); }

You can also use new Session().UseSelf(sess => sess.DoStuff()).

TensorFlow API, being built on Python, use special enter and exit methods for the same purpose .NET has IDisposable. Problem is: in general they do not map directly to each other. For that reason every LostTech.TensorFlow class, that declares those special methods in TensorFlow, also exposes .Use and .UseSelf methods. In most cases it is easiest to use .UseSelf(self => do_something(self)) as shown in the sample above. However, there might be rare special cases, when .Use(context => do_something(context)) has to be used. The difference is that obj.UseSelf always passes obj back to the lambda, while obj.Use might actually generate a new object of potentially completely different type.

Think of .Use and .UseSelf as LostTech.TensorFlow's best attempt at reproducing using(var session = new Session(...)) {} statement.

Exceptions

Most of TensorFlow exceptions have a counterpart either in LostTech.TensorFlow or in Gradient.Runtime[https://gradient.docs.losttech.software/Runtime/v0.4.2/LostTech.Gradient.Exceptions/].

If TensorFlow throws an exception, that has no counterpart, it will surface as a generic PythonException[https://csharpdoc.hotexamples.com/class/Python.Runtime/PythonException].

NumPy

Since most TensorFlow samples use NumPy, LostTech.TensorFlow includes a limited subset under numpy namespace. It is shipped in a separate package: https://www.nuget.org/packages/LostTech.NumPy/[LostTech.NumPy].

[link=https://www.nuget.org/packages/LostTech.NumPy/] image::https://img.shields.io/nuget/v/LostTech.NumPy.svg[NuGet]

anchor:inheritance[]Custom layers and Model subclassing

NOTE: When subclassing tensorflow.keras.Model, every layer, variable or tensor must be explicitly tracked using this.Track method. See https://github.com/losttech/Gradient-Samples/blob/03aa035080d3a46fe6a4c8dcfd6e8f1b91a414a7/ResNetBlock/ResNetBlock.cs#L19[ResNetBlock sample].

https://www.tensorflow.org/tutorials/customization/custom_layers[The official TensorFlow tutorial]

Recommendations

  • import both tensorflow and numpy namespaces: [source,csharp]

    using tensorflow; using numpy;

tf.placeholder(...); np.array(...);

  • if you extensively use some API set under tf., use using static tf.API_HERE; [source,csharp]

    using static tf.keras; ... var model = models.load_model(...); new Dense(kernel_regularizer: regularizers.l2(...));

  • many LostTech.TensorFlow functions return dynamic. Whenever possible, immediately cast it to the concrete type. It will help to maintain the code. Concrete type is always known at runtime and can be seen in the debugger, or accessed via object.GetType() method. Most methods in tf. usually return Tensor. [source,csharp]

    Tensor hidden = tf.layers.dense(input, hiddenSize, activation: tf.sigmoid_fn);

  • avoid directly using classes in Python.Runtime. They are LostTech.TensorFlow's implementation details, which might be changed in the future major versions.

Known limitations

This section may be outdated

Unloading AppDomain is not supported

LostTech.TensorFlow is incompatible with AppDomain unloading. An attempt to unload an AppDomain where TensorFlow was initialized will lead to a crash in native code.

This is a known problem with Unity editor, which means you can not use LostTech.TensorFlow within the editor. You must skip all TensorFlow code using the https://docs.unity3d.com/ScriptReference/Application-isEditor.html[isEditor] check.

Tips and Tricks

[#csharp8]

C# 8

LostTech.TensorFlow supports the neat indexing feature of C# 8: if you are using Visual Studio 2019, you can set appropriate language level like this in the project file: <LangVersion>8.0</LangVersion>.

Then you can access numpy arrays with the new syntax, for example: arr[3..^4], which means "take a range from element at index 3, that includes all elements until (and including) the element with index 4 (counting from the end of the array)".

Blogs, Blog Posts & 3rd-party Samples

What's new

Release Candidate:

  • replaced expiration with licenses
  • improved typing on many APIs
  • fixed inability to access static settings
  • strongly-typed wrappers for Tensor
  • enhanced ndarray<T>
  • improved exception handling and debugging
  • core runtime components include source and debug symbols
  • LINQ enumerables and 1D .NET arrays are now automatically converted to Python lists for compatibility with bad TensorFlow APIs (can be disabled)

Preview 7:

Preview 6.x:

  • feature: ability to <<inheritance,inherit TensorFlow classes>> (for example, allows to create a custom Keras Model, Callback, Layer, etc)
  • new sample: https://github.com/losttech/Gradient-Samples/tree/master/ResNetBlock[ResNetBlock]
  • feature: TensorFlow classes are properly marshalled when passed back to you from TensorFlow
  • fixed: inability to add items to collections, belonging to TensorFlow classes
  • fixed: crash while enumerating collections without an explicit GIL lock
  • fixed: crash due to use-after-free of TensorFlow objects in marshalling layer
  • fixed: PythonClassContainer<T>.Instance failing for nested classes
  • fixed: params object[] were not passed correctly
  • minor: added np.expand_dims, reduced number of thrown and handled exceptions
  • expires in March 2020

Preview 5.1:

  • improved passing dictionaries
  • setup: optionally specify Conda environment via an environment variable
  • setup: fixed Conda environment autodectection on Linux
  • improved argument types in many places
  • Gradient warnings are now printed to Console.Error by default, instead of Console.Out
  • fixed crashes on dynamic interop and multithreaded enumeration
  • fixed some properties not being exposed https://github.com/losttech/Gradient/issues/4

Preview 5:

Preview 4:

  • MacOS and Ubuntu support (with others possibly working too) on .NET Core
  • documentation included for function and parameter tooltips
  • fixed inability to call static class methods

Preview 3

  • fixed inability to reenter TensorFlow from a callback

Preview 2:

  • dynamically typed overloads, that enable fallback for tricky signatures
  • a common interface for tf.Variable and tf.Tensor
  • enabled enumeration over TensorFlow collection types

FAQ

Why not TensorFlowSharp?

|=== | | TensorFlowSharp | LostTech.TensorFlow

| Load TensorFlow models | |

| Train existing models | |

| Create new models with low-level API | |

| Create new models with high-level API | ✗ |

| Dependencies | TF | TF + Python

| TensorBoard integration | ✗ |

| Estimators | ✗ |

| Dataset manipulation via tf.data | ✗ |

| tf.contrib | ✗ |

| Commercial support | ✗ | |===

Why not TensorFlow.NET?

Incomplete set of functions

TensorFlow.NET does not provide full functionality of TensorFlow. As a result, https://github.com/SciSharp/TensorFlow.NET/issues/352[it is hard to implement] state of the art algorithms for computer vision (YOLOv3) and language processing (GPT and BERT) using TensorFlow.NET, especially from scratch. We have complete LostTech.TensorFlow-based samples for both: https://github.com/losttech/YOLOv4[YOLOv4] and https://github.com/losttech/Gradient-Samples/tree/master/GPT-2[GPT-2] and https://github.com/losttech/Gradient-Samples/blob/master/README.md[many more].

Ghost APIs

TensorFlow.NET goal is to be a reimplementation of TensorFlow in C#. However, as of August 2020 only a small set of APIs actually has implementations. Many functions and classes are defined without bodies and do nothing. The state of specific APIs is not tracked, and that can create a lot of confusion. For example, there is an [line-through]#https://github.com/SciSharp/TensorFlow.NET/blob/master/src/TensorFlowNET.Core/Train/AdamOptimizer.cs[AdamOptimizer]# (they got AdamOptimizer since, but the problem is https://github.com/SciSharp/TensorFlow.NET/blob/c2138b20abc41b19a5e1d3568cdeed87bc1c7369/src/TensorFlowNET.Core/Train/GradientDescentOptimizer.cs#L38[systemic]) class, but it does not actually have any implementation, apart from the constructor, meaning it wont actually use Adam, or work at all.

Performance

LostTech.TensorFlow uses official builds of TensorFlow provided by Google, which are well-optimized. As a result, in a https://github.com/losttech/Gradient-Perf/[simple comparison] (training a CNN) LostTech.TensorFlow is about 18% faster than TensorFlow.NET.

We also support TensorFlow builds, that use other accelerators, such as TPUs in Google cloud, or https://pypi.org/project/tensorflow-rocm[tensorflow-rocm] for AMD GPUs.

Credits

TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.