SciSharp / TensorFlow.NET

.NET Standard bindings for Google's TensorFlow for developing, training and deploying Machine Learning models in C# and F#.
https://scisharp.github.io/tensorflow-net-docs
Apache License 2.0
3.25k stars, 523 forks

SciSharp proposals? #618

Open GeorgeS2019 opened 4 years ago

GeorgeS2019 commented 4 years ago

After the recent 2nd Machine Learning Community Standup (Sept 9th 2020, Data Science and Machine Learning with SciSharp), it is clear that Microsoft has a reasonable interest in the progress of SciSharp as a community-driven project.

I see SciSharp projects as key to AI for 3D game engines that use C# as a scripting language, e.g. Unity3D and the MIT-licensed Godot.

As someone who follows Godot closely, I know the Godot Proposals repository, where we actively contribute ideas on how to improve Godot.

As Microsoft has an interest in SciSharp, I would recommend we create a SciSharp.Proposals GitHub repository [https://github.com/SciSharp/SciSharp-proposals]. It would focus on how to streamline the development of SciSharp, with a view to both the community and integration with Microsoft's AI frameworks.

The SciSharp-Proposals repository will track issues ONE LEVEL above all existing SciSharp projects. Its aim is to work out how the different SciSharp projects need to be aligned with each other, not just as individual projects but from the perspective of SciSharp as a whole, as well as potential integration with Microsoft's new AI initiative.

It is hoped that through such a new repository the SciSharp community will be more tightly coordinated, and that work spanning several SciSharp projects can be tracked more effectively than on the existing SciSharp Gitter community platform.

Why?

SciSharp's aim is clearly to be the ultimate .NET open source AI solution, based on existing popular Python-based AI frameworks (e.g. numpy, keras, pytorch, tensorflow).

SciSharp has two approaches: (a) a C# interface to a backend Python environment installed with an AI framework like pytorch or tensorflow; (b) a pure C# framework that bypasses the Python libraries and wraps the backend C++ API directly (e.g. libTorch and the tensorflow C API).

Approach (a) has produced many reusable C# classes (e.g. for Keras and numpy) that make it easier to replace the Python part of e.g. tensorflow in approach (b).

The challenges moving forwards

We need greater coordination to realize the vision set out by Haiping when creating SciSharp with others.

(1) We need to figure out a more semi-automatic and consistent way to "catch up" with how the Tensorflow C API evolves.

(2) We need to semi-automate the translation of the Tensorflow API unit tests to C#.

(3) We need to coordinate where to put emphasis: which tutorials to write, which parts are needed, and which source tutorials should be ported to SciSharp.

(4) We need to coordinate how to move forward with the other Tensorflow initiatives listed in Projects.

(5) Microsoft is developing a more user-friendly way to build ML networks, with an interface based on Netron. We may also need to discuss how SciSharp can be integrated into such a framework.

In summary

The proposed NEW REPOSITORY, e.g. SciSharp.Proposals, will bring together the open source .NET AI and game communities and Microsoft to discuss how to prioritize work across the SciSharp projects in a more coordinated way.

GeorgeS2019 commented 4 years ago

@Oceania2018 there was a new request on tensorflowLib 2.x. I took the opportunity to explore a possible agreed format for the tensorflow C API unit tests that would help Tensorflow.NET developers keep the wrapper's code coverage of the tensorflow C API in Tensorflow.NET as up to date as possible.

Challenges:

  1. The Tensorflow team is moving very fast and it is challenging for the Tensorflow.NET team to keep up. Yet we often have questions about how much code coverage Tensorflow.NET has of e.g. the Tensorflow C API. Perhaps one possible way forward is to team up with the tensorflow team to re-format each release of the tensorflow C API unit tests in ways that are more easily parsed for porting into Tensorflow.NET C# code.

Obviously, an automated one-to-one port is still not possible.

What I would like to suggest is that, in addition to the tensorflow C API unit test code, there is a JSON or XML file which provides structured metadata on these tests. This would include the documentation of the C API unit tests, their hierarchical arrangement, which C API functions each test targets, and the expected outcomes.

With such structured XML or JSON files, I believe it is possible for the Tensorflow.NET team to automatically generate C# unit tests for high code coverage, but ONLY the basic skeleton code of those tests. The second step will be to determine which C API tests have been updated. For those that are not updated, if there are existing corresponding C# unit tests, the internal code of those C# unit tests will be merged into the skeletal C# unit test code.

For those with no corresponding C# unit test code, the automated process will add e.g. an Ignore attribute.
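The skeleton and merge steps could be sketched roughly as follows, again in Python. This assumes MSTest-style attributes (`[TestMethod]`, `[Ignore]`) and a hypothetical `Suite_Name` naming convention for the generated C# methods; the function name `csharp_test_method` and the sample records are made up for illustration. What should happen to an existing C# body whose C test was updated is not settled above, so here it is kept and only flagged for review, which is my assumption.

```python
def csharp_test_method(test, existing_bodies, updated):
    """Emit one C# test method (as text) for a metadata record.

    test            -- dict with "suite", "name" and "doc" (hypothetical schema)
    existing_bodies -- maps "Suite_Name" to hand-written C# code already in the repo
    updated         -- set of "Suite_Name" ids whose upstream C test changed
    """
    method = f"{test['suite']}_{test['name']}"
    body = existing_bodies.get(method)
    lines = [f"/// <summary>{test['doc']}</summary>", "[TestMethod]"]
    if body is None:
        # No C# counterpart yet: generate an ignored stub to be filled in later.
        lines.append("[Ignore]")
        body_lines = [f"// TODO: port C API test {test['suite']}.{test['name']}"]
    else:
        # Merge the existing hand-written body into the fresh skeleton.
        body_lines = body.splitlines()
        if method in updated:
            # Assumption: keep the old body but flag it for manual review.
            body_lines.insert(0, "// REVIEW: upstream C test changed since this body was written")
    lines += [f"public void {method}()", "{"]
    lines += ["    " + line for line in body_lines]
    lines.append("}")
    return "\n".join(lines)


# Hand-written sample records, for illustration only.
sample_tests = [
    {"suite": "CAPI", "name": "Version", "doc": "Version string is not empty."},
    {"suite": "CAPI", "name": "Status", "doc": "Status code and message round-trip."},
]
existing = {"CAPI_Version": "// existing hand-written assertions would be kept here"}

for t in sample_tests:
    print(csharp_test_method(t, existing, updated=set()))
    print()
```

In this sketch `CAPI_Version` keeps its existing body, while `CAPI_Status` comes out as an ignored stub that still shows up in the test run as skipped.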

The goal is that with each release of Tensorflow.NET, which is now targeted on a monthly basis, there is a reasonably up-to-date report on the C# code coverage of the tensorflow C API unit tests.
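Such a report could be as simple as counting how many of the upstream C API tests already have a filled-in C# counterpart. A minimal sketch, where the function name `coverage_report` and the `Suite_Name` test ids are hypothetical:

```python
def coverage_report(all_tests, ported):
    """Summarize how many upstream C API tests have a filled-in C# counterpart.

    all_tests -- list of "Suite_Name" ids taken from the metadata
    ported    -- set of ids whose C# test has a real body (not an ignored stub)
    """
    total = len(all_tests)
    missing = [t for t in all_tests if t not in ported]
    done = total - len(missing)
    percent = round(100.0 * done / total, 1) if total else 0.0
    return {"total": total, "ported": done, "percent": percent, "missing": missing}


# Hand-written sample ids, for illustration only.
report = coverage_report(
    ["CAPI_Version", "CAPI_Status", "CAPI_Tensor", "CAPI_Graph"],
    {"CAPI_Version", "CAPI_Tensor"},
)
print(f"{report['ported']}/{report['total']} ported ({report['percent']}%)")
print("still to fill:", ", ".join(report["missing"]))
```

The `missing` list is the part the community would actually work from when deciding which stubs to fill next.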

I believe that, with this, the whole Tensorflow.NET community can better coordinate on HOW TO FILL the skeletal C# unit tests that have yet to be filled with the internal C# code needed to complete them.

==> This is feedback; the decision is up to the Tensorflow.NET management team.