dotnet / TorchSharp

A .NET library that provides access to the library that powers PyTorch.

How TorchSharp can address the pain points of ~900 ML.NET Apr2021 survey responses #308

Closed. GeorgeS2019 closed this issue 3 years ago.

GeorgeS2019 commented 3 years ago

The April 2021 ML.NET survey and the discussions of its results make it clear that NLP is high on the priority list. This means more deep learning NLP use cases, e.g. using ML.NET to load pretrained Hugging Face transformer models via OnnxRuntime.
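
For context, a minimal sketch of that ONNX workflow in ML.NET (the model path, column names, and sequence length below are illustrative assumptions; they must match the actual exported model):

using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Input/output schema for a BERT-like ONNX export. The column names
// ("input_ids", "attention_mask", "logits") and the sequence length (128)
// are assumptions for illustration only.
public class ModelInput
{
    [VectorType(128), ColumnName("input_ids")]
    public long[] InputIds { get; set; }

    [VectorType(128), ColumnName("attention_mask")]
    public long[] AttentionMask { get; set; }
}

public class ModelOutput
{
    [ColumnName("logits")]
    public float[] Logits { get; set; }
}

public static class OnnxDemo
{
    public static void Run()
    {
        var mlContext = new MLContext();

        // Score the ONNX model via the OnnxRuntime-backed transform.
        var pipeline = mlContext.Transforms.ApplyOnnxModel(
            outputColumnNames: new[] { "logits" },
            inputColumnNames: new[] { "input_ids", "attention_mask" },
            modelFile: "bert-base.onnx");   // hypothetical path

        // Fit against an empty view just to materialize the transformer.
        var empty = mlContext.Data.LoadFromEnumerable(new List<ModelInput>());
        var model = pipeline.Fit(empty);
        var engine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(model);
    }
}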

TorchSharp is on track! (especially after the recent renaming effort, which makes it more straightforward for ML.NET to import pretrained transformer models in ONNX)

I am writing this issue to give feedback that TorchText is the natural next step in development after TorchVision. Good job!

[image]

GeorgeS2019 commented 3 years ago

An illustration (after feedback changes from @NiklasGustafsson) of how similar TorchSharp code now is to PyTorch!

[image]

dsyme commented 3 years ago

That comparison is really very cool. It would be good to see F# side-by-side too - have you got a repo hosting that code, so that some F# folks could help contribute an F# version?

(The code would be basically identical: var --> let, etc.)

GeorgeS2019 commented 3 years ago

@dsyme @NiklasGustafsson

I have both code samples properly formatted using Azure DevOps Server wiki markdown.

Please send it back once you have converted it to F#, and I will post the PyTorch vs. TorchSharp (F#) comparison here.


using System;
using TorchSharp;
// Bring the Linear/ReLU/mse_loss factory methods into scope. The exact
// namespace layout has shifted between TorchSharp versions; this matches
// the post-renaming scheme discussed in this thread.
using static TorchSharp.torch.nn;

// N is batch size; D_in is input dimension;
// H is hidden dimension; D_out is output dimension.
int N = 64; int D_in = 1000; int H = 100; int D_out = 10;

// Create random Tensors to hold inputs and outputs
var x = torch.randn(N, D_in);
var y = torch.randn(N, D_out);

// Use the nn package to define our model as a sequence of layers. nn.Sequential
// is a Module which contains other Modules, and applies them in sequence to
// produce its output. Each Linear Module computes output from input using a
// linear function, and holds internal Tensors for its weight and bias.
var model = torch.nn.Sequential(
            ("lin1", Linear(D_in, H)),
            ("relu1", ReLU()),
            ("lin2", Linear(H, D_out))
           );

// The nn package also contains definitions of popular loss functions; in this
// case we will use Mean Squared Error (MSE) as our loss function.
var loss_fn = mse_loss(Reduction.Sum);

var learning_rate = 1e-4f;
for (int t = 0; t < 500; t++)
{
    // Forward pass: compute predicted y by passing x to the model. Module objects
    // override the __call__ operator so you can call them like functions. When
    // doing so you pass a Tensor of input data to the Module and it produces
    // a Tensor of output data.
    var y_pred = model.forward(x);

    // Compute and print loss. We pass Tensors containing the predicted and true
    // values of y, and the loss function returns a Tensor containing the loss.
    var loss = loss_fn(y_pred, y);
    if (t % 100 == 99) {
        Console.WriteLine($"step: {t + 1} loss: {loss.ToSingle()}");
    }

    // Zero the gradients before running the backward pass.
    model.zero_grad();

    // Backward pass: compute gradient of the loss with respect to all the learnable
    // parameters of the model. Internally, the parameters of each Module are stored
    // in Tensors with requires_grad=True, so this call will compute gradients for
    // all learnable parameters in the model.
    loss.backward();

    // Update the weights using gradient descent. Each parameter is a Tensor, so
    // we can access its gradients like we did before.
    using (var noGrad = new AutoGradMode(false)) {
        foreach (var param in model.parameters()) {
            param.sub_(param.grad() * learning_rate);
        }
    }
}

saint4eva commented 3 years ago

Both look like Python to me. I think we should respect .NET idiosyncrasies and naming conventions. These libraries are to be used by millions of .NET developers - so they should lean towards .NET culture and sentiment, not towards pleasing a few Python developers who will be expected to use ML.NET.

GeorgeS2019 commented 3 years ago

@saint4eva The SciSharp community (e.g. Tensorflow.NET, NumSharp, etc.) promotes "Python-like" naming to give more .NET developers access to deep learning => one of the pain points of ML.NET users (based on the April 2021 survey).

There have been months of discussion on this naming topic for TorchSharp.

Therefore, the primary naming goal (for both Tensorflow.NET and TorchSharp) is to empower .NET developers to access the deep learning development still missing in ML.NET (according to the survey) - instead of pleasing a few Python developers.

==> Most important! The hope is that ONCE more and more .NET developers are doing the kind of deep learning the .NET community needs, they will contribute the deep learning examples missing from ML.NET (according to the survey).

dsyme commented 3 years ago

@saint4eva All deep learning architectures are originally implemented in Python. Pretty much all deep learning is done in Python. The important thing to optimise is the efficiency of moving deep learning architectures, model implementations, optimizers, data-loading, training loops etc. to .NET, so you can get on with training. It's not about "pleasing a few Python devs" but rather about the massive collection of assets that are available in Python. For example, look at all the Hugging Face transformer implementations - there are ~100 of them. Those are the things we need to optimise bringing over.

This is just not like other .NET APIs.

lostmsu commented 3 years ago

@dsyme I only used TorchSharp for a little bit, but as a mainly-C# developer I wholeheartedly agree with @saint4eva. The tooling around C# is built according to the .NET class library design guidelines, and deviating from them too far makes powerful things useless. For example, discoverability is really bad for NN layers, because you need to use factory methods to create them - and then you also need to know where to find them.

This issue is caused by copying the Python API verbatim: Python has module-level functions, and torch.nn is a module that contains both classes and functions. In .NET, torch.nn is an acceptable class to host functions (though .NET-style PascalCase might be better). But nesting classes inside torch.nn is extremely counter-intuitive. C# tooling does global class search, but nested classes are generally excluded, and most tools produce unnecessary torch.nn. prefixes when they do find them.
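
To make the discoverability point concrete, a minimal sketch (assuming the post-renaming layout discussed in this thread, where the factories are static members of torch.nn):

using TorchSharp;
// Without this using-static, IntelliSense will not offer Sequential, Linear,
// or ReLU as functions -- you have to already know where the factories live.
using static TorchSharp.torch.nn;

// var model = new Sequential(...);   // does not compile: no public constructor

// Compiles, but only once you know to import the factory methods:
var model = Sequential(
    ("lin1", Linear(1000, 100)),
    ("relu1", ReLU()),
    ("lin2", Linear(100, 10)));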

NiklasGustafsson commented 3 years ago

I have been very torn about this for several months. I love .NET and its naming conventions, and I agree that the aesthetics of a language are important to the community that uses it.

The driving reason behind our thinking is that 99% of all deep learning texts rely on Python. TorchSharp is not so much about winning Python developers over to .NET - I don't think that is realistic. The purpose is to shorten the learning curve for .NET developers who are getting into deep learning.

The SciSharp community has already pioneered the idea of staying true to the Python naming conventions, in order to allow users to more readily take advantage of existing texts as guidance. It's not quite copy-and-paste at the moment, and it can never really be, but it's darned close.

We are planning to integrate TorchSharp into ML.NET, just as TF.NET is already integrated. When we do this, the higher-level APIs will (as per current thinking) follow .NET naming conventions and ML.NET patterns. We believe that the vast majority of .NET developers will want to rely on the higher-level APIs rather than the 'hard-core' TorchSharp APIs.

GeorgeS2019 commented 3 years ago

Both Tensorflow.NET and TorchSharp face the challenge of keeping up with the rapid progress of the latest AI developed using TensorFlow and PyTorch respectively. The goal is to enable and empower .NET developers to access the latest AI developments while embracing the .NET 6 enterprise/mobile DevOps rapid development cycle.

As @dsyme said, this requires efficient migration of the latest AI concepts, libraries/frameworks developed in python to .NET environment.

As @NiklasGustafsson said, this will grow and engage the .NET community in the latest AI concepts. Once we are ON TRACK to have a .NET ML community as active in the latest AI as the Python community, the focus will shift to making the latest AI available in ML.NET through higher-level APIs, with the primary goal => to make the difficult latest AI simpler to implement through ML.NET! This is HOW I see Microsoft should democratize AI/ML => make very hard AI/ML simpler for .NET developers.

lostmsu commented 3 years ago

@NiklasGustafsson I think the specific problem I mentioned can be solved without contradictions. I don't see a good reason to have nested classes (my biggest pet peeve). Can't we have a torch.nn.Sequential factory and a TorchSharp.NN.Sequential class instead of nesting the class inside torch.nn?

I took a look at the new naming scheme, and it looks like the nested class issue is not present. But, for instance, the Sequential class is missing a public constructor.

The problem that I see with "the majority of developers will want to rely on C#-style high-level APIs" is that the minority who would want to build on low-level TorchSharp itself would be appalled by its internals and coding style, which would stagnate the project.

As for the people reading deep learning books, it should not be too hard to also read and memorize 3-5 simple rules of how to find corresponding TorchSharp members. The argument seems moot here.

Besides, why not follow the C++ API? C++ is much closer to C#, and the PyTorch team itself admits it is more polished than the Python version.

dsyme commented 3 years ago

which would stagnate the project.

This project will succeed if it's seen as making .NET viable in the PyTorch ecosystem (and that brings enough value to enough deep learning practitioners, or allows enough .NET people to play in the ecosystem). This project is not trying to create an independent .NET machine learning ecosystem.

by its internals...coding style

I presume you mean API style, so I'll answer that - if there are specific problems with the internals or coding style please let us know.

Regarding API style - I don't think so, and to be honest we've made the decision and we're moving on with the project. One day perhaps we'll revisit it, or perhaps someone will wrap this library. Here are some further observations for you:

  1. The C# developers who are WeddedToPascalCase are not people making deep learning models (or if they are, they are doing it in Python).

  2. Almost everyone interested in this project will be following PyTorch documentation and examples at some point. That means knowing PyTorch naming (whether C++ or Python)

  3. I don't know anyone who seriously thinks PascalCaseNaming brings advantages to mathematical tensor programming. No one except .NET people will ever do it. There are some good libraries that have taken a lot of time to map names across into the .NET world, and I don't mean to disparage them (e.g. Math.NET Numerics, ExtremeOptimization). But those aren't shallow wrapper libraries like this one, and, crucially, those aren't trying to bring .NET into an existing ecosystem.

    Take just one example: mvlgamma. What are we seriously going to use to follow C# naming conventions? The pointless capitalization MvlGamma? Or the impossible-to-remember MultivariateLogGamma? Neither is an improvement - people abbreviate this stuff for a reason. To be honest, the PyTorch naming conventions, including the C++ ones, follow the conventions of the mathematical programming universe and use lowercase, with many abbreviations.

  4. Microsoft itself looked at TensorFlow.NET and said "yes, we'll rely on that for ML.NET". Despite the API design. Because it brings canonical value, and because the decisions made sense.

As for the people reading deep learning books, it should not be too hard to also read and memorize 3-5 simple rules of how to find corresponding TorchSharp members. The argument seems moot here.

Every time I've tried to use a .NET math library I spend hours finding what I need to translate samples, all of it entirely unnecessary. There are literally hundreds or thousands of extra names, words and namespaces you need to know. mvlgamma is a good example, but looking through Tensor.cs there are many other examples.
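
To make that translation cost concrete (x is any suitable tensor; the PascalCase alternatives below are hypothetical, invented only to illustrate the lookup problem):

using TorchSharp;

var x = torch.randn(3, 3).abs() + 1.0;    // mvlgamma needs sufficiently large inputs

// PyTorch (Python):  y = x.mvlgamma(2)
var y = x.mvlgamma(2);                    // TorchSharp: same name, ports on sight

// Under a hypothetical renamed API, the same line becomes a search problem:
// var y = x.MvlGamma(2);                 // was it this...
// var y = x.MultivariateLogGamma(2);     // ...or this?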

In any case, this is open source, and you're welcome to fork - or better start a project that sits on top of this one and wraps the API with .NET names?

Besides, why not follow C++ API? C++ is much closer to C# and the PyTorch team itself admits it is more polished than Python version.

We do wrap the C++ API, but in terms of naming and API I can't particularly say it's better. e.g. see sparse here: https://github.com/xamarin/TorchSharp/blob/master/src/Native/LibTorchSharp/THSTensor.cpp#L1109

The C++ API is, however, getting better and better and I can see it will get a lot of use for model delivery (though not for original model design/experimentation/authoring). However for model delivery the operational differences are much more important - notably the C++ API has the huge advantage that more optimizations can be performed (I assume), and linking can remove everything that's not needed (where here we are wrapping the massive LibTorch binaries).

lostmsu commented 3 years ago

@dsyme I have no specific preference for case (although MultivariateLogGamma is much easier to read). But here are a couple of screenshots to illustrate what I believe is an issue (in terms of nested classes and factory methods):

[image]

This one is from VS 2022 Preview. The autocomplete suggests Sequential, but that is the class, which, as I mentioned above, does not have a public constructor. torch.nn.Sequential is completely missing from the dropdown. I have to know to add using static TorchSharp.NN.Modules; (or, after the recent PR, using static torch.nn;) to find the Sequential factory method.

The same would happen if I tried ReLU or relu (casing does not matter). I would see the class, but not the factory function.

The same situation happens with VS 2019 and ReSharper installed:

[image]

Another case in point, re: porting existing PyTorch code from Python. Let's take the OpenAI Spinning Up repository. I picked a random file there that I had never seen before: https://github.com/openai/spinningup/blob/master/spinup/algos/pytorch/ppo/core.py

A few screenshots:

[image]

[image]

As you can see, the naming is all over the place: the torch.nn module is imported as just nn, which in C# would be the most inconvenient, since its technical equivalent is using nn = torch.nn;, which one always has to type manually - no tool adds it unless there is a name conflict.

If you try to port the Actor class as-is, you will notice that, regardless of whether you have using nn = torch.nn;, this will never compile:

class Actor: nn.Module
{
}

Because Module is not a member of torch.nn. But even if it were, you'd need to know to add using static torch or using nn = torch.nn from above for this to work, neither of which would be offered by VS or ReSharper.
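
For comparison, here is roughly what a working port looks like - hedged as a sketch against a later TorchSharp layout, where Module did eventually land under torch.nn with generic input/output parameters (it had not at the time of this comment); the class name and layer sizes are placeholders:

using TorchSharp;
using static TorchSharp.torch;

// A minimal custom module. Deriving from nn.Module<Tensor, Tensor> and
// calling RegisterComponents() is how submodules/parameters get registered.
class Actor : nn.Module<Tensor, Tensor>
{
    private readonly nn.Module<Tensor, Tensor> body;

    public Actor(int obsDim, int actDim) : base(nameof(Actor))
    {
        body = nn.Sequential(
            ("lin1", nn.Linear(obsDim, 64)),
            ("relu1", nn.ReLU()),
            ("lin2", nn.Linear(64, actDim)));
        RegisterComponents();   // registers submodules so parameters() finds them
    }

    public override Tensor forward(Tensor obs) => body.forward(obs);
}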

lostmsu commented 3 years ago

Admittedly, F# is better in this regard:

[image]

But even it does not show any information about which of the Sequential things you want.

saint4eva commented 3 years ago

@saint4eva The SciSharp community (e.g. Tensorflow.NET, NumSharp, etc.) promotes "Python-like" naming to give more .NET developers access to deep learning => one of the pain points of ML.NET users (based on the April 2021 survey).

Don't you think that promoting Python naming culture to encourage .NET developers would be counter-productive and counter-intuitive?

All I am saying is that whatever we do, we should always keep at the back of our minds that we are serving the .NET community and its developers. And mind you, the .NET community is more principled than any other community - and naming conventions are one of those principles.

Notwithstanding, thank you for your efforts. I appreciate them.

saint4eva commented 3 years ago

[quotes @dsyme's reply above in full]

@dsyme I think that, being an F# developer, you may not understand the implications and cognitive overload of not sticking to C# naming. F# looks somewhat like Python, so I can see where your sentiments stem from.

Looking at a codebase and immediately understanding the patterns and API style is an optimisation. Deciding to promote Python conventions in order to push .NET developers into the PyTorch ecosystem does not sit well.

Have you asked yourself why ASP.NET Core is quite successful, and why community members contribute to it? One can borrow ideas from other ecosystems, but subsuming yourself in another ecosystem is not good.

Many ecosystems borrowed ideas from C# or .NET, but they did not copy the culture and idiosyncrasies verbatim - they adapted the ideas to fit their ecosystem.

Anyway, if you have already made your decision, I wish you all the best.

GeorgeS2019 commented 3 years ago

Imagine a few lecturers at universities in different corners of the world starting to teach students deep learning using TorchSharp (and likewise Tensorflow.NET). The naming change adopted here enables many thousands of students to simply access the abundant PyTorch/TensorFlow educational material (written for Python) while starting to write deep learning code in either C# or F#, AND eventually supply the workforce companies need to adopt deep learning in the .NET environment.

Once we achieve this critical mass of .NET deep learning developers (who will actively contribute to many deep learning .NET repositories on GitHub), we can always revisit this discussion of staying true to .NET conventions later.

FYI => a few of the blogs I read about TorchSharp before the naming change complained about the lack of documentation. Both Tensorflow.NET and TorchSharp lack the resources to keep their documentation up to date with the rapid development of TensorFlow and PyTorch. The naming change adopted here removes this obstacle.

We are all passionate about .NET AI/ML!

Change often requires some sacrifice among our belief systems along the way.

NiklasGustafsson commented 3 years ago

@GeorgeS2019 -- can you distill a distinct set of asks from this issue, and then close it?

GeorgeS2019 commented 3 years ago

@NiklasGustafsson there has been a follow-up. If necessary, I will file a set of separate issues with further feedback in the coming months.