Closed luisquintanilla closed 11 months ago
Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.
Author: | luisquintanilla |
---|---|
Assignees: | - |
Labels: | `api-suggestion`, `area-System.Numerics` |
Milestone: | - |
What about a one-hot encoder?
Also, assuming it can handle a sequence of T rather than jumping straight into vectors.
Additional operations to consider if not already available.
- DotProduct
- Euclidean Distance
- Normalize (Euclidean Length)
makes total sense!
If we start with those and the right dev and testing approach, it will be very easy to expand. This is promising.
@luisquintanilla, for all of the APIs you're proposing, can you clarify on what types you're expecting these to live? Are you envisioning some new static class of extensions? Would these be on the existing MemoryExtensions? Would anything here be on the generic math interfaces? Are there constraints on the Ts involved here? Etc. Or @tannergooding, do you have a vision for this? I can come up with some recommendations if needed, but I wasn't sure if it'd already been worked through and just not included here.
@stephentoub good questions. Here are my thoughts, though I'd defer to @tannergooding since he's much more familiar with these APIs.
> what types you're expecting these to live?
I think we can use the `Dot` operation as a good example of where these might live. Today the `Dot` operation is a static method on the `Vector` type.
https://learn.microsoft.com/dotnet/api/system.numerics.vector.dot?view=net-8.0
However, since we'd be looking to support other types like ReadOnlySpan and collections like arrays, is Vector still the best place for them?
> Would anything here be on the generic math interfaces... Are there constraints on the Ts involved here?
This is possibly a good opportunity to use Generic Math interfaces here. Although we expect to mostly work with floating point numbers, there are some cases where integers may also be used. I've listed above the types of values we expect:
What these map to in the context of Generic Math interfaces, I'm not sure.
Beyond numeric interfaces, I noticed that there are also function interfaces like IExponentialFunctions and IHyperbolicFunctions which I think satisfy some of the operations proposed here.
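To make the generic math idea concrete, here is a minimal sketch of element-wise `Exp` and `Tanh` constrained on those function interfaces. It assumes .NET 7 or later (static abstract interface members). The `GenericElementwise` class and its method shapes are hypothetical, for illustration only, and are not part of this proposal or the BCL.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class; the names and signatures are assumptions.
public static class GenericElementwise
{
    // Works for any T that implements IExponentialFunctions<T> (float, double, Half, ...).
    public static void Exp<T>(ReadOnlySpan<T> source, Span<T> destination)
        where T : IExponentialFunctions<T>
    {
        if (destination.Length < source.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        for (int i = 0; i < source.Length; i++)
            destination[i] = T.Exp(source[i]);
    }

    // Works for any T that implements IHyperbolicFunctions<T>.
    public static void Tanh<T>(ReadOnlySpan<T> source, Span<T> destination)
        where T : IHyperbolicFunctions<T>
    {
        if (destination.Length < source.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        for (int i = 0; i < source.Length; i++)
            destination[i] = T.Tanh(source[i]);
    }
}
```

One consequence of this shape is that integer types cannot be used, since they do not implement these function interfaces.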
Are all of these operands single dimensional, or also 2-dim, n-dim? If the latter, we'll need to also think about how we describe dimension information. I doubt we'll want to use .NET multi-dimensional or jagged arrays (only) since those are not friendly to interop. Samples above show 2-dim jagged arrays.
> Samples above show 2-dim jagged arrays.

Which in particular? Skimming again, the inputs I see are all single dim... where there's jagged, they're actually treated as collections of arrays, where the inputs to the functions are all single dim, e.g.
var input1 = new[] { -0.4165, 0.1393, 1.1077 };
var input2 = new[]
{
    new[] { 0.5073, -0.2416, 1.7139 },
    new[] { -0.1245, -0.4960, -0.0715 }
};
// Compute the cosine similarity between input1 and each row of input2
var output = input2.Select(v => CosineSimilarity(input1, v));
// Outputs
// [0.7694, -0.1568]
I am still pondering the support for double; if we had proper generic support this would have been easier: https://github.com/dotnet/csharplang/discussions/6308#discussioncomment-3688642
> where the inputs to the functions are all single dim

I see - I missed that it was using LINQ over the rows and listing the output as an array instead of IEnumerable. For single dim the specified inputs seem reasonable 👍. I think @colombod mentioned it might be interesting to add some interface as an input.
Have opened https://github.com/dotnet/runtime/issues/89639 since the new design builds on this initial proposal and adjusts it to take into account the overall future direction desired here.
The biggest note is that it separates out the more direct concern of the downlevel support needed from the broader future support for other types and scenarios, which will likely be for modern .NET only and likely generic.
Closing as this was superseded by #89639
Background and motivation
System.Numerics is a powerful library in .NET that provides support for mathematical operations on vectors, matrices, and complex numbers. Because these are SIMD-enabled types, operations on those types can be optimized. The library already contains a wide range of mathematical operations, but there are some common operations that are missing. In this proposal, we propose the addition of APIs to System.Numerics for the following operations:
By providing a common optimized implementation for these operations, .NET library authors and developers can focus on high-level components of their application.
API proposal
The following are the proposed APIs for the operations listed above.
Additional details:
Cosine Similarity
Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space. It is widely used in natural language processing and information retrieval. The proposed API will take two floating point collections as input and return the cosine similarity between them.
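For reference, cosine similarity is the dot product of the two vectors divided by the product of their Euclidean lengths. The following is a minimal scalar sketch over `ReadOnlySpan<float>`. The `CosineSketch` class and its signature are hypothetical and only illustrate the computation, not the proposed API surface.

```csharp
using System;

// Hypothetical helper class; the name and signature are assumptions.
public static class CosineSketch
{
    public static float CosineSimilarity(ReadOnlySpan<float> x, ReadOnlySpan<float> y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Inputs must have the same length.");

        float dot = 0f, xx = 0f, yy = 0f;
        for (int i = 0; i < x.Length; i++)
        {
            dot += x[i] * y[i]; // numerator: dot product
            xx += x[i] * x[i];  // squared length of x
            yy += y[i] * y[i];  // squared length of y
        }

        // A zero-length input makes the denominator 0 and yields NaN.
        return dot / (MathF.Sqrt(xx) * MathF.Sqrt(yy));
    }
}
```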
Euclidean Distance
Euclidean distance is a measure of distance between two points in a Euclidean space. It is widely used in mathematics, engineering, and machine learning. The proposed API will take two floating point collections as input and return the Euclidean distance between the collections.
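As an illustration, the distance is the square root of the sum of squared element-wise differences. A minimal sketch follows; the `DistanceSketch` class and its signature are hypothetical, not the proposed API.

```csharp
using System;

// Hypothetical helper class; the name and signature are assumptions.
public static class DistanceSketch
{
    public static float EuclideanDistance(ReadOnlySpan<float> x, ReadOnlySpan<float> y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Inputs must have the same length.");

        float sumOfSquares = 0f;
        for (int i = 0; i < x.Length; i++)
        {
            float diff = x[i] - y[i];
            sumOfSquares += diff * diff;
        }

        return MathF.Sqrt(sumOfSquares);
    }
}
```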
Dot Product
Dot product is a mathematical operation that takes two vectors and returns a scalar. It is widely used in linear algebra and machine learning. The proposed API will take two floating point collections as input and return the dot product of the vectors.
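To show how a span-based version might build on the existing SIMD-enabled types, here is a rough sketch that accumulates with `Vector<float>` and finishes the remainder with a scalar loop. The `DotSketch` class is hypothetical; only `Vector<T>`, `Vector.Dot`, and `Vector.IsHardwareAccelerated` are existing `System.Numerics` members.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class; the name and signature are assumptions.
public static class DotSketch
{
    public static float Dot(ReadOnlySpan<float> x, ReadOnlySpan<float> y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Inputs must have the same length.");

        int i = 0;
        float sum = 0f;

        if (Vector.IsHardwareAccelerated && x.Length >= Vector<float>.Count)
        {
            Vector<float> acc = Vector<float>.Zero;
            int lastBlock = x.Length - Vector<float>.Count;

            // Multiply and accumulate Vector<float>.Count elements per iteration.
            for (; i <= lastBlock; i += Vector<float>.Count)
                acc += new Vector<float>(x.Slice(i)) * new Vector<float>(y.Slice(i));

            // Horizontal sum of the accumulator lanes.
            sum = Vector.Dot(acc, Vector<float>.One);
        }

        // Scalar loop for the remaining elements (or for everything without SIMD).
        for (; i < x.Length; i++)
            sum += x[i] * y[i];

        return sum;
    }
}
```

Because the SIMD path adds elements in a different order than a plain loop, results can differ slightly from a scalar implementation for inputs that are not exactly representable.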
Normalize (L2 / Euclidean Length)
Normalize is a mathematical operation that takes a vector and returns a unit vector in the same direction. It is widely used in linear algebra and machine learning. The proposed API will take a vector as input and return the normalized vector.
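A minimal sketch of L2 normalization, dividing each element by the vector's Euclidean length, is shown below. Writing into a caller-provided destination span is an assumption made for this example, as are the `NormalizeSketch` name and signature.

```csharp
using System;

// Hypothetical helper class; the name and signature are assumptions.
public static class NormalizeSketch
{
    public static void Normalize(ReadOnlySpan<float> source, Span<float> destination)
    {
        if (destination.Length < source.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        float sumOfSquares = 0f;
        foreach (float value in source)
            sumOfSquares += value * value;

        // A zero vector has length 0, so the division below yields NaN for it.
        float length = MathF.Sqrt(sumOfSquares);

        for (int i = 0; i < source.Length; i++)
            destination[i] = source[i] / length;
    }
}
```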
Softmax
Softmax is a function that takes a collection of real numbers and returns a probability distribution. It is widely used in machine learning and deep learning. The proposed API will take a real number collection as input and return the softmax of the collection.
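As a sketch of the computation, softmax exponentiates each element and divides by the sum of the exponentials. The version below subtracts the maximum first, a common trick to avoid overflow that does not change the result. The `SoftmaxSketch` class and the destination-span shape are assumptions for illustration.

```csharp
using System;

// Hypothetical helper class; the name and signature are assumptions.
public static class SoftmaxSketch
{
    public static void Softmax(ReadOnlySpan<float> source, Span<float> destination)
    {
        if (source.IsEmpty)
            throw new ArgumentException("Input must not be empty.", nameof(source));
        if (destination.Length < source.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        // Subtracting the maximum keeps every exponent <= 0, so Exp cannot overflow.
        float max = source[0];
        for (int i = 1; i < source.Length; i++)
            if (source[i] > max) max = source[i];

        float sum = 0f;
        for (int i = 0; i < source.Length; i++)
        {
            float e = MathF.Exp(source[i] - max);
            destination[i] = e;
            sum += e;
        }

        // Divide by the total so the outputs sum to 1.
        for (int i = 0; i < source.Length; i++)
            destination[i] /= sum;
    }
}
```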
Sigmoid
Sigmoid is a function that takes a real number and returns a value between 0 and 1. It is widely used in machine learning and deep learning. The proposed API will take a real number collection as input and return the sigmoid of the number for each element in the collection.
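For illustration, the element-wise sigmoid computes 1 / (1 + e^(-x)) for each input. The `SigmoidSketch` class and its signature below are hypothetical and only sketch the computation.

```csharp
using System;

// Hypothetical helper class; the name and signature are assumptions.
public static class SigmoidSketch
{
    public static void Sigmoid(ReadOnlySpan<float> source, Span<float> destination)
    {
        if (destination.Length < source.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        for (int i = 0; i < source.Length; i++)
            destination[i] = 1f / (1f + MathF.Exp(-source[i]));
    }
}
```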
Tanh
Tanh is a function that takes a real number and returns a value between -1 and 1. It is widely used in machine learning and deep learning as an activation function. The proposed API will take a real number collection as input and return the tanh for each element in the collection.
Exponential (Exp)
Exponential is a function that takes a real number and returns e raised to the power of the number. It is widely used in mathematics, engineering, and machine learning. The proposed API will take a real number collection as input and return the exponential for each element in the collection.
API Usage
Cosine Similarity
Euclidean Distance
Dot Product
Normalize
Related: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/SemanticKernel/AI/Embeddings/VectorOperations/NormalizeOperation.cs
Softmax
Sigmoid
Tanh
Exponential (Exp)
Use cases
Semantic Search, Clustering, and Recommendations using Embeddings
Distance metrics are useful when performing search, clustering, and recommendations because you want to surface related items. The following is an example from Semantic Kernel using Cosine Similarity to find similar embeddings from an embedding collection.
https://github.com/microsoft/semantic-kernel/blob/344bc54f329dd342bf9836913cdd96e25bbeb25f/samples/dotnet/kernel-syntax-examples/Example23_ReadOnlyMemoryStore.cs#L107-L138
See the following documents for more details:
Normalize ML Model Outputs
Sometimes, the outputs from a machine learning model have to be post-processed. Some common post-processing operations include calculating the sigmoid or softmax.
In this case, the Sigmoid and Softmax functions are being used to:
See the ML.NET ONNX Object Detection tutorial for more details.
Alternative Designs
Today, several of these operations are available in libraries like ML.NET, TorchSharp, Semantic Kernel, and TensorFlow.NET.
These implementations have challenges. In the case of ML.NET, the methods live in an internal class that's not publicly exposed. In the case of TorchSharp and TensorFlow.NET, the size of the library is large and the benefit of consuming the library just for a few functions does not outweigh the downsides of taking on such a large dependency.
The alternative is to implement your own, as Semantic Kernel has done, which, although not hard, is tedious and focused on functionality.
https://github.com/luisquintanilla/OAIDotnetZeroShotClassification/blob/7e37ba4fa3ff9d41387bbabe2cdd6de303135596/Program.cs#L45-L59
https://github.com/luisquintanilla/mlnet-workshop/issues/16#issuecomment-1374371524
Open Questions
- For `Exp` and `Tanh`, which already exist in the .NET BCL, is there anything else that needs to be done here to support Vectors and other types of collections?
- For `Dot`, which already exists in `System.Numerics`, is there anything else that needs to be done here to support other types of collections?
Risks