Open stephentoub opened 11 months ago
Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
See info in area-owners.md if you want to be subscribed.

| Author: | stephentoub |
|---|---|
| Assignees: | - |
| Labels: | `api-suggestion`, `area-System.Numerics.Tensors` |
| Milestone: | 9.0.0 |
Could you please elaborate on the advantages of having these APIs in a BCL rather than in a specialized NuGet package (like numpy in Python)? This could provide a valuable perspective for further discussion.
> Could you please elaborate on the advantages of having these APIs in a BCL rather than in a specialized NuGet package

It is a nuget package today. It's currently not part of netcoreapp. If it were to be pulled into netcoreapp as well, it would be because we'd be using it from elsewhere in netcoreapp, e.g. using it from APIs like Enumerable.Average, BitArray.And, ManagedWebSocket.ApplyMask, etc., which we very well may do in the future (that has no impact on it continuing to be available as a nuget package).
Hey @stephentoub,
Would it be possible to expose the low level parts of the API instead of only providing Span versions?
e.g.
public static Vector128<float> Log2(Vector128<float> value);
public static Vector256<float> Log2(Vector256<float> value);
public static Vector512<float> Log2(Vector512<float> value);
//...etc.
I did that for a prototype of a similar API and it's working great. One reason to expose these APIs is that you can build higher-level functions on them (e.g. for tensors, the zoo of activation functions) and then build the Span versions on top.
These APIs can also be used for other kinds of custom Span batching (not related to tensors) where the packing of the vector is different (e.g. 4 x float chunked as xxxx, yyyy, zzzz).
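To illustrate the layering described above, here is a minimal sketch, not the prototype mentioned in this comment. It uses `Vector<float>` from `System.Numerics` rather than `Vector128/256/512`, and the class name `ActivationSketch` and its `ReLU` overloads are hypothetical names chosen for the example. NaN handling is ignored.

```csharp
using System;
using System.Numerics;

// Hypothetical type for illustration only; not part of System.Numerics.Tensors.
public static class ActivationSketch
{
    // Vector-level kernel: ReLU(x) = max(x, 0) applied to one hardware vector.
    public static Vector<float> ReLU(Vector<float> x) => Vector.Max(x, Vector<float>.Zero);

    // Scalar kernel for the elements that do not fill a whole vector.
    public static float ReLU(float x) => MathF.Max(x, 0f);

    // Span-level version layered on top of the two kernels above.
    public static void ReLU(ReadOnlySpan<float> x, Span<float> destination)
    {
        if (destination.Length < x.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        int i = 0;
        int width = Vector<float>.Count;

        // Process as many full vectors as fit in the input.
        for (; i <= x.Length - width; i += width)
        {
            ReLU(new Vector<float>(x.Slice(i))).CopyTo(destination.Slice(i));
        }

        // Process the remaining tail one element at a time.
        for (; i < x.Length; i++)
        {
            destination[i] = ReLU(x[i]);
        }
    }
}
```

With the vector overload public, a caller who packs data differently (for example the xxxx, yyyy, zzzz layout) could reuse the kernel directly and skip the Span loop.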
> Would it be possible to expose the low level parts of the API instead of only providing Span versions?

Yes, but it needs to be its own proposal and cover all 5 vector types (Vector<T>, Vector64/128/256/512<T>)
Cool, I will try to write something.
> Would it be possible to expose the low level parts of the API instead of only providing Span versions?

Follow-up, created the proposal #93513
@stephentoub:
> If it were to be pulled into netcoreapp as well, it would be because we'd be using it from elsewhere in netcoreapp

If brought to the BCL, wouldn't it make sense to rename `TensorPrimitives` to, let's say, `ArrayMath`, `VectorMath`, or `VectorPrimitives`? Tensor seems a bit exaggerated for what it does, namely doing some math on arrays.
@msedi that would be a breaking change. Additionally, the intent is to expand it to the full set of BLAS support, so Tensor is a very apt and appropriate name that was already scrutinized, reviewed, and approved by API review.
@tannergooding: Sure, you're right. I was just under the impression that there could be something more primitive. The tensor is something, let's say, higher level, whereas the vector/array methods are on a lower level. But I'm completely fine with it as long as I know where to find it.
BTW, looking at the code and the effort that went into `TensorPrimitives`: is there any chance the JIT will some day manage to do the SIMD unfolding for us?
> the JIT will some day manage to do the SIMD unfolding for us?

The JIT is unlikely to get auto-vectorization in the near future as such support is complex and quite expensive to do. Additionally, outside of particular domains, such support does not often light up and has measurable impact to real world apps even less frequently. Especially for small workloads it can often have the opposite effect and slow down your code. In the domains where it does light up, and particularly where it would be beneficial to do, you are often going to get better perf by writing your own SIMD code directly.
It is therefore my opinion that our efforts would be better spent providing APIs from the BCL that provide this acceleration for you, such as all the APIs on `Span<T>`, accelerating LINQ, the new APIs on `TensorPrimitives`, etc. It may likewise be beneficial to expose some SIMD infrastructure helpers like we've defined internally for `TensorPrimitives`; that is, expose some public form of `InvokeSpanSpanIntoSpan` and friends, which would allow developers to only worry about providing the inner kernel and to have the rest of the SIMD logic (leading/trailing elements, alignment, unrolling, etc.) handled internally. Efforts like `ISimdVector<TSelf, T>` also fit the bill of making it simpler for devs to write SIMD code.
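As a rough illustration of the "only provide the inner kernel" idea, here is a minimal sketch under stated assumptions. It is not the internal `TensorPrimitives` helper: the class name `SimdHelperSketch`, the delegate-based signature, and the use of `Vector<float>` are choices made for this example only, and alignment and unrolling are left out entirely.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration; the name only echoes the internal one.
public static class SimdHelperSketch
{
    public static void InvokeSpanSpanIntoSpan(
        ReadOnlySpan<float> x,
        ReadOnlySpan<float> y,
        Span<float> destination,
        Func<Vector<float>, Vector<float>, Vector<float>> vectorKernel,
        Func<float, float, float> scalarKernel)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Inputs must have the same length.");
        if (destination.Length < x.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        int i = 0;

        // Vector path: the caller's kernel sees one full vector pair at a time.
        if (Vector.IsHardwareAccelerated)
        {
            int width = Vector<float>.Count;
            for (; i <= x.Length - width; i += width)
            {
                var vx = new Vector<float>(x.Slice(i));
                var vy = new Vector<float>(y.Slice(i));
                vectorKernel(vx, vy).CopyTo(destination.Slice(i));
            }
        }

        // Trailing elements (or everything, without hardware acceleration).
        for (; i < x.Length; i++)
        {
            destination[i] = scalarKernel(x[i], y[i]);
        }
    }
}
```

A caller would then write only the kernels, for example `SimdHelperSketch.InvokeSpanSpanIntoSpan(a, b, result, (vx, vy) => vx * vy, (sx, sy) => sx * sy);`. Delegates keep the sketch short, but each call goes through a delegate invocation; a production helper would more likely take the kernel as a generic struct operator so the JIT can inline it.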
@tannergooding: Thanks for the info. That makes sense. For our case we wrote source generators to generate all the array primitives, currently with Vector
Remaining work is for .NET 10
Regardless of any additional types we may want to add to `System.Numerics.Tensors`, we would like to expand the set of APIs exposed on the `TensorPrimitives` static class in a few ways (beyond the work done in .NET 8 in https://github.com/dotnet/runtime/issues/92219):

- `ConvertXx`, `CosineSimilarity`, `IndexOfMin`, `IndexOfMax`, `IndexOfMinMagnitude`, `IndexOfMaxMagnitude`
- [...] `TensorPrimitives` [...] `CpuMath` class from ML.NET, e.g. `Add` (with indices), `AddScale` (with indices), `DotProductSparse`, `MatrixTimesSource`, `ScaleAdd` improvement via `AddMultiply` or `MultipleAdd` overloads, `SdcaL1UpdateDense`, `SdcaL1UpdateSparse`, and `ZeroMatrixItems` (might exist in System.Memory).
- [...] `main` after all of the alignment [...]
- [...] `Min`, `Max`, `MinMagnitude`, `MaxMagnitude` with relation to NaN handling
- [...] `0` if we want to throw or return `NaN` (we consistently throw today when non-0 is required; ML.NET apparently returns 0?) - @tannergooding
- [...] `Math{F}` that don't currently have representation on `TensorPrimitives`, e.g. `CopySign`, `Reciprocal{Sqrt}{Estimate}`, `Sqrt`, `Ceiling`, `Floor`, `Truncate`, `Log10`, `Log(x, y)` (with y as both span and scalar), `Pow(x, y)` (with y as both span and scalar), `Cbrt`, `IEEERemainder`, `Acos`, `Acosh`, `Cos`, `Asin`, `Asinh`, `Sin`, `Atan`. This unmerged commit has a sketch, but it's out-of-date with improvements that have been made to the library since, and all of the operations should be vectorized.
- [...] `TensorPrimitives`, e.g. `BitwiseAnd`, `BitwiseOr`, `BitwiseXor`, `Exp10`, `Exp10M1`, `Exp2`, `Exp2M1`, `ExpM1`, `Atan2`, `Atan2Pi`, `ILogB`, `Lerp`, `ScaleB`, `Round`, `Log10P1`, `Log2P1`, `LogP1`, `Hypot`, `RootN`, `AcosPi`, `AsinPi`, `AtanPi`, `CosPi`, `SinPi`, `TanPi`

We plan to update the System.Numerics.Tensors package alongside .NET 8 servicing releases. When there are bug fixes and performance improvements only, the patch number part of the version will be incremented. When there are new APIs added, the minor version will be bumped. For guidance on how we bump minor/major package versions, see this example.