Quansight-Labs / numpy.net

A port of NumPy to .Net
BSD 3-Clause "New" or "Revised" License

Can numpy.net convert ndarray to Array type? #43

Closed by ChengYen-Tang 1 year ago

ChengYen-Tang commented 1 year ago

If an ndarray has N dimensions, restoring it to T[,,,,,,,,,,,,] is very troublesome. Maybe numpy.net could use implicit conversion operators to convert an ndarray to the Array type.
https://learn.microsoft.com/en-us/dotnet/api/system.array?view=net-7.0
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/user-defined-conversion-operators

That way I could easily convert an ndarray to a torch.tensor 🤣
https://github.com/dotnet/TorchSharp/blob/c55e9f535aa46c4dfa104d839019553b693616c0/src/TorchSharp/Tensor/Tensor.Factories.cs#L2654

int[,,] a = new int[3, 5, 7];
Array b = a;
Tensor c = as_tensor(b);
Console.WriteLine(c);

// [3x5x7], type = Int32, device = cpu

KevinBaselinesw commented 1 year ago

I did not get an email on this comment like I usually do. It is lucky that I checked these boards this morning.

Internally, Numpy stores all arrays as single dimension arrays. When you add a shape to it, you are just adjusting some internal values that make it appear to be multi-dimensional.
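
The idea in the paragraph above can be sketched roughly as follows. This is only an illustration, assuming a row-major (C-order) layout; it is not numpy.net's actual internal code, and the names FlatIndexSketch, RowMajorStrides and Offset are hypothetical.

```csharp
using System;

// Hypothetical illustration; not numpy.net's internal layout code.
public static class FlatIndexSketch
{
    // Row-major strides, measured in elements: the last axis is contiguous.
    public static int[] RowMajorStrides(int[] shape)
    {
        var strides = new int[shape.Length];
        int step = 1;
        for (int axis = shape.Length - 1; axis >= 0; axis--)
        {
            strides[axis] = step;
            step *= shape[axis];
        }
        return strides;
    }

    // Maps a multi-dimensional index to an offset into the flat buffer.
    public static int Offset(int[] shape, params int[] index)
    {
        if (index.Length != shape.Length)
            throw new ArgumentException("index rank does not match shape rank");

        int[] strides = RowMajorStrides(shape);
        int offset = 0;
        for (int axis = 0; axis < shape.Length; axis++)
        {
            if (index[axis] < 0 || index[axis] >= shape[axis])
                throw new IndexOutOfRangeException($"index out of range on axis {axis}");
            offset += index[axis] * strides[axis];
        }
        return offset;
    }
}
```

With shape (3, 5, 7) the strides come out as (35, 7, 1), so the element at index (1, 2, 3) sits at offset 1*35 + 2*7 + 3 = 52 in the flat buffer. Changing the shape only changes this arithmetic; the buffer itself stays the same.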

Are you suggesting that we add something like this to the library? It would have to be expanded to include all 16 datatypes. If we make it work for up to 5 dimensions (ranks 2 through 5), that is 16 x 4 = 64 separate explicit operators.

Can you think of an easier way to do this?

        public static explicit operator int[,] (ndarray nd)
        {
            if (nd.ndim != 2)
                throw new Exception("ndarray does not have 2 dimensions");

            // todo: convert nd to int[,] array

            return new int[,] { { 0 }, { 1 } };
        }

        public static explicit operator int[,,] (ndarray nd)
        {
            if (nd.ndim != 3)
                throw new Exception("ndarray does not have 3 dimensions");

            // todo: convert nd to int[,,] array

            return new int[,,] { { { 1, 2 }, { 3, 4 }, { 5, 6 } } };
        }

        public static explicit operator int[,,,] (ndarray nd)
        {
            if (nd.ndim != 4)
                throw new Exception("ndarray does not have 4 dimensions");

            // todo: convert nd to int[,,,] array

            return new int[,,,] { { { { 1, 2 }, { 3, 4 }, { 5, 6 } } } };
        }

        public static explicit operator double[,,,] (ndarray nd)
        {
            if (nd.ndim != 4)
                throw new Exception("ndarray does not have 4 dimensions");

            // todo: convert nd to double[,,,] array

            return new double[,,,] { { { { 1, 2 }, { 3, 4 }, { 5, 6 } } } };
        }

        public static explicit operator System.Numerics.Complex[,,,] (ndarray nd)
        {
            if (nd.ndim != 4)
                throw new Exception("ndarray does not have 4 dimensions");

            // todo: convert nd to System.Numerics.Complex[,,,] array

            return new System.Numerics.Complex[,,,] { { { { 1, 2 }, { 3, 4 }, { 5, 6 } } } };
        }

The code to call them would look like this:

        [TestMethod]
        public void test_AsMultiDimension()
        {
            ndarray a = np.array(new int[,] { { 0 }, { 1 } } );
            AssertArray(a, new int[,] { { 0 }, { 1 } });

            int[,] b = (int[,])a;
            AssertArray(a, b);

            ndarray c = np.array(new int[,,] { { { 1, 2 }, { 3, 4 }, { 5, 6 } } });
            AssertArray(c, new int[,,] { { { 1, 2 }, { 3, 4 }, { 5, 6 } } });
            int[,,] d = (int[,,])c;
            AssertArray(c, d);

        }

ChengYen-Tang commented 1 year ago

It seems the feature I want would be a big hassle for the library. I am developing a library myself, and other developers pass me an ndarray, which may have an arbitrary shape. I hope numpy.net can convert this ndarray directly into a Torch.Tensor.

The behavior I expect is as follows. Thanks

int[,,] a = new int[3, 5, 7];
ndarray array = np.array(a);
Console.WriteLine(ToTensor(array));
//Output [3x5x7], type = Int32, device = cpu

public Torch.Tensor ToTensor(ndarray input)
{
    Array array = input;
    return Torch.as_tensor(array);
}

KevinBaselinesw commented 1 year ago

In the latest release, 0.9.83, I modified ndarray.ToArray<T>() so that it now returns multi-dimensional .NET arrays if the ndarray is shaped. As the examples below show, pass the array type T that you expect to get back and cast the result to that type. The method returns System.Array, so the cast is required.

I think this should solve the problem you are having.

        [TestMethod]
        public void test_ToArray()
        {
            // 2D array tests
            var adata = new int[,] { { 1, 2 }, { 3, 4 }, { 5, 6 } };
            ndarray a = np.array(adata);

            var a1 = (int[,])a.ToArray<int[,]>();
            AssertArray(a, adata);
            AssertArray(a, a1);

            // 3D array test
            var bdata = new int[,,] { { { 14 }, { 13 }, { 12 }, { 11 } }, { { 18 }, { 17 }, { 16 }, { 15 } }, { { 22 }, { 21 }, { 20 }, { 19 } } };
            ndarray b = np.array(bdata);

            var b1 = (int[,,])b.ToArray<int[,,]>();
            AssertArray(b, bdata);
            AssertArray(b, b1);

            // 4D array test
            var cdata = new int[,,,] { { { { 1, 0 }, { 3, 2 } }, { { 5, 4 }, { 7, 6 } } }, { { { 9, 8 }, { 11, 10 } }, { { 13, 12 }, { 15, 14 } } } };
            ndarray c = np.array(cdata);

            var c1 = (int[,,,])c.ToArray<int[,,,]>();
            AssertArray(c, cdata);
            AssertArray(c, c1);

            // 5D array test
            var ddata = new int[,,,,] { { { { { 1, 0 }, { 3, 2 } }, { { 5, 4 }, { 7, 6 } } }, { { { 9, 8 }, { 11, 10 } }, { { 13, 12 }, { 15, 14 } } } } };
            ndarray d = np.array(ddata);

            var d1 = (int[,,,,])d.ToArray<int[,,,,]>();
            AssertArray(d, ddata);
            AssertArray(d, d1);

        }

ChengYen-Tang commented 1 year ago

The problem I have now is that what the developer gives me is an ndarray. Because I am writing a library, I don't know which array type to convert it into. It may be an int[], an int[,] or an int[,,]. Although ndarray has a shape attribute, how can I dynamically convert it to int[] or int[,,]?

So I would like numpy.net to return a System.Array, so that I can hand it directly to TorchSharp for all further processing.

// This is the code written by developers using my library.
int[,,] a = new int[3, 5, 7];
ndarray array = np.array(a);
Console.WriteLine(ToTensor(array));
//Output [3x5x7], type = Int32, device = cpu

public Torch.Tensor ToTensor(ndarray input)
{
    // This is where I am stuck.
    // From the dtype I know that the element type is int.
    // From the shape I know that the array shape is (3, 5, 7).
    // But how can I dynamically create an int[3, 5, 7] array to convert the ndarray into?
    Array array = input;
    return Torch.as_tensor(array);
}

KevinBaselinesw commented 1 year ago

I just pushed up a new release with ndarray.ToSystemArray(), which does the same thing as ToArray<T>() but does not require a generic type argument.

        [TestMethod]
        public void test_ToArray()
        {
            // 2D array tests
            var adata = new int[,] { { 1, 2 }, { 3, 4 }, { 5, 6 } };
            ndarray a = np.array(adata);

            var a1 = (int[,])a.ToArray<int[,]>();
            var a2 = (int[,])a.ToSystemArray();
            AssertArray(a, adata);
            AssertArray(a, a1);
            AssertArray(a, a2);

            // 3D array test
            var bdata = new int[,,] { { { 14 }, { 13 }, { 12 }, { 11 } }, { { 18 }, { 17 }, { 16 }, { 15 } }, { { 22 }, { 21 }, { 20 }, { 19 } } };
            ndarray b = np.array(bdata);

            var b1 = (int[,,])b.ToArray<int[,,]>();
            var b2 = (int[,,])b.ToSystemArray();
            AssertArray(b, bdata);
            AssertArray(b, b1);
            AssertArray(b, b2);

            // 4D array test
            var cdata = new int[,,,] { { { { 1, 0 }, { 3, 2 } }, { { 5, 4 }, { 7, 6 } } }, { { { 9, 8 }, { 11, 10 } }, { { 13, 12 }, { 15, 14 } } } };
            ndarray c = np.array(cdata);

            var c1 = (int[,,,])c.ToArray<int[,,,]>();
            var c2 = (int[,,,])c.ToSystemArray();
            AssertArray(c, cdata);
            AssertArray(c, c1);
            AssertArray(c, c2);

            // 5D array test
            var ddata = new int[,,,,] { { { { { 1, 0 }, { 3, 2 } }, { { 5, 4 }, { 7, 6 } } }, { { { 9, 8 }, { 11, 10 } }, { { 13, 12 }, { 15, 14 } } } } };
            ndarray d = np.array(ddata);

            var d1 = (int[,,,,])d.ToArray<int[,,,,]>();
            var d2 = (int[,,,,])d.ToSystemArray();
            AssertArray(d, ddata);
            AssertArray(d, d1);
            AssertArray(d, d2);

        }

ChengYen-Tang commented 1 year ago

Thank you, this solved my problem perfectly.

ChengYen-Tang commented 1 year ago

I found this exception in the source code. https://github.com/Quansight-Labs/numpy.net/blob/17a7ce837513736b38fe814aacbb7f9213815937/src/NumpyDotNet/NumpyDotNet/Extensions.cs#L993

Maybe Array.CreateInstance could be used here:

Array array = Array.CreateInstance(typeof(int), 10, 10);

https://learn.microsoft.com/zh-tw/dotnet/api/system.array.createinstance?view=net-7.0#system-array-createinstance(system-type-system-int32())
https://learn.microsoft.com/zh-tw/dotnet/api/system.array.setvalue?view=net-7.0#system-array-setvalue(system-object-system-int32())
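
As a small sketch of why this helps: Array.CreateInstance takes the element type and the lengths at run time, and the object it returns has the matching multi-dimensional array type, so it can still be cast when the rank happens to be known. The helper name CreateInstanceSketch.Zeros is hypothetical and not part of numpy.net.

```csharp
using System;

// Hypothetical helper for illustration; not part of numpy.net.
public static class CreateInstanceSketch
{
    // Builds a zero-filled array whose element type and rank are only known at run time.
    public static Array Zeros(Type elementType, params int[] shape)
    {
        return Array.CreateInstance(elementType, shape);
    }

    public static void Demo()
    {
        Array a = Zeros(typeof(int), 3, 5, 7);
        Console.WriteLine(a.Rank);        // 3
        Console.WriteLine(a.GetType());   // System.Int32[,,]

        // Write through the untyped Array API, read through the typed view.
        a.SetValue(42, 1, 2, 3);
        int[,,] typed = (int[,,])a;
        Console.WriteLine(typed[1, 2, 3]); // 42
    }
}
```

A caller that never needs the concrete type can keep working with the System.Array reference and skip the cast entirely.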

KevinBaselinesw commented 1 year ago

Are you saying you need to support more than 5 dimensions? I know NumPy supports as many as 32 dimensions, but I didn't think that was used very frequently. Are you hitting the mentioned exception for an array with fewer than 6 dimensions?

ChengYen-Tang commented 1 year ago

I don't know; maybe a developer using my library, or some torch parameter, will need it? 🤣 I am just sharing a method that makes the code concise and flexible. It can also give you feedback when there is a problem.

int[] sizes = new int[] { 10, 10 };

Array array = Array.CreateInstance(typeof(int), sizes);
int[] indexes = new int[array.Rank];

int count = 0;
while (true)
{
    array.SetValue(count, indexes);
    count++;

    for (int i = array.Rank - 1; i >= 0; i--)
    {
        if (indexes[i] < array.GetLength(i) - 1)
        {
            indexes[i]++;
            break;
        }
        else
        {
            indexes[i] = 0;
            if (i == 0)
            {
                return;
            }
        }
    }
}
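
For element types that are .NET primitives, a bulk copy is a possible alternative to the element-by-element loop above. The following is a sketch under that assumption, not what numpy.net does: the name ReshapeSketch.FromFlat is hypothetical, the source buffer is assumed to be flat and row-major, and Buffer.BlockCopy throws for non-primitive types such as System.Numerics.Complex, so those would still need the SetValue loop.

```csharp
using System;

// Hypothetical helper for illustration; not numpy.net's implementation.
public static class ReshapeSketch
{
    // Copies a flat, row-major buffer of a primitive type into a new
    // multi-dimensional array with the given shape.
    public static Array FromFlat<T>(T[] flat, params int[] shape) where T : struct
    {
        long count = 1;
        foreach (int n in shape) count *= n;
        if (count != flat.Length)
            throw new ArgumentException("shape does not match buffer length");

        Array result = Array.CreateInstance(typeof(T), shape);

        // Buffer.BlockCopy counts in bytes and only accepts arrays of primitive types.
        Buffer.BlockCopy(flat, 0, result, 0, Buffer.ByteLength(flat));
        return result;
    }
}
```

For example, ReshapeSketch.FromFlat(new[] { 0, 1, 2, 3, 4, 5 }, 2, 3) returns an int[,] whose element [1, 0] is 3, because .NET multi-dimensional arrays are stored in row-major order.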

KevinBaselinesw commented 1 year ago

New release. I have updated ToSystemArray() to support up to 18 dimensions.

KevinBaselinesw commented 1 year ago

I figured out how to use your sample function. It does eliminate several hundred lines of redundant code. See it in the new release.

Thank you!

ChengYen-Tang commented 1 year ago

I thank you too