apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0
20.78k stars 6.79k forks source link

Perl Data Language types are hardcoded and need to not be #20851

Closed mohawk2 closed 2 years ago

mohawk2 commented 2 years ago

Description

@sergeykolychev This code shows hardcoded type-numbers (https://github.com/apache/incubator-mxnet/blob/v1.x/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm#L62-L84):

use constant DTYPE_MX_TO_PDL => {
    0 => 6,
    1 => 7,
    2 => 6,
    3 => 0,
    4 => 3,
    5 => 0,
    6 => 5,
    float32 => 6,
    float64 => 7,
    float16 => 6,
    uint8   => 0,
    int32   => 3,
    int8    => 0,
    int64   => 5
};

but PDL since 2.064 has changed these numbers due to an increased number of integer types, and the floating-point types needing to all be above all the integer types.

This is the same problem as https://github.com/apache/incubator-mxnet/issues/20850 which can be closed.

Error Message

Code below prints [5,0] instead of [5, 9].

To Reproduce

From https://github.com/PDLPorters/pdl/issues/377:

use strict;
use warnings;
use AI::MXNet qw(mx);
use AI::MXNet qw(nd);
my $x = mx->nd->array([5, 9]);
print $x->aspdl;

Steps to reproduce

Run the above Perl code.

What have you tried to solve it?

Instead the code (one way only) should be something like this in order to work on all recent-ish version of PDL (both before and after the added types, though sbyte is new and would need to be either adjusted for lower-versioned PDL, or simply a dependency on PDL 2.064 added):

use constant DTYPE_MX_TO_PDL => {
    0 => PDL::Type->new('float')->enum,
    1 => PDL::Type->new('double')->enum,
# half-precision is not supported by PDL and this would cause chaos
#    2 => ,
    3 => PDL::Type->new('byte')->enum,
    4 => PDL::Type->new('long')->enum,
    5 => PDL::Type->new('sbyte')->enum,
    6 => PDL::Type->new('longlong')->enum,
    float32 => PDL::Type->new('float')->enum,
    float64 => PDL::Type->new('double')->enum,
# half-precision is not supported by PDL and this would cause chaos
#    float16 => ,
    uint8   => PDL::Type->new('byte')->enum,
    int32   => PDL::Type->new('long')->enum,
    int8    => PDL::Type->new('sbyte')->enum,
    int64   => PDL::Type->new('longlong')->enum,
};

Environment

Perl 5.32, CPAN AI::MXNet 1.5, PDL 2.068.

github-actions[bot] commented 2 years ago

Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue. Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly. If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.

sergeykolychev commented 2 years ago

@mohawk2 Can you file PR against latest branch of mxnet where perl-package exist. I'll help to merge it in.

mohawk2 commented 2 years ago

@sergeykolychev Wow, that's a quick reply! Is there CI? I don't have the means to easily install MXNet to actually test it before making a PR.

sergeykolychev commented 2 years ago

@mohawk2 yes, there's CI

sergeykolychev commented 2 years ago

@mohawk2 fixing mxnet to older version of PDL is also acceptable.

mohawk2 commented 2 years ago

@sergeykolychev What is the correct approach for float16? Trying to make that be a PDL float ie float32 seems like a recipe for disaster?

zmughal commented 2 years ago

Hi, I'll get a PR ready for this in a few. This is against the https://github.com/apache/incubator-mxnet/tree/v1.x branch?

sergeykolychev commented 2 years ago

@zmughal yes. (or maybe only against 1.9.x ?) @mohawk2 Use PDL::Type->new('float')->enum The code has a special case handling for float16 so nothing will break. You also need to change DTYPE_PDL_TO_MX if you want to be consistent.

mohawk2 commented 2 years ago

My thinking is requiring a minimum version of PDL would be better :-)

sergeykolychev commented 2 years ago

@zmughal @mohawk2 I'd personally would like to see the hardcoded numbers gone (assuming it's backwards compatible with older versions of PDL).

mohawk2 commented 2 years ago

The PDL::Type->new('typename') API has been there for some time (at least 2.019: https://github.com/PDLPorters/pdl/blob/v2.019/Basic/Core/Types.pm.PL#L314). The only issue would be allowing the new sbyte type, but switching that to just byte in my snippet above doesn't seem like it would be an important loss given it's been fine with unsigned up till now.

jehosha commented 2 years ago

Hi guys,

I'm here to confirm that just by editing the file AI/MXNet/Base.pm and just by replacing the following code:

use constant DTYPE_MX_TO_PDL => { 0 => 6, 1 => 7, 2 => 6, 3 => 0, 4 => 3, 5 => 0, 6 => 5, float32 => 6, float64 => 7, float16 => 6, uint8 => 0, int32 => 3, int8 => 0, int64 => 5 };

to

use constant DTYPE_MX_TO_PDL => { 0 => PDL::Type->new('float')->enum, 1 => PDL::Type->new('double')->enum,

half-precision is not supported by PDL and this would cause chaos

2 => ,

3 => PDL::Type->new('byte')->enum,
4 => PDL::Type->new('long')->enum,
5 => PDL::Type->new('sbyte')->enum,
6 => PDL::Type->new('longlong')->enum,
float32 => PDL::Type->new('float')->enum,
float64 => PDL::Type->new('double')->enum,

half-precision is not supported by PDL and this would cause chaos

float16 => ,

uint8   => PDL::Type->new('byte')->enum,
int32   => PDL::Type->new('long')->enum,
int8    => PDL::Type->new('sbyte')->enum,
int64   => PDL::Type->new('longlong')->enum,

};

is enough to solve the issue. The version of PDL I am using is 2.068_06, which is a development version: cpanm --dev PDL It required to remove and reinstall the following two libraries: PDL::VectorValued::Utils (1.0.14) and PDL::CCS::Utils (1.23.17)

cpanm --uninstall PDL::VectorValued::Utils cpanm PDL::VectorValued::Utils

cpanm --uninstall PDL::CCS::Utils cpanm PDL::CCS::Utils

Thank you very much for the support.

The issue is now closed, since we now have a solution :-)

zmughal commented 2 years ago

This can be closed. Merged https://github.com/apache/incubator-mxnet/pull/20852.

jehosha commented 2 years ago

Now I have an issue with DTYPE_PDL_TO_MX. It is not consistent. Here is the definition foun din Base.pm:

use constant DTYPE_PDL_TO_MX => { 6 => 0, 7 => 1, 0 => 3, 3 => 4, 5 => 6 };

However I have value 4 that is not present in the above hash:

*PDL::dtype  = sub { my $variable = shift->type->numval;
                     print "variable=", $variable, "\n";
                     DTYPE_MX_TO_STR->{ DTYPE_PDL_TO_MX->{ $variable } } };

The variable has value == 4, which produces the following error:

Use of uninitialized value in hash element at perl5/lib/perl5/AI/MXNet/Base.pm line 465.

The question is how to define 4 in the hash table DTYPE_PDL_TO_MX?

zmughal commented 2 years ago

@jehosha, that should already be handled because if you look at the merged code here: https://github.com/apache/incubator-mxnet/blob/v1.9.x/perl-package/AI-MXNet/lib/AI/MXNet/Base.pm#L78=, it uses the PDL::Type object to get the needed information.

You can confirm that the correct type is there by running:

$ perl -MPDL -E 'say join "|", grep { $_->enum == 4}  PDL::Types::types()'
long
mohawk2 commented 2 years ago

Agreed - it seems most likely from the reported symptom that @jehosha is not using the fixed version of the code.

jehosha commented 2 years ago

Dear mohawk2,

Thank you for the answer.

Best regards,

Jehosha.

On Mon, Apr 4, 2022 at 9:50 AM mohawk2 @.***> wrote:

Agreed - it seems most likely from the reported symptom that @jehosha https://github.com/jehosha is not using the fixed version of the code.

— Reply to this email directly, view it on GitHub https://github.com/apache/incubator-mxnet/issues/20851#issuecomment-1087655146, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADAA47BHWELKB5X5MGWRLELVDL6ULANCNFSM5M34BX5A . You are receiving this because you were mentioned.Message ID: @.***>

-- Dr. Josafá Pontes Natural language processing computer scientist. Associate professor of the National Polytechnic School of Ecuador