Compiler: Missed optimisation due to incorrect inferred type for pure integer subtraction

michalmuskala commented 6 months ago

Describe the bug The compiler fails to properly optimise integer subtraction by miscalculating the possible value range.

To Reproduce Given the following code:

-module(test).

-export([test/1]).

-define(is_1_to_9(X), X =:= $1; X =:= $2; X =:= $3; X =:= $4; X =:= $5; X =:= $6; X =:= $7; X =:= $8; X =:= $9).
-define(is_0_to_9(X), X =:= $0; ?is_1_to_9(X)).

test(<<B1, B2>>) ->
    integer(B1) + integer(B2).

integer(B) when ?is_0_to_9(B) -> B - $0;
integer(_) -> error(badarg).

compiling with erlc +to_asm test.erl produces an assembly file with following:

{function, test, 1, 2}.
  {label,1}.
    {line,[{location,"test.erl",8}]}.
    {func_info,{atom,test},{atom,test},1}.
  {label,2}.
    {'%',{var_info,{x,0},[accepts_match_context]}}.
    {test,bs_start_match3,{f,1},1,[{x,0}],{x,1}}.
    {bs_get_position,{x,1},{x,0},2}.
    {bs_match,{f,3},
              {x,1},
              {commands,[{ensure_exactly,16},
                         {integer,2,{literal,[]},8,1,{x,2}},
                         {integer,3,{literal,[]},8,1,{x,3}}]}}.
    {allocate,1,4}.
    {move,{x,3},{y,0}}.
    {move,{x,2},{x,0}}.
    {line,[{location,"test.erl",9}]}.
    {call,1,{f,5}}. % integer/1
    {'%',{var_info,{x,0},[{type,{t_integer,{'-inf',9}}}]}}.
    {swap,{y,0},{x,0}}.
    {call,1,{f,5}}. % integer/1
    {'%',{var_info,{x,0},[{type,{t_integer,{'-inf',9}}}]}}.
    {gc_bif,'+',
            {f,0},
            1,
            [{tr,{y,0},{t_integer,{'-inf',9}}},
             {tr,{x,0},{t_integer,{'-inf',9}}}],
            {x,0}}.
    {deallocate,1}.
    return.
  {label,3}.
    {bs_set_position,{x,1},{x,0}}.
    {bs_get_tail,{x,1},{x,0},2}.
    {jump,{f,1}}.

{function, integer, 1, 5}.
  {label,4}.
    {line,[{location,"test.erl",11}]}.
    {func_info,{atom,test},{atom,integer},1}.
  {label,5}.
    {'%',{var_info,{x,0},[{type,{t_integer,{0,255}}}]}}.
    {select_val,{tr,{x,0},{t_integer,{0,255}}},
                {f,7},
                {list,[{integer,48},
                       {f,6},
                       {integer,49},
                       {f,6},
                       {integer,50},
                       {f,6},
                       {integer,51},
                       {f,6},
                       {integer,52},
                       {f,6},
                       {integer,53},
                       {f,6},
                       {integer,54},
                       {f,6},
                       {integer,55},
                       {f,6},
                       {integer,56},
                       {f,6},
                       {integer,57},
                       {f,6}]}}.
  {label,6}.
    {gc_bif,'-',{f,0},1,[{tr,{x,0},{t_integer,{48,57}}},{integer,48}],{x,0}}.
    return.
  {label,7}.
    {move,{atom,badarg},{x,0}}.
    {line,[{location,"test.erl",12}]}.
    {call_ext_only,1,{extfunc,erlang,error,1}}.

The important part is the subtraction call

{gc_bif,'-',{f,0},1,[{tr,{x,0},{t_integer,{48,57}}},{integer,48}],{x,0}}.

which has correctly inferred range of arguments 48..57 - 48. However the resulting value is inferred as -inf..9, rather than 0..9. This prevents the JIT from emitting optimised operations for small integers. This can be seen in:

    {call,1,{f,5}}. % integer/1
    {'%',{var_info,{x,0},[{type,{t_integer,{'-inf',9}}}]}}.

and later

    {gc_bif,'+',
            {f,0},
            1,
            [{tr,{y,0},{t_integer,{'-inf',9}}},
             {tr,{x,0},{t_integer,{'-inf',9}}}],
            {x,0}}.

Expected behavior The emitted operations should be optimised to operate on small integers exclusively. In particular I'd expect to see:

    {call,1,{f,5}}. % integer/1
    {'%',{var_info,{x,0},[{type,{t_integer,{0,9}}}]}}.

and later

    {gc_bif,'+',
            {f,0},
            1,
            [{tr,{y,0},{t_integer,{0,9}}},
             {tr,{x,0},{t_integer,{0,9}}}],
            {x,0}}.

Affected versions Latest master, and OTP 26.2.1

bjorng commented 6 months ago

This is not a bug, but a limitation of the current value range analysis. We will not attempt to improve the value range analysis in Erlang/OTP 27.

Determining ranges is done in two passes. First in the global type analysis pass (meaning looking at all functions in a module), and then in a later local pass that propagates ranges through sequential code in a single function.

When the global pass sees the subtraction of 48 from range 48..57, the conservative safe result is -inf..9. The reason is that the calculation could be called in a loop. If called in a loop, the range 0..9 would not be safe, because the next time through the loop, the value range analysis could be asked to calculate (for instance) -48..9 - 48, and the next time -96..9 - 48 and so on. After an infinite number of iterations, we would end up with the range -inf..9.

The local value range analysis pass is less conservative and will correctly deduce that the range of the difference is 0..9, and if there had been any calculations directly following it in the same function, that range would have been propagated to them. However, the range cannot be propagated to the caller.

bjorng commented 6 months ago

No promises, but a possible goal for the compiler in Erlang/OTP 28 could be to improve the ranges in the following example:

-module(ranges).
-export([fact/1, num_odd/1]).

fact(N) when is_integer(N), 0 =< N, N < 10_000 ->
    fact(N, 1).

%% With a more sophisticated value range analysis, the range of N
%% could be determined to be 0..9999 instead of '-inf'..9999.
fact(0, P) -> P;
fact(N, P) -> fact(N - 1, P * N).

num_odd(L) ->
    num_odd(L, 0).

%% Assuming that a list can contain no more than 288230376151711743
%% elements, the range of N would be 0..288230376151711743 instead of
%% the current 0..'+inf'.
num_odd([H|T], N) ->
    case H rem 2 of
        0 -> num_odd(T, N);
        1 -> num_odd(T, N + 1)
    end;
num_odd([], N) ->
    N.

erlang / otp

Compiler: Missed optimisation due to incorrect inferred type for pure integer subtraction #8157