Defining a fixed lower bound for unconstrained arrays

ARG-Editor commented 1 year ago

This issue continues an unfinished topic from Ada 2022. This issue was created to fulfill the ARG resolution of November 10, 2022.

Ada supports arrays with arbitrary first indices (e.g., 1980 in "type Ada_Versions is array (1980 .. 2100) of Some_Type"). Subprogram parameters are index-aware, i.e., the subprogram inherits the caller's knowledge about the index. If the intended functionality of the subprogram depends on the first index, this is a benefit of Ada, where many other programming languages need complex workarounds.

But frequently, a subrogram's intended functionality is independent from the first index, that is, it is index-agnostic. In such cases, it is difficult to specify and verify that the subprogram's behaviour never depends on the array index, and proper black box testing in Ada would require a lot of additional test cases, compared to languages with a default first index (Java, C++, ...). This puts Ada's (otherwise well deserved) reputation as a language with exceptional support for the development of safe and secure systems into question.

Example:

The package Ada.Strings.Fixed has several functions of the form

   function Index (Source : String; Pattern : String; [other parameters]) return Natural;

to search for the substring Pattern in the string Source.

I.e., if P is a substring of S then S(Index(S, P) .. P'Length-1) = S. This is index-aware for Source, but index-agnostic for Pattern.

For concreteness, consider

P1: String(1..3) := "abc"; P2: String(Positive'Last-2 .. Positive'Last) := "abc".

Whatever the value of Source is, the functionality requires

Index(Source, P1, ...) = Index(Source, P2, ...),

and proper black box tests should include test cases for this behavior.

AI12-0246-1 was created for this issue. However, no one ever proposed a workable solution and the topic was left out of Ada 2022.

ARG-Editor commented 1 year ago

Here's my thoughts on this idea (I've put it into a separate message to keep it separate from the original topic description).

I'm sympathetic to the concern that it is hard to test array operations thoroughly. It is not unusual to pass a slice (which generally retains its bounds) to a subprogram and have the subprogram break somehow. But this is a problem of implementation.

The e-mail discussion in AI12-0246-1 make it clear that the vast majority of subprograms are "index-agnostic"; users of the subprograms do not need to care about the bounds of an array passed to the subprogram. (The bounds of the parameter(s) do not affect whether the subprogram does what it is supposed to do.) It is only a very few subprograms for which those bounds are significant.

Limiting the bounds of arrays does have the potential to make the implementation of a subprogram, and the testing of that implementation easier. That is, it is primarily relevant within a subprogram body.

Both of these factors means that changing the specification of a subprogram to give this information is generally a bad idea. We don't want information that is irrelevant to users and/or static analysis tools in those specifications. Only the few subprograms for which the bounds matter should have the specifications changed (probably with a precondition or predicate).

Moreover, Ada has always (even in Ada 83) supported ways to re-bound an array inside the body. If we have a subprogram with an unconstrained array parameter:

 procedure Do_It (Param : in out String; ...) is
    -- Decls.
 begin
    -- Operations.
 end Do_It;

If we want the implementation of this routine to only have to deal with Param having a lower bound of one, we can transform it (at the cost of a nested subprogram) as follows:

 procedure Do_It (Arg : in out String; ...) is
    -- Decls.
    subtype Half_Bound is String (1 .. Arg'Length);
    procedure Really_Do_It (Arg : in out Half_Bound; ...) is
    begin
       -- Operations.
    end Really_Do_It;
 begin
    Really_Do_It (Half_Bound(Arg), ...);
 end Do_It;

For an "in" parameter, you have many more options. One that always works in any version of Ada, but might be too expensive for large arrays, is to declare a local copy of the parameter:

 procedure Do_It_Too (Arg : in String; ...) is
    -- Decls.
    Local_Arg : String(1..Arg'Length) := Arg;
 begin
    -- Operations, using Local_Arg rather than Arg.
 end Do_It_Too;

This is what I usually do when I want to avoid dealing with unusual bounds.

In Ada 95, a type conversion alone will do the job. For Ada 95, you have to wrap every use of Arg with a type conversion, but Ada 2022 allows such conversions to be renamed, so you can write:

 procedure Do_It_Too (Arg : in String; ...) is
    -- Decls.
    subtype Half_Bound is String (1 .. Arg'Length);
    My_Arg : Half_Bound renames Half_Bound(Arg);
 begin
    -- Operations, using My_Arg rather than Arg.
 end Do_It_Too;

With all of these techniques, we've avoiding the need to deal with unusual bounds in the implementation of a subprogram. But this has no effect on the user, they can pass any bounds that are convenient.

So adding a feature to Ada can make implementation and testing of that implementation easier, but that already can be done relatively simply. The implementer of the body has to make a decision to use some technique to eliminate unusual bounds in any case, so they have to take some concrete action.

Thus, this is a nice-to-have feature at best. And it can easily turn into more complication for the language and its implementations than it is worth.

              Randy.

sttaft commented 7 months ago

GNAT supports this notion, using syntax of the form:

type A is array(Positive range 1 .. <>) of Integer;

It is described here.

Ada-Rapporteur-Group / User-Community-Input

Defining a fixed lower bound for unconstrained arrays #54