aws / smithy-go

Smithy code generators for Go (in development)
Apache License 2.0

enhancement: support range iterators in paginators #530

Open jamestelfer opened 1 month ago

Proposal

With the coming release of Go 1.23, it would be nice for the AWS SDKs to support iterators for the generated *Paginator structs.

This would allow code like the following:

func (svc Service) GetParametersByPath(ctx context.Context, path string) (map[string]string, error) {
    input := &ssm.GetParametersByPathInput{
        Path:           aws.String(path),
        Recursive:      aws.Bool(true),
        WithDecryption: aws.Bool(true),
    }
    result := map[string]string{}

    paginator := ssm.NewGetParametersByPathPaginator(svc.Client, input)

    // iterate over pages
    for paginator.HasMorePages() {
        page, err := paginator.NextPage(ctx)
        if err != nil {
            return nil, err
        }
        // iterate over page values
        for _, p := range page.Parameters {
            result[*p.Name] = *p.Value
        }
    }
    return result, nil
}

... to potentially look like this:

func (svc Service) GetParametersByPath(ctx context.Context, path string) (map[string]string, error) {
    input := &ssm.GetParametersByPathInput{
        Path:           aws.String(path),
        Recursive:      aws.Bool(true),
        WithDecryption: aws.Bool(true),
    }
    result := map[string]string{}

    paginator := ssm.NewGetParametersByPathPaginator(svc.Client, input)

    // operate on the paginator as a stream of values
    for p, err := range paginator.Iterator(ctx) {
        if err != nil {
            return nil, err
        }
        result[*p.Name] = *p.Value
    }

    return result, nil
}

A few areas are worth examining when evaluating this idea:

Value

Paginators as a concept offer increased simplicity and reliability over the SDK v1 idioms, and iterators extend this further. The intent when using a paginator is to iterate over the set of results found in AWS in an efficient manner. Using an iterator on top of a paginator allows this intent to be more clearly expressed, and reduces the chance of accidental error.

The iterator allows the set of pages to be consumed as a continuous stream of results, as shown above.

Compatibility

There are two overall concerns here:

  1. The need for Go 1.23+ to be present in order to support range functions
  2. Compatibility with existing code

Regarding (1), an iterator is just a function that accepts a yield callback, so the generated code is valid in earlier Go versions and compiles without issue. Those versions simply lack compiler support for using the function in a range expression.
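
To illustrate the point about earlier versions, here is a minimal sketch that consumes a range-style function by calling it directly with a callback, which needs no language support. The names `numbers` and `sumUntil` are hypothetical stand-ins, not part of the SDK.

```go
package main

import "fmt"

// numbers is a hypothetical stand-in for a generated Iterator method. It has
// the same shape: a function that takes a yield callback.
func numbers() func(yield func(int, error) bool) {
	return func(yield func(int, error) bool) {
		for i := 1; i <= 5; i++ {
			if !yield(i, nil) {
				return // the consumer asked to stop
			}
		}
	}
}

// sumUntil consumes the iterator without a range expression, so it does not
// rely on Go 1.23. Returning false from the callback plays the role of break.
func sumUntil(limit int) (int, error) {
	total := 0
	var firstErr error
	numbers()(func(n int, err error) bool {
		if err != nil {
			firstErr = err
			return false
		}
		if n > limit {
			return false
		}
		total += n
		return true
	})
	return total, firstErr
}

func main() {
	total, err := sumUntil(3)
	fmt.Println(total, err) // prints "6 <nil>"
}
```

The callback form is clumsier than a range loop, but it shows that the generated method itself needs nothing from the newer compiler.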

If the increase in code size is a concern, the iterator code can be generated into a separate file that has a build tag:

//go:build go1.23
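
As a rough sketch of what such a gated file might contain: because it is only compiled on Go 1.23+, it could name `iter.Seq2` in its signature. The `pager` type and the `All` and `count` names are hypothetical and exist only to keep the example self-contained. This is not the generated code.

```go
//go:build go1.23

package main

import (
	"fmt"
	"iter"
)

// pager is a hypothetical stand-in for a generated paginator type.
type pager struct {
	pages [][]string
}

// All lives in a file gated on Go 1.23+, so its signature can name iter.Seq2
// directly instead of spelling out the underlying func type.
func (p *pager) All() iter.Seq2[string, error] {
	return func(yield func(string, error) bool) {
		for _, page := range p.pages {
			for _, v := range page {
				if !yield(v, nil) {
					return
				}
			}
		}
	}
}

// count ranges over the iterator and reports how many values it produced.
func count(p *pager) (int, error) {
	n := 0
	for _, err := range p.All() {
		if err != nil {
			return n, err
		}
		n++
	}
	return n, nil
}

func main() {
	n, err := count(&pager{pages: [][]string{{"a", "b"}, {"c"}}})
	fmt.Println(n, err) // prints "3 <nil>"
}
```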

Existing code is unaffected: the feature is purely opt-in.

Feature limitations

In the example above, the iterator does not expose the ResultMetadata and NextToken fields of the output struct. In typical usage, however, these fields are rarely accessed. Likewise, the context.Context passed to the paginator's NextPage() method cannot be changed between pages. Again, it's rare that the context would need to change between calls to NextPage().

If these limitations ever present an issue, the API user can fall back to the more verbose iteration idiom.

Implementation

What follows is a shape that works with the current Go 1.23 RC. I'm fairly happy with how this experiment turned out. The additional code generation required is minimal, and the range function makes the intent far clearer. IMHO this is the kind of situation that range funcs were built for.

Some notes on this implementation:

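package example // placeholder package name for this example

import (
    "context"

    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/service/ssm"
    "github.com/aws/aws-sdk-go-v2/service/ssm/types"
)

// Service provides a minimal interface to SSM Parameter Store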
type Service struct {
    Client ssm.GetParametersByPathAPIClient
}

func (svc Service) GetParametersByPath(ctx context.Context, path string) (map[string]string, error) {
    input := &ssm.GetParametersByPathInput{
        Path:           aws.String(path),
        Recursive:      aws.Bool(true),
        WithDecryption: aws.Bool(true),
    }

    result := map[string]string{}
    paginator := Adapt(ssm.NewGetParametersByPathPaginator(svc.Client, input))

    for p, err := range paginator.Iterator(ctx) {
        if err != nil {
            return nil, err
        }
        result[*p.Name] = *p.Value
    }

    return result, nil
}

/*
 * Not required for the real thing, only for the purpose of getting this example
 * to compile with the existing SDK.
 */

// Adapt a paginator so it can use the additional Iterator method. This allows
// the method to be closer to what would be required to be generated.
func Adapt(paginator *ssm.GetParametersByPathPaginator) *GetParametersByPathPaginator2 {
    return &GetParametersByPathPaginator2{*paginator}
}

// GetParametersByPathPaginator2 is a concrete implementation of the Paginator,
// adapting an existing generated type to add the necessary additional methods.
type GetParametersByPathPaginator2 struct {
    ssm.GetParametersByPathPaginator
}

/*
 * The following need to be added to the code generation in order to enable this functionality
 */

// Iterator returns a function that yields the values of the paginated response.
// Generated with a func return type instead of iter.Seq2 for backwards compatibility
func (p *GetParametersByPathPaginator2) Iterator(
    ctx context.Context,
    options ...func(*ssm.Options),
) func(yield func(types.Parameter, error) bool) {

    return func(yield func(types.Parameter, error) bool) {
        for p.HasMorePages() {
            page, err := p.NextPage(ctx, options...)
            if err != nil {
                if !yield(types.Parameter{}, err) {
                    return
                }
                // iterator consumer has elected to ignore the error and
                // continue, so NextPage is attempted again
                continue
            }

            for _, v := range page.Parameters {
                if !yield(v, nil) {
                    // return rather than break: break would only leave this
                    // inner loop, and yield must not be called again once it
                    // has returned false
                    return
                }
            }
        }
    }
}

Contributing the change

It has been quite some time since I last worked in Java, but I could possibly have a go. It's likely a fairly low-effort change for someone with more insight into the generation logic.