kwojcicki / amazon-dax-client-v3

AWS JS SDK v3 compatible dax client
Apache License 2.0
8 stars 2 forks source link

Throw Errors from AWS SDK or Expose Custom Errors in TypeScript API #9

Open jfw225 opened 3 months ago

jfw225 commented 3 months ago

Firstly, I want to preface this by saying that I'm shocked (but also at the same time not shocked) that AWS hasn't released an official DAX client with V3 support by now, but you guys are awesome for putting this library together.

I run a production code base and have been trying to start using DAX for some of our APIs. In our code base, I have built an abstraction layer to swap between the standard AWS SDK DB client and your DAX client. For the most part, your DAX client has worked well as a drop-in replacement.

The only part that prevents your client from being a complete drop-in replacement (at least for our code base) is that when your client encounters a service error, you throw a custom DaxServiceError instance. There are a lot of places in our code base where we use the instanceof operator to handle certain errors from @aws-sdk/client-dynamodb (e.g. ConditionalCheckFailedException) which obviously wouldn't work with your custom error.

After digging into your source code, I do see that you have already done the hard part in terms of parsing the response to figure out the correct error. Ideally, it would be great if you could reconstruct the SDK errors from @aws-sdk/client-dynamodb. Otherwise, I would be grateful if you guys could add a quick typescript declaration for your custom DaxServiceError and expose the fields that are necessary to reconstruct the AWS SDK errors.

daveharig commented 3 months ago

I had a similar issue, and agree with harmonizing the error handling. As a workaround, I ultimately built a system where it fails over to DDB client if DAX client fails. Then I log the DAX errors and deal with it in the morning. This covers every manor of potential DAX failure including db service, cluster availability, VPC network nightmares, capacity etc. This DAX client is a baby step for me to get my feet wet and offload some high use access patterns for cost and UX speed.

    let output: GetItemCommandOutput;
    if (useDax) {
      try {
        msDebugLog(
          helper,
          "DAX client selected, sending request to DAX cluster"
        );
        output = await new Promise((resolve, reject) => {
          // If daxClient is unable to get the response in 6 seconds, throw error and fallback to dynamoDbClient
          const timeout = setTimeout(() => {
            reject(new Error("DAX request timed out after 6 seconds"));
          }, 6000);
          // If this daxClient gets executed within 6 seconds, then we return the result and do not do a fallback
          daxClient
            .send(new GetItemCommand(getItemParams))
            .then((result) => {
              clearTimeout(timeout);
              resolve(result);
            })
            .catch((error) => {
              clearTimeout(timeout);
              reject(
                new Error(
                  "DAX client send error: " + JSON.stringify(error, null, 2)
                )
              );
            });
        });
      } catch (error) {
        msErrorLog(
          helper,
          "DAX client failed, falling over to DynamoDB client",
          error
        );
        output = await dynamoDbClient.send(new GetItemCommand(getItemParams));
      }
    } else {
      output = await dynamoDbClient.send(new GetItemCommand(getItemParams));
    }
    return output.Item as T | undefined;
  } 
jfw225 commented 3 months ago

Thanks for the quick reply. I'm really glad I was not the only one experiencing VPC network nightmares. I'm working on a simple parser to reconstruct the SDK errors on my end anyway. I'll share it here once I have it working in case it's of any use.

jfw225 commented 3 months ago

Okay here is my rough error parser. In case you want to harmonize the error parsing with the DB SDK, this may serve has a helpful starting point.

Some Notes

DaxServiceError -> Serializable AWS SDK Error

(@yaws/dynamodb/src/core/dynamodb-utils/db.error-utils.ts)

import { CancellationReason } from "@aws-sdk/client-dynamodb";
import {
    SerializableAwsDbError,
    doesDbErrorNeedData,
    getAwsDbErrorNames,
} from "@yaws/dynamodb";
import { assertUnreachable } from "yoomi-shared-js";

/**
 * The DAX client that we use
 * @see https://github.com/kwojcicki/amazon-dax-client-v3
 * throws custom errors. However, sometimes these errors are caught by AWS
 * middleware and converted to a new error object (which loses the fields from
 * the original error). This happens in:
 * @see node_modules/@aws-sdk/client-dynamodb/node_modules/@smithy/middleware-retry/dist-cjs/index.js
 *
 * Thus, this function attempts to parse the error object from the message
 * string. This function is certainly hacky, but we rigorously test it.
 */
export function parseDaxServiceError(
    e: Error,
): SerializableAwsDbError | undefined {
    // The error message is in the format: `DaxServiceError: <aws-sdk-error-name>: ${message}`

    const names = getAwsDbErrorNames();
    type Name = (typeof names)[number];

    const constructOutput = (name: Name): SerializableAwsDbError => {
        if (doesDbErrorNeedData(name)) {
            switch (name) {
                // handle special cases individually
                case "TransactionCanceledException":
                    return {
                        type: "AwsDbError",
                        name,
                        $metadata: {},
                        stack: e.stack,
                        message: e.message,
                        data: {
                            CancellationReasons:
                                _transactCanceledExtractReasonsFromErrorMsg(
                                    e.message,
                                ),
                        },
                    } satisfies SerializableAwsDbError;

                default:
                    return assertUnreachable(name);
            }
        }

        return {
            type: "AwsDbError",
            name,
            $metadata: {},
            stack: e.stack,
            message: e.message,
            data: undefined,
        } satisfies SerializableAwsDbError;
    };

    for (const name of names) {
        const re = new RegExp(`DaxServiceError: ${name}`);

        if (re.test(e.message)) {
            return constructOutput(name);
        }
    }

    // if there was no match, then we return `undefined` to indicate that we
    // could not parse the error
    return undefined;
}

/**
 * We only export this for unit testing purposes.
 */
export function _transactCanceledExtractReasonsFromErrorMsg(
    msg: string,
): CancellationReason[] {
    const regex = /for specific reasons \[([^\]]*)\]/g;

    // get the first match
    const match = regex.exec(msg)?.[1];

    if (!match) {
        console.warn(`No match found for ${msg}`);

        return [];
    }

    return match
        .split(",")
        .map((e) => e.trim())
        .filter((e) => e.length > 0)
        .map((e) => ({ Code: e }));
}

AWS SDK Error Encoder/Decoder (stringify/parse)

import type * as DDB_IMPL from "@aws-sdk/client-dynamodb";
import {
    BackupInUseException,
    BackupNotFoundException,
    ConditionalCheckFailedException,
    ContinuousBackupsUnavailableException,
    DuplicateItemException,
    DynamoDBServiceException,
    ExportConflictException,
    ExportNotFoundException,
    GlobalTableAlreadyExistsException,
    GlobalTableNotFoundException,
    IdempotentParameterMismatchException,
    ImportConflictException,
    ImportNotFoundException,
    IndexNotFoundException,
    InternalServerError,
    InvalidEndpointException,
    InvalidExportTimeException,
    InvalidRestoreTimeException,
    ItemCollectionSizeLimitExceededException,
    LimitExceededException,
    PointInTimeRecoveryUnavailableException,
    PolicyNotFoundException,
    ProvisionedThroughputExceededException,
    ReplicaAlreadyExistsException,
    ReplicaNotFoundException,
    RequestLimitExceeded,
    ResourceInUseException,
    ResourceNotFoundException,
    TableAlreadyExistsException,
    TableInUseException,
    TableNotFoundException,
    TransactionCanceledException,
    TransactionConflictException,
    TransactionInProgressException,
} from "@aws-sdk/client-dynamodb";
import {
    Constructor,
    ExcludeStrict,
    ExtractStrict,
    assertUnreachable,
    isObjectGeneral,
    lazyFactoryFn,
    safeObjectKeys,
} from "yoomi-shared-js";

type DDB = typeof DDB_IMPL;

type AllAwsDbErrors = {
    [K in keyof DDB]: DDB[K] extends Constructor<unknown>
        ? InstanceType<DDB[K]> extends DynamoDBServiceException
            ? // we exclude the DynamoDBServiceException itself
                DynamoDBServiceException extends InstanceType<DDB[K]>
                ? never
                : InstanceType<DDB[K]>
            : never
        : never;
}[keyof DDB];

type AwsDbErrorNames = AllAwsDbErrors["name"];

/**
 * Type mapping between some of the error names and the additional information
 * that they contain.
 */
interface AdditionalErrorInfo {
    TransactionCanceledException: {
        CancellationReasons: DDB_IMPL.CancellationReason[] | undefined;
    };
}

// the error names with additional information (strict exclude enforces that
// the keys of `AdditionalErrorInfo` are actually error names)
type WithData = ExtractStrict<AwsDbErrorNames, keyof AdditionalErrorInfo>;
type WithoutData = ExcludeStrict<AwsDbErrorNames, WithData>;

export interface SerializableAwsDbErrorBase<N extends AwsDbErrorNames, Data> {
    type: "AwsDbError";
    name: N;
    $metadata: DynamoDBServiceException["$metadata"];
    stack: string | undefined;
    message: string;

    /**
     * We intentionally set this to `undefined` as well as `Data` in case that
     * it is missing for some reason. We want this parser to be as robust as
     * possible.
     */
    data: Data | undefined;
}

// we intentionally distribute this union
type SerializableWithData<N extends WithData = WithData> = N extends N
    ? SerializableAwsDbErrorBase<N, AdditionalErrorInfo[N]>
    : never;

// we intentionally do not distribute this union
type SerializableWithoutData<N extends WithoutData = WithoutData> =
    SerializableAwsDbErrorBase<N, undefined>;

export type SerializableAwsDbError =
    | SerializableWithData
    | SerializableWithoutData;

// TODO: ensure that we allow for `DynamoDBServiceException` objects that
// are not in the named error list. however, make sure to prioritize the named
// list.
export function isAwsDbError(e: unknown): e is AllAwsDbErrors {
    return e instanceof DynamoDBServiceException && e.name in ERROR_MAP;
}

export function isSerializableAwsDbError(
    e: unknown,
): e is SerializableAwsDbError {
    return isObjectGeneral<Pick<SerializableAwsDbError, "type">>(e, {
        type: (v, r): v is typeof r => v === ("AwsDbError" satisfies typeof r),
    });
}

export function toSerializableAwsDbError(
    e: AllAwsDbErrors,
): SerializableAwsDbError {
    const base: SerializableAwsDbErrorBase<AwsDbErrorNames, undefined> = {
        type: "AwsDbError",
        name: e.name,
        $metadata: e.$metadata,
        stack: e.stack,
        message: e.message,
        data: undefined,
    };

    if (doesDbErrorNeedData(e.name)) {
        switch (e.name) {
            case "TransactionCanceledException":
                return {
                    ...base,
                    name: e.name,
                    data: {
                        CancellationReasons: e.CancellationReasons,
                    },
                } satisfies SerializableAwsDbError;

            /* istanbul ignore next */
            default:
                return assertUnreachable(e.name);
        }
    }

    return {
        ...base,
        name: e.name,
        data: undefined,
    } satisfies SerializableAwsDbError;
}

export function reconstructAwsDbError(
    s: SerializableAwsDbError,
): AllAwsDbErrors {
    const constructError = <N extends AwsDbErrorNames>(
        name: N,
    ): InstanceType<DDB[N]> => {
        const C = ERROR_MAP[name];

        const output = new C({
            $metadata: s.$metadata,
            message: s.message,
        });

        output.stack = s.stack;

        return output satisfies AllAwsDbErrors as InstanceType<DDB[N]>;
    };

    if (doesDbErrorNeedData(s.name)) {
        switch (s.name) {
            case "TransactionCanceledException":
                return (() => {
                    const e = constructError("TransactionCanceledException");

                    e.CancellationReasons = s.data?.CancellationReasons;

                    return e;
                })();

            /* istanbul ignore next */
            default:
                return assertUnreachable(s.name);
        }
    }

    return constructError(s.name);
}

export const getAwsDbErrorNames: () => AwsDbErrorNames[] = lazyFactoryFn(() => {
    const keys = safeObjectKeys(ERROR_MAP);

    return () => keys;
});

const getAwsDbErrorNamesWithData: () => WithData[] = lazyFactoryFn(() => {
    const keys = safeObjectKeys<AdditionalErrorInfo>({
        TransactionCanceledException: null,
    });

    return () => keys;
});

export function doesDbErrorNeedData(name: AwsDbErrorNames): name is WithData {
    return getAwsDbErrorNamesWithData().includes(name as WithData);
}

// mapping from error name to error constructor
const ERROR_MAP: {
    [E in AllAwsDbErrors as E["name"]]: DDB[E["name"]];
} = {
    ConditionalCheckFailedException,
    BackupInUseException,
    BackupNotFoundException,
    InternalServerError,
    RequestLimitExceeded,
    InvalidEndpointException,
    ProvisionedThroughputExceededException,
    ResourceNotFoundException,
    ItemCollectionSizeLimitExceededException,
    ContinuousBackupsUnavailableException,
    LimitExceededException,
    TableInUseException,
    TableNotFoundException,
    GlobalTableAlreadyExistsException,
    ResourceInUseException,
    TransactionConflictException,
    PolicyNotFoundException,
    ExportNotFoundException,
    GlobalTableNotFoundException,
    ImportNotFoundException,
    DuplicateItemException,
    IdempotentParameterMismatchException,
    TransactionInProgressException,
    ExportConflictException,
    InvalidExportTimeException,
    PointInTimeRecoveryUnavailableException,
    ImportConflictException,
    TableAlreadyExistsException,
    InvalidRestoreTimeException,
    ReplicaAlreadyExistsException,
    ReplicaNotFoundException,
    IndexNotFoundException,
    TransactionCanceledException,
};

Some Utilities from Custom Library (yoomi-shared-js)

This is certainly far less important, but I thought I would include it in case you wanted a complete understanding (and context) of the implementations above.

/**
 * The type representing the constructor of some class `T`.
 */
// eslint-disable-next-line @typescript-eslint/no-explicit-any
export type Constructor<T> = abstract new (...args: any[]) => T;

/**
 * Same as
 * @see Exclude
 * but requires types to be present in `T`.
 */
export type ExcludeStrict<T, U extends T> = Exclude<T, U>;

/**
 * Same as
 * @see Extract
 * but requires types to be present in `T`.
 */
export type ExtractStrict<T, U extends T> = Extract<T, U>;

/**
 * This function throws an error whenever it is called. It is useful for
 * situations where you want a switch statement to be exhaustive. Put this in
 * the default case of the switch statement and call it with the value of the
 * switch statement.
 */
export function assertUnreachable(
    x: never,
    opts?: { getError?: () => Error; msg?: string },
): never {
    if (opts?.getError) {
        throw opts.getError();
    }

    throw new Error(
        opts?.msg ??
            `Assert unreachable. Didn't expect to get here. x: ${JSON.stringify(
                x,
            )}`,
    );
}

/**
 * Lazily executes the given factory function and returns its result.
 *
 * Note: if you assign a field to the returned function, it will not be usable
 * until after the function is called. We could improve this by implementing the
 * returned function wrapper as a proxy that calls the factory function when any
 * fields are accessed.
 */
export function lazyFactoryFn<
    Fn extends (...args: any[]) => any,
    FacArgs extends any[] = any[],
>(factoryFn: (...args: FacArgs) => Fn, ...factoryArgs: FacArgs): Fn {
    let fn: Fn | undefined;

    return function (...args) {
        if (fn === undefined) {
            fn = factoryFn(...factoryArgs);
        }

        return fn(...args);
    } as Fn;
}

export function safeObjectKeys<
    const K extends string,
    const T extends { [KK in K]: unknown },
>(obj: T): (keyof T)[];

export function safeObjectKeys<T extends { [K in keyof T]: unknown }>(
    obj: keyof T extends string ? { [K in keyof T]: unknown } : never,
): (keyof T)[];

/**
 * Same as
 * @see Object.keys
 * but with a type-safe return type.
 */
export function safeObjectKeys<
    K extends string,
    T extends { [KK in K]: unknown },
>(obj: T): (keyof T)[] {
    return Object.keys(obj) as K[];
}

/**
 * A general predicate function that returns `true` if the given object contains
 * all of the keys and runtime types of the given `expected` object.
 */
export function isObjectGeneral<O extends { [K in keyof O]: O[K] }>(
    obj: unknown,
    expected: ObjTypeToObjValidator<O>,
    opts?: {
        getError?: ValidatorGetErrorFn<O>;
    },
): obj is O {
    throw new Error(`Implementation not included`).
}

Note that I didn't include the implementation or types for isObjectGeneral because they are pretty complex.

Test Suite for DaxServiceError -> Serializable AWS SDK Error

import { _transactCanceledExtractReasonsFromErrorMsg } from "../dax.error-parser";

const extractReasons = (str: string) => {
    return _transactCanceledExtractReasonsFromErrorMsg(str).map((r) => r.Code);
};

describe("Extract reasons from error strings", () => {
    test("should extract multiple reasons", () => {
        const input =
            " for specific reasons [ConditionalCheckFailed, ConditionalCheckFailed, ConditionalCheckFailed]";
        expect(extractReasons(input)).toEqual([
            "ConditionalCheckFailed",
            "ConditionalCheckFailed",
            "ConditionalCheckFailed",
        ]);
    });

    test("should extract a single reason", () => {
        const input = " for specific reasons [ConditionalCheckFailed]";
        expect(extractReasons(input)).toEqual(["ConditionalCheckFailed"]);
    });

    test("should handle empty brackets", () => {
        const input = " for specific reasons []";
        expect(extractReasons(input)).toEqual([]);
    });

    test("should handle trailing commas", () => {
        const input =
            " for specific reasons [ConditionalCheckFailed, ConditionalCheckFailed, ]";
        expect(extractReasons(input)).toEqual([
            "ConditionalCheckFailed",
            "ConditionalCheckFailed",
        ]);
    });

    test("should handle strings without reasons correctly", () => {
        const input = "Transaction failed without specific reasons";
        expect(extractReasons(input)).toEqual([]);
    });

    test("should extract reasons mixed with DynamoDB error messages", () => {
        const input = `Transaction failed for specific reasons [TransactionConflict, NetworkFailure], another error for specific reasons [ProvisionedThroughputExceeded]`;
        expect(extractReasons(input)).toEqual([
            "TransactionConflict",
            "NetworkFailure",
        ]);
    });

    test(`with multiple occurrences of the pattern`, () => {
        const input = `
(DaxServiceError: TransactionCanceledException: Transaction cancelled, please refer cancellation reasons for specific reasons [ConditionalCheckFailed, ConditionalCheckFailed, ConditionalCheckFailed]
          at <stack-trace-omitted>) {
        time: 1724800994829,
        code: 'TransactionCanceledException',
        retryable: true,
        requestId: 'L1K0IJ0B2QSIECUBRHIM4H9IVNVV4KQNSO5AEMVJF66Q9ASUAAJG',
        statusCode: 400,
        _tubeInvalid: false,
        waitForRecoveryBeforeRetrying: false,
        _message: 'Transaction cancelled, please refer cancellation reasons for specific reasons [ConditionalCheckFailed, ConditionalCheckFailed, ConditionalCheckFailed]',
        codeSeq: [ 4, 37, 38, 39, 58 ],
        cancellationReasons: [
          {
            Item: null,
            Code: 'ConditionalCheckFailed',
            Message: 'The conditional request failed'
          },
          {
            Item: null,
            Code: 'ConditionalCheckFailed',
            Message: 'The conditional request failed'
          },
          {
            Item: null,
            Code: 'ConditionalCheckFailed',
            Message: 'The conditional request failed'
          }
        ]
      })
        `;

        expect(extractReasons(input)).toEqual([
            "ConditionalCheckFailed",
            "ConditionalCheckFailed",
            "ConditionalCheckFailed",
        ]);
    });
});
jfw225 commented 3 months ago

@daveharig I do also have a few questions as I am wrapping up my dax-proxy implementation. For all of the below, this is in a single-cluster setup with 1 active node.

Transact Writes

I can't seem to get TransactWrite commands to work correctly or reliably. When doing a TransactWrite with a put request, it seems that the write is "eventually consistent" (i.e. if I wait long enough, subsequent reads return expected values). However, when I perform a TransactWrite with a delete request, it does not seem to actually delete the items (maybe it does eventually, but it is definitely not immediate like the standard DB interface is).

My question: Does your client currently support TransactWrites using a TransactWriteCommand? If not, no big deal. I'll just prevent my DAX-enabled tables from using TransactWrites for now.

Queries

This is my DAX client setup:

export function createDaxClient(p: {
    isTestEnv: boolean;
}): DbGeneralInterface<DaxSupportedTable> {
    const dbBase = {
        awsClient: new AWS_DynamoDbClient({
            region: DYNAMO_DB_REGION,
            endpoint: getDaxClusterEndpoint(p),
        }),
    };

    const daxClient = new AmazonDaxClient({
        client: dbBase.awsClient,
        config: {
            connectTimeout: 5000,
        },
    });

    const db = new DynamoDbStandardInterface({
        type: "instance",
        client: daxClient,

        // applying the xray middleware seems to be causing an error (i think
        // that the dax client is not correctly instrumenting the command
        // middleware when it makes the request to the dax cluster)
        noGlobalMiddleware: true,
        isTableSupported: isDaxSupportedTable,
    });

    return db;
}

where

Under the hood, DynamoDbStandardInterface simply calls the send method of AWS_DynamoDbClient with the appropriate command from @aws-sdk/client-dynamodb. More every command other than TransactWrite and QueryCommand works as expected. However, when I try to use QueryCommand, I seem to always get an empty response (for an input that works fine with the standard client).

I noticed that you have paginateScan and paginateQuery methods in your client.

My question: Am I using the QueryCommand incorrectly with your client? Do I need to use the paginateQuery method for all of my queries? If so, I see the startingToken input is of type string | undefined when typically the start key is an object the the AWS attribute-value format. How would I reconcile this? Any examples would be greatly appreciated.