kwojcicki / amazon-dax-client-v3

AWS JS SDK v3 compatible dax client
Apache License 2.0

paginateScan does not support Dax Client #4

Open paul-uz opened 3 months ago

paul-uz commented 3 months ago

So the v3 SDK method, paginateScan, does not support DAX clients.

So to do a full scan on a table with 25,000 items, I need to do my own paginated Scan, which I am implementing (roughly as sketched below), but it's super slow despite using DAX. With 1000 segments, it's still timing out after 20 seconds.

Is there a DAX equivalent of paginateScan?
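
For reference, a rough sketch of what the hand-rolled version looks like (simplified; the table name is a placeholder and documentDaxClient is the DAX-backed document client set up elsewhere):

import { ScanCommand } from '@aws-sdk/lib-dynamodb';

const scanSegment = async (segment, totalSegments) => {
    const items = [];
    let lastKey;
    do {
        // Each call returns one page plus (maybe) a LastEvaluatedKey cursor.
        const page = await documentDaxClient.send(new ScanCommand({
            TableName: 'test',
            Segment: segment,
            TotalSegments: totalSegments,
            ExclusiveStartKey: lastKey
        }));
        items.push(...page.Items);
        lastKey = page.LastEvaluatedKey;
    } while (lastKey); // keep paging until no cursor comes back
    return items;
};

// Run every segment in parallel and flatten the results.
const fullScan = async (totalSegments) => {
    const segments = Array.from({ length: totalSegments }, (_, s) =>
        scanSegment(s, totalSegments));
    return (await Promise.all(segments)).flat();
};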

kwojcicki commented 3 months ago

> So the v3 SDK method, paginateScan, does not support DAX clients.

Not at my computer atm so I can't tell why this is the case. Is it a typing error, or does calling https://github.com/aws/aws-sdk-js-v3/blob/main/lib/lib-dynamodb/src/pagination/ScanPaginator.ts#L30 with a DAX client actually fail?

I do believe DAX's backend cluster supports Scan, so I would think lib-dynamodb's paginateScan could work.
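
For context, with a plain DynamoDB document client the paginator is normally driven like this (a minimal sketch; the table name is a placeholder), and swapping a DAX client into the client slot is the part in question:

import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, paginateScan } from '@aws-sdk/lib-dynamodb';

const client = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// paginateScan takes a pagination config plus a regular ScanCommandInput
// and returns an async iterable of result pages.
for await (const page of paginateScan({ client, pageSize: 100 }, { TableName: 'test' })) {
    console.log(page.Items.length);
}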

paul-uz commented 3 months ago

> > So the v3 SDK method, paginateScan, does not support DAX clients.
>
> Not at my computer atm so I can't tell why this is the case. Is it a typing error, or does calling https://github.com/aws/aws-sdk-js-v3/blob/main/lib/lib-dynamodb/src/pagination/ScanPaginator.ts#L30 with a DAX client actually fail?
>
> I do believe DAX's backend cluster supports Scan, so I would think lib-dynamodb's paginateScan could work.

Check lines 42-45. The DAX client simply isn't supported.

My own paginated scan is super slow, even for a measly 1,500 records. I've tried with and without using Segments; no improvement when using them.

kwojcicki commented 3 months ago

I added an example of how to fake/shim the document client so that it works with DAX:

https://github.com/kwojcicki/amazon-dax-client-v3/blob/main/test/dynamoDBDocumentClient.js#L58

It's probably something that could be moved into the core library, but that will mess with everyone's instanceof/prototype checks, which I would rather not do unless I absolutely have to.
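
Roughly, the idea (a simplified sketch, not the exact code in the linked test) is to hand the paginator an object that passes its DynamoDBDocumentClient instanceof check while delegating send to the DAX client:

import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';

// The shim's prototype chain satisfies `instanceof DynamoDBDocumentClient`,
// but every send() actually goes through the DAX document client.
const shimDocumentClient = (daxDocClient) => {
    const shim = Object.create(DynamoDBDocumentClient.prototype);
    shim.send = (command) => daxDocClient.send(command);
    return shim;
};

The shimmed object can then be passed as the client in paginateScan's pagination config.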

paul-uz commented 3 months ago

Thank you for the example!

I would argue it should be in core, as paginateScan etc. are crucial to using DynamoDB in any meaningful way.

kwojcicki commented 3 months ago

Ya, the implementation looks pretty simple to copy over and stable for the long term; I can add that.

paul-uz commented 3 months ago

I tried using your example in my "parallel paginated scan" code, where I run multiple paginateScans using segments.

I hit this error:

DaxClientError: ConnectionException: write EPIPE
    at Socket.<anonymous> (/var/task/node_modules/amazon-dax-client-sdkv3/src/BaseOperations.js:110:23)
    at Socket.emit (node:events:519:28)
    at emitErrorNT (node:internal/streams/destroy:169:8)
    at emitErrorCloseNT (node:internal/streams/destroy:128:3)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  time: 1717363720000,
  code: 'ConnectionException',
  retryable: true,
  requestId: null,
  statusCode: -1,
  _tubeInvalid: false,
  waitForRecoveryBeforeRetrying: false,
  '$metadata': { attempts: 1, totalRetryDelay: 0 }
}

The function timeout is 1 min; the code errored after 24s.

paul-uz commented 3 months ago

Ok, so after tweaking some numbers, i.e. the number of parallel paginateScans to run, it's working. But again, it's not faster than DynamoDB itself, and I'm using 3x dax.r5.2xlarge nodes.

What am I missing here? I thought DAX was supposed to be faster?

E.g. using the DAX client, scanning 24,500 records in 40 segments with pageSize 100 took 24s, whereas the normal DDBDocClient (with the same parallel scan settings) took 16s.
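
For reference, the tweaking was mostly capping how many segment scans are in flight at once; a rough sketch of that (scanSegment stands in for the per-segment scan, and the cap is arbitrary):

// Run totalSegments segment scans, but at most maxInFlight at any one time.
const runWithConcurrency = async (totalSegments, maxInFlight, scanSegment) => {
    const results = [];
    let next = 0;
    const worker = async () => {
        while (next < totalSegments) {
            const segment = next++; // safe: JS is single-threaded between awaits
            results[segment] = await scanSegment(segment, totalSegments);
        }
    };
    await Promise.all(Array.from({ length: maxInFlight }, () => worker()));
    return results.flat();
};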

kwojcicki commented 3 months ago

DAX is only faster (to my knowledge) if you are actually getting cache hits; otherwise it's just an extra server your packets have to hop through with zero benefit.

DAX does track scan hits: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/dax-metrics-dimensions-dax.html. If the scan hit rate is high and it's still slower, then it's possible there are performance improvements to be made on the server/client code side; if the scan hit rate is low, then the slower performance is expected.
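
A quick way to pull those numbers (a sketch using @aws-sdk/client-cloudwatch; the cluster name is a placeholder, and ScanCacheMisses works the same way):

import { CloudWatchClient, GetMetricStatisticsCommand } from '@aws-sdk/client-cloudwatch';

const cw = new CloudWatchClient({});

// Sum of DAX scan cache hits over the last hour for one cluster.
const hits = await cw.send(new GetMetricStatisticsCommand({
    Namespace: 'AWS/DAX',
    MetricName: 'ScanCacheHits',
    Dimensions: [{ Name: 'ClusterId', Value: 'my-dax-cluster' }],
    StartTime: new Date(Date.now() - 60 * 60 * 1000),
    EndTime: new Date(),
    Period: 3600,
    Statistics: ['Sum']
}));
console.log(hits.Datapoints);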

The performance will also depend on your data size: given you are going through 1-2 more hops (with DAX), if you are transferring 1 GB of records then it's possible that even with a high hit rate DAX won't be super useful.

paul-uz commented 3 months ago

The table I'm scanning is only 90 MB.

Here is another error from today:

DaxClientError: ConnectionException: Connection is closed by server
    at Socket.endListener (/var/task/node_modules/amazon-dax-client-sdkv3/src/BaseOperations.js:59:23)
    at Socket.emit (node:events:531:35)
    at endReadableNT (node:internal/streams/readable:1696:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  time: 1717400590879,
  code: 'ConnectionException',
  retryable: true,
  requestId: null,
  statusCode: -1,
  _tubeInvalid: false,
  waitForRecoveryBeforeRetrying: false,
  '$metadata': { attempts: 1, totalRetryDelay: 0 }
}

This is with about 90 parallel paginateScans (one per segment) running.

I am seeing scan hits, but also a lot of misses. Re-runs don't seem to improve response time.

kwojcicki commented 3 months ago

Added pagination to the core client as part of https://github.com/kwojcicki/amazon-dax-client-v3/commit/8769f4ad4725b7f8bc64d454753c13153aae12e7

Also added the ability to modify the underlying HTTP client params: https://github.com/kwojcicki/amazon-dax-client-v3/commit/b917dbbf4a89fb3db6effa12abc5e41e02c24d10

I'll start stress testing my DAX cluster more to reproduce the errors, but I assume there is not much that can be done on the client side besides increasing timeouts/waiting between queries.
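
In the meantime, since those ConnectionExceptions come back with retryable: true, wrapping each send in a simple exponential backoff is one client-side option (a sketch; the attempt count and delays are arbitrary starting points):

// Retry a retryable DAX call with exponential backoff.
const sendWithRetry = async (client, command, maxAttempts = 5) => {
    for (let attempt = 1; ; attempt++) {
        try {
            return await client.send(command);
        } catch (err) {
            if (!err.retryable || attempt >= maxAttempts) throw err;
            // wait 100ms, 200ms, 400ms, ... before the next attempt
            await new Promise((resolve) => setTimeout(resolve, 100 * 2 ** (attempt - 1)));
        }
    }
};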

kwojcicki commented 3 months ago

> The table I'm scanning is only 90 MB.

@paul-uz Approximately how many items is that spread over?

paul-uz commented 3 months ago

> > The table I'm scanning is only 90 MB.
>
> @paul-uz Approximately how many items is that spread over?

Approx 25,000 records

paul-uz commented 3 months ago

@kwojcicki I think some of the new typings are a bit off.

The paginateScan config object wants a startingToken, but shouldn't that be optional?

And I get this error on my client initialisation:

Type 'DynamoDBDocumentClient' is missing the following properties from type 'DynamoDBClient': getDefaultHttpAuthSchemeParametersProvider, getIdentityProviderConfigProvider

kwojcicki commented 2 months ago

I'll update those typings and modify my test scripts later to use TypeScript. Kind of blindly making the types for now and hoping they work.

Regarding the performance issues you are facing, I can't seem to really replicate them.

I've got a 100 MB DDB table (no GSIs). The data was set up as follows:

import { PutCommand } from '@aws-sdk/lib-dynamodb';

// Generate a random alphanumeric string of the given length.
const randomString = (length) => {
    let result = '';
    const characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789';
    const charactersLength = characters.length;
    let counter = 0;
    while (counter < length) {
        result += characters.charAt(Math.floor(Math.random() * charactersLength));
        counter += 1;
    }
    return result;
}

// 25,000 items x ~4 KB each ≈ 100 MB of data.
for (let i = 0; i < 25_000; i++) {
    const putItem = new PutCommand({
        TableName: 'test',
        Item: {
            CommonName: `${i}`,
            Data1: randomString(800),
            Data2: randomString(800),
            Data3: randomString(800),
            Data4: randomString(800),
            Data5: randomString(800)
        }
    });

    await documentDaxClient.send(putItem);
}

Then I've got a 1769 MB Lambda and a 3-node (dax.r4.large) DAX cluster, doing a 90-segment scan against that:

let i = 0;
const totalSegments = 90;

const parallelScan = async (segmentId) => {
    // paginateScan here is the pagination helper added to the DAX document
    // client; it yields pages of up to pageSize items for this segment.
    const paginator = documentDaxClient.paginateScan({
        pageSize: 1000
    }, {
        TableName: 'test',
        Segment: segmentId,
        TotalSegments: totalSegments
    });

    for await (const val of paginator) {
        i += val.Items.length;
        if (i % 500 === 0) {
            console.log(val.Items[0]);
        }
    }
};

console.time("scan");
const promises = [];
for (let j = 0; j < totalSegments; j++) {
    promises.push(parallelScan(j));
}

await Promise.all(promises);
console.log(`scanned ${i} in`);
console.timeEnd("scan");

This takes between 3-4 seconds depending on the # of cache hits. That's slower than the regular DDB version of this, but it's also not really expected to be faster, given https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/dax-prescriptive-guidance.html:

> In addition, for bulk reads, you should run full table scans directly against a DynamoDB table.

and some inefficiencies the DAX client has with sdkv3 that I'm trying to resolve.

But I don't see any timeouts or issues like that. Could you provide more info about your setup so I can reproduce your behavior?

kwojcicki commented 2 months ago

> and some inefficiencies the DAX client has with sdkv3 that I'm trying to resolve.

I removed the unnecessary round trip of JSON parsing that I initially introduced when upgrading this to sdkv3. May help slightly with the DAX scan times.