Saasli / saasli-backend

Documentation
https://saasli.github.io/docs/

[mysql-poll] Multiple invocations of events #51

Open godd9170 opened 7 years ago

godd9170 commented 7 years ago

If the events lambda function takes a long time (more than ~30s, it would seem), the request to it appears to be re-sent again and again until the lambda times out.

Here are the logs of an invocation that takes a reasonable amount of time: [screen shot 2016-11-09 at 4 38 08 pm]

Now the logs of the multi-invocation error: [screen shot 2016-11-09 at 4 38 16 pm]

Notice the multiple "Starting new HTTPS connection (2): lambda.us-east-1.amazonaws.com" lines.

This guy seems to think that threading might be to blame.

godd9170 commented 7 years ago

So this one is twofold.

1) Salesforce Bulk is known to experience periods of painful slowness. This article here suggests that it's related to 'on' and 'off' peak hours. I'm also inclined to believe that Sandboxes are slower.

2) botocore, the library boto3 is built on, defaults its connect and read timeouts to 60s. While it doesn't really solve the problem of slow-ass Salesforce, we can at least raise those timeouts like this:

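import boto3
from botocore.config import Config

# Raise botocore's 60s default connect/read timeouts (values in seconds).
config = Config(connect_timeout=250, read_timeout=270)
# Snippet from inside the class that owns the Lambda client.
self.client = boto3.client('lambda', config=config)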

Maxing this out will have to do for now, as we've no control over Salesforce taking an inconsistent, arbitrary amount of time that can easily exceed the 60s default.

godd9170 commented 7 years ago

Complained and followed up over at https://github.com/boto/boto3/issues/883

Wouldn't you know it, Salesforce Bulk seems to be running at the 1s intervals I came to rely on when I originally built this. So the increased timeout hasn't really been tested against a genuinely slow job yet.

I introduced an artificial delay in the events lambda to see if we've successfully overthrown the multi request problem. It looks as though this does in fact extend the period between timeouts. See:

screen shot 2016-11-09 at 7 16 46 pm

I suppose if we just jack the timeout well beyond the allowable range of the lambda, we'll never have the issue of duplicate requests. We can also then pass the buck to Salesforce for sucking. Having some robust logs as proposed in #34 could potentially allow us to re-run failed jobs for data fidelity.
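The reasoning above can be checked with a bit of arithmetic. This is a toy model, not botocore's actual retry logic: it assumes the client re-sends the request each time the read timeout expires, that the calling lambda is killed at 300s, and that the client allows 4 retries. All names and defaults here are hypothetical, and retry backoff delays are ignored.

```python
def count_attempts(call_duration_s, read_timeout_s,
                   caller_timeout_s=300.0, max_retries=4):
    """Count how many times the request is sent under the toy model.

    call_duration_s:  how long the invoked lambda takes to respond
    read_timeout_s:   how long the client waits before giving up on one attempt
    caller_timeout_s: when the calling lambda itself is killed (assumed 300s)
    max_retries:      retries allowed after the first attempt (assumed 4)
    """
    elapsed = 0.0
    attempts = 0
    while attempts <= max_retries and elapsed < caller_timeout_s:
        attempts += 1
        if call_duration_s <= read_timeout_s:
            break  # the response arrived before the client gave up
        elapsed += read_timeout_s  # the client gave up and will try again
    return attempts


print(count_attempts(90, 60))    # 60s default timeout, 90s job
print(count_attempts(90, 270))   # raised timeout, same job
print(count_attempts(400, 270))  # job slower than the raised timeout
print(count_attempts(400, 500))  # timeout beyond the caller's own limit
```

Under these assumptions the four calls give 5, 1, 2 and 1 attempts. Once the read timeout exceeds the caller's own limit, the caller dies before the client ever gets the chance to re-send, so there is only ever one request.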

godd9170 commented 7 years ago

Salesforce-bulk is probably best used for CRUD operations exclusively; we should just perform queries the good old-fashioned RESTful way. Opening #52 to discuss that.

godd9170 commented 7 years ago

This is happening again. I don't think the MySQL Poll got the same patch. Reopening for now while I investigate.

screen shot 2016-12-02 at 1 59 04 pm

godd9170 commented 7 years ago

I just downloaded the zipped lambda package and it's definitely got the timeouts jacked up to 500s. I'll need to dive a bit deeper into this.