aws / aws-sdk-ruby

The official AWS SDK for Ruby.
https://aws.amazon.com/sdk-for-ruby/
Apache License 2.0
3.57k stars 1.23k forks source link

Unnecessary DescribeWorkflowExecution calls when accessing closed_at #393

Closed awendt closed 11 years ago

awendt commented 11 years ago

I'm trying to find specific workflow executions by tag:

> workflow_executions = swf.domains[domain_name].workflow_executions.closed_after(Time.now-1.hour).tagged(tag_name)
#=> <AWS::SimpleWorkflow::WorkflowExecutionCollection>
> workflow_executions.each {|ex| puts ex.started_at}
AWS SimpleWorkflow 200 0.39347 0 retries] list_closed_workflow_executions(…)
2013-10-22 08:04:41 +0200
2013-10-22 08:04:46 +0200
=> nil

So far, so good. Now I'd like to access not only started_at, but also closed_at:

> workflow_executions = swf.domains[domain_name].workflow_executions.closed_after(Time.now-1.hour).tagged(tag_name)
#=> <AWS::SimpleWorkflow::WorkflowExecutionCollection>
> workflow_executions.each {|ex| puts ex.started_at; puts ex.closed_at}
[AWS SimpleWorkflow 200 0.168692 0 retries] list_closed_workflow_executions(…)
[AWS SimpleWorkflow 200 0.105952 0 retries] describe_workflow_execution(…)
[AWS SimpleWorkflow 200 0.190124 0 retries] describe_workflow_execution(…)
2013-10-22 08:04:41 +0200
2013-10-22 08:04:46 +0200
=> nil

That means the SDK makes another call to the DescribeWorkflowExecution API for every workflow execution just because I'm accessing closed_at of that workflow execution.

This is unnecessary because ListClosedWorkflowExecutions already returns closeTimestamp in addition to startTimestamp.

AFAICS, the same information for n workflow executions can be obtained in 1 call instead of n+1 calls.

Am I missing something here?

trevorrowe commented 11 years ago

The SDK is only holding onto static attributes. Calling #closed_after on a workflow execution will trigger a new request every time to see if the value has updated service side. This is sub-optimal for tight loops like you have above.

You can work around this by enabling memoization:

AWS.memoize do
  # only one request required
  workflow_executions.each {|ex| puts ex.started_at; puts ex.closed_at}
end

All attributes are held inside a memoize block. The block indicates it is okay to cache non-static attributes within a given scope. I'm a not fan of this and we will remove the need to use memoize blocks in v2 of the SDK.

awendt commented 11 years ago

@trevorrowe Thanks for the quick response. And thanks for clarifying that you're not a fan of this, Github sent me an e-mail with a comment that didn't make sense, I guess that was before an edit :)

Anyway, thanks for pointing me to AWS.memoize. I wasn't aware that closeTimestamp could be updated.

trevorrowe commented 11 years ago

It can't be updated or specified, but it can go from nil, which means not closed to an actual timestamp. Static attributes are those like a resource identifier or a creation timestamp that never change.

I'm going to close this issue for now. I hope this helps.