PrefectHQ / prefect

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
https://prefect.io
Apache License 2.0

S3Result with target raises error #2585

Closed marvin-robot closed 4 years ago

marvin-robot commented 4 years ago

Archived from the Prefect Public Slack Community

livni.itay: Hi - I am working with S3Result and receiving a

botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist

On further research, this error can mean almost anything, including a permission error. (I tried different buckets and settings.) The credentials are stored as AWS_CREDENTIALS in Prefect Cloud, with config.toml set to use cloud secrets:

[cloud]
use_local_secrets = false

Switching back to the result_handler argument with an S3Result subclass did work, but combining result_handler with target does not. Is there something different in the way that credentials are handled between result and result_handler?

The new prefect is really nice :slightly_smiling_face:

chris: Hi itay - could you share the code you used to initialize the result_handler and the result?

livni.itay: tsx_imb_res = S3Result(bucket="tsx-moc-bcp")

livni.itay:

@task(
    max_retries=3,
    retry_delay=timedelta(seconds=1),  # In production this will be changed to 20 minutes
    result_handler=tsx_imb_res, 
    target="{task_name}-{today}",
    state_handlers=[imb_handler, error_handler]
)

livni.itay: Works with target commented out

chris: Ah! The result_handler kwarg is now deprecated, so you should instead try:

...
result=tsx_imb_res,
...

livni.itay: Right that does not work

livni.itay: That is the problem

chris: ahhh interesting! OK so this might actually be a bug with our exists logic on the S3Result type. Would you mind sharing this example code + the traceback you’re seeing? Sorry about that!

livni.itay: Actually it looks like I am not using target right?

[2020-05-16 21:31:37] INFO - prefect.FlowRunner | Beginning Flow run for 'Our first flow'
[2020-05-16 21:31:37] INFO - prefect.FlowRunner | Starting flow run.
[2020-05-16 21:31:37] INFO - prefect.TaskRunner | Task 'tsx_url': Starting task run...
[2020-05-16 21:31:37] INFO - prefect.TaskRunner | Task 'tsx_url': finished task run for task with final state: 'Success'
[2020-05-16 21:31:37] INFO - prefect.TaskRunner | Task 'get_tsx_moc_imb': Starting task run...
[2020-05-16 21:31:38] ERROR - prefect.TaskRunner | Unexpected error: NoSuchKey('An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.')
Traceback (most recent call last):
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 651, in check_target
    if result.exists(target, **prefect.context):
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/prefect/engine/results/s3_result.py", line 167, in exists
    Bucket=self.bucket, Key=location.format(**kwargs)
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/botocore/client.py", line 626, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.
[2020-05-16 21:31:38] INFO - prefect.TaskRunner | Task 'get_tsx_moc_imb': finished task run for task with final state: 'Skipped'
[2020-05-16 21:31:38] INFO - prefect.FlowRunner | Flow run SUCCESS: all reference tasks succeeded

livni.itay: target="{task_name}-{today}",

chris: your code looks alright to me actually, including your target specification; I think this exception catching logic here is flawed: https://github.com/PrefectHQ/prefect/blob/master/src/prefect/engine/results/s3_result.py#L169

chris: it’s possible this was tested on a different version of boto3 or something, we’ll need to investigate a little deeper

livni.itay: Cool. Let me know if you need anything more.

chris: I’ll use our bot to open the issue and we can track progress there

chris: <@ULVA73B9P> archive “S3Result with target raises error”

Original thread can be found here.

cicdw commented 4 years ago

Opening because this is still an active issue

cicdw commented 4 years ago

From the thread, relevant package versions are:

botocore: 1.15.32
boto3: 1.12.32

joshmeek commented 4 years ago

Oh interesting, looks like we'll want to check something like:

if ex.response['Error']['Code'] == 'NoSuchKey':
    return False

or

except client.exceptions.NoSuchKey:
    return False

boto3 has minimal documentation on this
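For reference, here is a minimal sketch of the first option, assuming an S3-style client whose `get_object` raises botocore's `ClientError` with the error code under `response['Error']['Code']`. The helper name `key_exists` is hypothetical and is not Prefect's actual `exists` implementation; the fallback class is only there so the sketch runs without botocore installed.

```python
try:
    from botocore.exceptions import ClientError
except ImportError:
    # Stand-in with the same constructor shape, used only if botocore is absent.
    class ClientError(Exception):
        def __init__(self, error_response, operation_name):
            super().__init__(f"{operation_name}: {error_response}")
            self.response = error_response
            self.operation_name = operation_name


def key_exists(client, bucket, key):
    """Return True if the object exists, False if S3 reports NoSuchKey.

    Hypothetical helper: any other client error (for example a permission
    error) is re-raised so it is not mistaken for a missing key.
    """
    try:
        client.get_object(Bucket=bucket, Key=key)
    except ClientError as exc:
        code = exc.response.get("Error", {}).get("Code")
        if code == "NoSuchKey":
            return False
        raise
    return True
```

Checking the code on the generic `ClientError` avoids depending on how a particular client exposes its modeled exception classes, at the cost of a string comparison.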

gryBox commented 4 years ago

@joshmeek Completely disregard this. I upgraded and tested on two different environments. It works like a charm :)

Hi @joshmeek - I upgraded to prefect 0.11.2 and tried running:

from prefect import Flow, task
from prefect.engine.results import S3Result

s3_result = S3Result(bucket='some-bucket')

@task(target="{today}/{task_name}.prefect")
def some_func():
   ....

with Flow("Some Flow", result=s3_result) as sf:
   .....

And I am receiving the same NoSuchKey error. Without a target, the results are successfully stored in the S3 bucket.

[2020-05-21 02:07:14] ERROR - prefect.TaskRunner | Unexpected error: NoSuchKey('An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.')
Traceback (most recent call last):
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 651, in check_target
    if result.exists(target, **prefect.context):
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/prefect/engine/results/s3_result.py", line 167, in exists
    Bucket=self.bucket, Key=location.format(**kwargs)
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/ilivni/miniconda3/envs/py37moc/lib/python3.7/site-packages/botocore/client.py", line 626, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject operation: The specified key does not exist.

To be clear, I assumed that Prefect generates a key based on the target template. Is that not the case?
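Judging from the traceback, the key passed to S3 is `location.format(**kwargs)`, where the kwargs come from `prefect.context`. A rough sketch of that rendering step in plain Python (no Prefect import) follows. The helper name `render_target`, the context values, and the date format of `today` are assumptions for illustration only.

```python
def render_target(template, context):
    """Fill a target template from a context mapping using str.format.

    Hypothetical helper mirroring the `location.format(**kwargs)` call
    visible in the traceback; extra context keys are simply ignored.
    """
    return template.format(**context)


# Hypothetical context values; the real ones would come from prefect.context.
context = {"task_name": "some_func", "today": "2020-05-21", "flow_name": "Some Flow"}

key = render_target("{today}/{task_name}.prefect", context)
print(key)  # 2020-05-21/some_func.prefect
```

If that reading is right, the key is generated from the template as expected, and on a first run it simply does not exist yet, so the error would come from the existence check letting NoSuchKey propagate rather than from the template itself.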