grailbio / reflow

A language and runtime for distributed, incremental data processing in the cloud
Apache License 2.0

ResourceNotFoundException #97

Closed lincoln-harris closed 5 years ago

lincoln-harris commented 5 years ago

Hey guys, I keep encountering this ResourceNotFoundException error. Reflow seems to be having trouble finding S3 files that definitely do exist. Here's what happens when I do reflow run on a single test file:

[image: weird error] https://user-images.githubusercontent.com/33501625/50105230-fd91be80-01e0-11e9-8972-81d518da9c44.png

(Note that I'm seeing the same errors when I try reflow runbatch.) Could it be that Reflow is unable to access certain cached values for some reason?

prasadgopal commented 5 years ago

Can you run with reflow -log=debug run ... and show me the output?

lincoln-harris commented 5 years ago

Yep, the output is right here: debug_out.txt

prasadgopal commented 5 years ago

Have you set up your cache? It seems like the cache specified in your config doesn't exist. Have you run the following? reflow setup-dynamodb-assoc <tablename>

If yes, can I see the contents of /tmp/config after running the following command: reflow config -marshal > /tmp/config

lincoln-harris commented 5 years ago

Yes, the cache has been set up, with AWS_SDK_LOAD_CONFIG=1 reflow setup-dynamodb-assoc czbiohub-reflow-quickstart, though I have reason to believe it got messed up at some point. I had to switch the cache from a personal S3 bucket to the cz-biohub centralized cache, and I've been seeing these resource errors ever since.

reflow config -marshal > /tmp/config doesn't return anything at all.

olgabot commented 5 years ago

@lincoln-harris the output of reflow config -marshal was written to the file /tmp/config. Can you show its contents with cat /tmp/config?

prasadgopal commented 5 years ago

Sorry. I did not realize that you had all these credentials in your config. Can you please delete your message and invalidate your credentials ASAP?

prasadgopal commented 5 years ago

To fix the issue you need the following line in your config: assoc: dynamodb,czbiohub-reflow-quickstart
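A quick way to confirm that the line is really present in the marshaled config is to scan /tmp/config for it. This is only a rough sketch: `find_assoc_table` is a hypothetical helper, not part of Reflow, and it assumes the entry appears on a single line in the `assoc: dynamodb,<table>` form quoted above.

```python
from typing import Optional


def find_assoc_table(config_text: str) -> Optional[str]:
    """Return the table named on an `assoc: dynamodb,<table>` line, or None.

    Hypothetical helper: assumes the one-line form quoted in this thread.
    """
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line.startswith("assoc:"):
            continue
        # Everything after "assoc:" should look like "dynamodb,<table>".
        value = line[len("assoc:"):].strip()
        backend, _, table = value.partition(",")
        if backend.strip() == "dynamodb" and table.strip():
            return table.strip()
    return None


# Usage, after running `reflow config -marshal > /tmp/config`:
#   with open("/tmp/config") as f:
#       print(find_assoc_table(f.read()))
```

If this returns None, the assoc entry is missing or malformed in the config that Reflow actually loads.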

prasadgopal commented 5 years ago

The setup-dynamodb-assoc command should automatically populate your config with this information. Not sure why it isn't happening in your case. Does setup-dynamodb-assoc throw any error?

lincoln-harris commented 5 years ago

OK, update: I'm still seeing the same error, even when assoc: dynamodb,czbiohub-reflow-quickstart is in my config. I get this error when I try to set up DynamoDB:

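[image: screen shot 2018-12-18 at 4 39 36 pm] https://user-images.githubusercontent.com/33501625/50191626-9446a400-02e3-11e9-9b26-954036a99c33.png

prasadgopal commented 5 years ago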

Does the DynamoDB table czbiohub-reflow-quickstart exist? Can you try running "aws dynamodb describe-table --table-name czbiohub-reflow-quickstart" and see whether the table exists?

It looks like you don't have permission to create a new table (czbiohub-reflow-quickstart-cache).
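The same check can be scripted so that a missing table is told apart from a permissions problem. The sketch below assumes boto3 is installed and that credentials and region are already configured; `table_exists` is a hypothetical helper name. It relies on the DynamoDB client's `describe_table` call and on the error code that botocore attaches to the exception's `response`.

```python
def table_exists(client, table_name: str) -> bool:
    """Return True if DescribeTable succeeds, False if the table is missing.

    Hypothetical helper: `client` is expected to be a boto3 DynamoDB client,
    e.g. boto3.client("dynamodb").
    """
    try:
        client.describe_table(TableName=table_name)
        return True
    except Exception as exc:
        # botocore's ClientError exposes the service error code in exc.response.
        response = getattr(exc, "response", None) or {}
        code = response.get("Error", {}).get("Code")
        if code == "ResourceNotFoundException":
            return False
        # Anything else (for example AccessDeniedException) is re-raised,
        # since it points at permissions rather than a missing table.
        raise


# Usage (assumes boto3 and configured AWS credentials):
#   import boto3
#   print(table_exists(boto3.client("dynamodb"), "czbiohub-reflow-quickstart"))
```

A False result means the table really is absent, while a raised access error suggests the IAM role cannot read or create it.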

olgabot commented 5 years ago

I'm able to do this on my local machine but not on the EC2 instance, because the IAM roles aren't configured properly for that instance. Can you post a complete list of all AWS IAM permissions necessary for successfully using reflow in https://github.com/grailbio/reflow/issues/99?