risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.

feat(connector): add session token for s3 connector #19458

Closed wcy-fdu closed 1 day ago

wcy-fdu commented 3 days ago

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Previously, the S3 connector required users to configure a fixed access key and secret key (AK/SK). This PR adds a session token field to support temporary AK/SK credentials created by users. The field is added to both the S3 source and the S3 sink.
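As a rough illustration of the change described above, the sketch below models S3 credentials with an optional session token. The struct, the method names, and the option keys (`s3.credentials.access`, `s3.credentials.secret`, `s3.credentials.session_token`) are assumptions made for this example and are not taken from the PR's diff.

```rust
use std::collections::HashMap;

/// Hypothetical credential options for an S3 source or sink.
/// Names are illustrative, not RisingWave's actual connector config.
#[derive(Debug, Clone, PartialEq)]
pub struct S3Credentials {
    pub access_key: String,
    pub secret_key: String,
    /// Set only when the AK/SK pair is temporary.
    pub session_token: Option<String>,
}

impl S3Credentials {
    /// Build credentials from WITH-clause style options.
    /// The option keys used here are assumed for the example.
    pub fn from_options(opts: &HashMap<String, String>) -> Result<Self, String> {
        // Look up a required option or report which one is missing.
        let get = |key: &str| {
            opts.get(key)
                .cloned()
                .ok_or_else(|| format!("missing required option `{key}`"))
        };
        Ok(Self {
            access_key: get("s3.credentials.access")?,
            secret_key: get("s3.credentials.secret")?,
            // The session token stays optional so fixed keys keep working.
            session_token: opts.get("s3.credentials.session_token").cloned(),
        })
    }

    /// True when a session token is present, i.e. the keys may expire.
    pub fn is_temporary(&self) -> bool {
        self.session_token.is_some()
    }
}
```

Keeping the token as an `Option` means existing configurations with only a fixed AK/SK pair would parse unchanged.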

wcy-fdu commented 3 days ago

Will test locally and then request review.

wcy-fdu commented 1 day ago

On second thought, we should allow setting session_token only for batch queries. It is risky for a streaming file source to use a secret key that may have expired, since that can cause unpredictable errors.

However, we cannot know when creating a source whether it will be used for batch or streaming:

In other words, if sources support session tokens, we cannot disable the option for streaming use at creation time.

Thus, I suggest allowing only fixed keys that do not expire when creating a source.

We can add session_token to the file_scan() table-valued function (TVF) if needed.
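The rule proposed in this comment could be sketched roughly as follows: reject a session token when a source is created, and accept it for a one-off batch read such as file_scan(). The `CredentialUse` enum and the `check_session_token` function are hypothetical names for this sketch, not code from the RisingWave repository.

```rust
/// Hypothetical marker for where a set of S3 credentials is being used.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CredentialUse {
    /// CREATE SOURCE: may later back a long-running streaming job.
    CreateSource,
    /// file_scan(): a one-off batch read.
    FileScan,
}

/// Enforce the suggested rule: temporary credentials are rejected for
/// sources, because a streaming job could outlive the token.
pub fn check_session_token(
    usage: CredentialUse,
    session_token: Option<&str>,
) -> Result<(), String> {
    match (usage, session_token) {
        (CredentialUse::CreateSource, Some(_)) => Err(
            "session_token is not allowed when creating a source; \
             use fixed keys that do not expire"
                .to_string(),
        ),
        // Fixed keys are fine everywhere; tokens are fine for batch scans.
        _ => Ok(()),
    }
}
```

Checking at creation time reflects the constraint noted above: the system cannot tell in advance whether a source will serve batch or streaming queries, so the restriction has to apply to every source.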