Consolidate s3 key config options

MeltanoLabs / target-athena

Singer.io Target for AWS Athena.

Other

5 stars 16 forks source link

Consolidate s3 key config options #5

Closed andrewcstewart closed 3 years ago

andrewcstewart commented 3 years ago

There are currently three different config fields related to s3 data location keys:

s3_key_prefix - Used in defining target_key for s3 upload, and used in data_locatioin for athena table definition.
naming_convention - Used only for defining target_key for s3 upload.
s3_staging_dir - Not being used in the sink function, but apparently still set as 'required' somewhere. (will fail CLI if not set)

We can probably consolidate these into at least one config option.

andrewcstewart commented 3 years ago

@aaronsteers - I'm personally of the opinion that we could just do away with naming_convention.. I think overly complicates key construction and doesn't particularly add any value to the user.

aaronsteers commented 3 years ago

@andrewcstewart

I'm personally of the opinion that we could just do away with naming_convention.

Agreed.

s3_staging_dir - Not being used in the sink function, but apparently still set as 'required' somewhere. (will fail CLI if not set)

I was confused about that also. This seemed to be implying we were needing multiple directories. Let's remove if not needed.

naming_convention - Used only for defining target_key for s3 upload.

I think we can remove for now, and re-add when/if we introduce path-based partitioning in #15.

aaronsteers commented 3 years ago

Bringing bucket back into this conversation, what if we standardize on just these two:

s3_bucket - The bucket.
s3_key_prefix - The 'prefix' indicator makes it clear to me that there will be multiple auto-generated files under this path, and the exact file names are not controlled via config. (At least not until we get to #15, but even then we may use automatic name generation based on higher-order config.)

What do you think?