Set up Config Struct - Githubissues

cloudchacho / hedwig-firehose

Firehose can be used to archive all your messages for backup / replay / debugging purpose

Apache License 2.0

0 stars 4 forks source link

Set up Config Struct #4

Closed chongr closed 2 years ago

chongr commented 2 years ago

Subscribtions
- projects
- topics
- schema directory
Metadata Bucket (only name, no path)
- example: gs://standard-hedwig
Staging Bucket (only name, no path)
- example: gs://standard-hedwig
Output Bucket (name and path)
- example: gs://standard-hedwig/output
“Every configurable number of minutes, wake up, read all the worker’s files that exist”
- ScrapeInterval
“Write to a temporary path every configurable duration (say 30 seconds), or until a configurable content size is reached (say 5MB)”
- FlushAfter
- FlushAtSize

maroux commented 2 years ago

I want to be careful about this task - we shouldn't create a centralised "settings" struct that's passed into every constructor. Rather, only inject specific dependencies that the constructor uses.

I recently re-did hedwig-go like so, take a look at that.

Additionally, we should not need a config file just read from env vars since config file would mean berglas and the likes which we don't want to impose. So perhaps a config struct that's created in main.go and then pick fields from that struct for every constructor as objects are constructed.

chongr commented 2 years ago

ah okay yeah I was just looking at hedwig go - was planning to just set up something similar to this? https://github.com/cloudchacho/hedwig-go/blob/e5ab05b5a6ca8c5f2f82955464c919cb2ba2c4c1/gcp/gcp.go#L265

maroux commented 2 years ago

Yep so that links to the settings specific for gcp backend. And it's a struct rather than separate parameters since that's easier to update in the future. I think we shouldn't need a struct here since the app is going to be a deployable and the "external API" is just env vars. So we can get away with just parameters everywhere (assuming it doesn't balloon to like 5 params 😛).

maroux commented 2 years ago

If it's more than like 3 config parameters definitely a struct.

chongr commented 2 years ago

ah okay so maybe we don't need a struct and just always read from ENV for this info?

oh but I guess we still need similir to how its set up in hedwig GO for consuming from the queue since it should be cloud agnostic

maroux commented 2 years ago

Well ideally the only thing that reads from env is the entry point or something close to that - but still only in one place. So a config struct might be okay and should be the only thing that reads from env. But a function or set of functions in main.go might work also.

Also yes we'll want to make it cloud agnostic side we'd have to provide env vars for configuring gcp as well. Would that lead to too many vars and maybe a file is better just to keep things in one place?

shsamkit commented 2 years ago

I was thinking we could pass all the configurations as command-line arguments. We can inject those environment variables with defaults using [Shell Parameter Expansion](https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html. Or may be pass a file path via command-line args to load the configs from

chongr commented 2 years ago

settings set up in https://github.com/cloudchacho/hedwig-firehose/pull/9, but https://github.com/cloudchacho/hedwig-firehose/issues/4#issuecomment-1094698945 will still have to be addressed as to how they are added on start up