hicommonwealth / commonwealth

A platform for decentralized communities
https://commonwealth.im
GNU General Public License v3.0
67 stars 42 forks source link

config might be an anti-pattern. #8900

Open burtonator opened 3 weeks ago

burtonator commented 3 weeks ago

TLDR

Our config object is too large. I think we should collapse it all into an object like this:

const FANCY_BUTTONS_CONFIG_DISABLED_SCHEMA = z.object({
  NODE_ENV: z.literal('development').optional(),
  FANCY_BUTTONS_ENABLED: z.coerce.boolean.default(true),
});

const FANCY_BUTTONS_CONFIG_ENABLED_SCHEMA = z.object({
  NODE_ENV: z.literal('production'),
  FANCY_BUTTONS_ENABLED: z.coerce.boolean.default(false),
});

// then the config schema is a union of the two.
const FANCY_BUTTONS_CONFIG_SCHEMA = z.union([
  FANCY_BUTTONS_CONFIG_DISABLED_SCHEMA,
  FANCY_BUTTONS_CONFIG_ENABLED_SCHEMA,
]);

Then we can initialize it directly from process.env or a dictionary/map (so it's testable).

This allows us to:

Describe the bug

I think we could clean up our config object significantly and also make it more testable.

I think we might have accidentally implemented an anti-pattern with config.

I think it's similar to the service locator pattern:

https://en.wikipedia.org/wiki/Service_locator_pattern

The registry makes code harder to test, since all tests need to interact with the same global service locator class to set the fake dependencies of a class under test. However, this is easily overcome by injecting application classes with a single service locator interface. Simulator can be implemented to simulate each interface provided by the service locator, so it's easy to swap the real implementation with a simulator.

But specifically the point that it makes it harder to test.

The main point though is that I think we cut about 2/3rds of the code plus make it less fragile.

Disadvantages of the current system

image

Proposal

Advantages

This approach I think is much better. Very demure. Very mindful.

Code

Here's how we would do it:


// We first define a Zod schema for when we're disabled.
const SITEMAP_CONFIG_DISABLED_SCHEMA = z.object({
  SITEMAP_ENABLED: z.literal('false').optional(),
});

// then we define one for when we're enabled.
const SITEMAP_CONFIG_ENABLED_SCHEMA = z.object({
  SITEMAP_ENABLED: z.literal('true'),
  SITEMAP_THREAD_PRIORITY: z.coerce.number().default(0.8),
});

// then the config schema is a union of the two.
const SITEMAP_CONFIG_SCHEMA = z.union([
  SITEMAP_CONFIG_ENABLED_SCHEMA,
  SITEMAP_CONFIG_DISABLED_SCHEMA,
]);

then to load the schema all we have to do is have

export conf SITEMAP_CONFIG = SITEMAP_CONFIG_SCHEMA.parse(process.env)

Advantages to my proposal

Code is easier to test

Each config object would just test it's config but when we smash everything into one config we end up needing to test 2^N configs which isn't doable.

Also, the tests are going to be more readable this way.

Ability to have defaults for non-production

If we write code like this.. we can have different defaults for production.

It could also depend on NODE_ENV too.

Tests will be more readable

The testing code will be super small.

  describe('SITEMAP_SCHEMA', () => {
    test('everything disabled', () => {
      SITEMAP_SCHEMA.parse({});
    });

    test('sitemap disabled', () => {
      SITEMAP_SCHEMA.parse({
        SITEMAP_ENABLED: 'false',
      });
    });

    test('sitemap with custom value', () => {
      SITEMAP_SCHEMA.parse({
        SITEMAP_ENABLED: 'false',
        SITEMAP_THREAD_PRIORITY: 0.2,
      });
    });
  });

Issues

Logging

@timolegros mentioned that he wanted logging and the ability to see which process.env variables were loaded at runtime.

We can implement this via a function called configurize() which would read process.env, then look at the resulting parsed object to see what was loaded, then console.log that...

timolegros commented 3 weeks ago

the shared lib can't use any config because the model config depends on shared and we'd get an import cycle if we tried to fix it.

The shared lib will never be able to use any config setup because config depends on dotenv and dotenv cannot be imported on the client. So even with your broken down config this will not change.

all config has to go in the same place which makes testing difficult to impossible

Technically could define your Zod objects separately from the config object but link them in the config object. That way you can test and we maintain the top-level object.

no refine functions

Are you sure? How are you going to define relationships between different env var like the following without refine:

LOG_LEVEL: z
  .enum(LogLevels)
  .refine((data) => !(APP_ENV === 'production' && data !== 'info')),

We can implement this via a function called configurize() which would read process.env, then look at the resulting parsed object to see what was loaded, then console.log that...

Isn't that the same thing as using the existing configure function but multiple times for each 'group' of env var instead of all in one place?

How would you replace the server/scripts/releasePhaseEnvCheck.ts script with a broken-down config?

burtonator commented 3 weeks ago

The shared lib will never be able to use any config setup because config depends on dotenv and dotenv cannot be imported on the client. So even with your broken down config this will not change.

We might not want shared to use dotenv though... That might not be worth it. However, there might be other libs that we need to get working with this.

Maybe this is a smaller issue though.

Technically could define your Zod objects separately from the config object but link them in the config object. That way you can test and we maintain the top-level object.

Possibly but our current library would have to be rewritten.

This is similar to my approach though.

no refine functions

Are you sure? How are you going to define relationships between different env var like the following without refine:

const FANCY_BUTTONS_CONFIG_DISABLED_SCHEMA = z.object({
  NODE_ENV: z.literal('development').optional(),
  FANCY_BUTTONS_ENABLED: z.coerce.boolean.default(true),
});

const FANCY_BUTTONS_CONFIG_ENABLED_SCHEMA = z.object({
  NODE_ENV: z.literal('production'),
  FANCY_BUTTONS_ENABLED: z.coerce.boolean.default(false),
});

// then the config schema is a union of the two.
const FANCY_BUTTONS_CONFIG_SCHEMA = z.union([
  FANCY_BUTTONS_CONFIG_DISABLED_SCHEMA,
  FANCY_BUTTONS_CONFIG_ENABLED_SCHEMA,
]);

This basically makes FANCY_BUTTONS_ENABLED=true when in development and false when in production (unless you manually set it to true)

Isn't that the same thing as using the existing configure function but multiple times for each 'group' of env var instead of all in one place?

How would you replace the server/scripts/releasePhaseEnvCheck.ts script with a broken-down config?

Yeah. It would be similar. You would just hard code all of the configs and then dump them together.

timolegros commented 3 weeks ago

We might not want shared to use dotenv though

That's exactly my point i.e. trying to get env var into /shared is neither desirable nor even possible...

const FANCY_BUTTONS_CONFIG_DISABLED_SCHEMA = z.object({
  NODE_ENV: z.literal('development').optional(),
  FANCY_BUTTONS_ENABLED: z.coerce.boolean.default(true),
});

const FANCY_BUTTONS_CONFIG_ENABLED_SCHEMA = z.object({
  NODE_ENV: z.literal('production'),
  FANCY_BUTTONS_ENABLED: z.coerce.boolean.default(false),
});

// then the config schema is a union of the two.
const FANCY_BUTTONS_CONFIG_SCHEMA = z.union([
  FANCY_BUTTONS_CONFIG_DISABLED_SCHEMA,
  FANCY_BUTTONS_CONFIG_ENABLED_SCHEMA,
]);

This is not the same as using refine to check dependent variables defined elsewhere. Sure it works if you are careful but then every config that depends on other variables would redefine those variables e.g. A.NODE_ENV, B.NODE_ENV, C.NODE_ENV -> all potentially derived from a different source (e.g. different defaults).