architect-team / architect-cli

Command line interface for running Architect services locally
https://docs.architect.io
37 stars 13 forks

Draft: Validate interpolation in register #904

Open mtgoncalves1 opened 1 year ago

mtgoncalves1 commented 1 year ago

Overview

Closes https://gitlab.com/architect-io/architect-cli/-/issues/620.
Validate interpolation of component during register.

Changes

Add validation checks for interpolations

Tests

The following should fail

name: hello-world
services:
  api:
    build:
      context: .
    interfaces:
      main: 3000
    environment:
      WORLD_TEXT: ${{ nonsense.world_text }}

Test dependencies

name: hello-world
dependencies:
  authentication: latest
services:
  api:
    build:
      context: ./
    environment:
      AUTH_INTERNAL: ${{ dependencies.nonsense.services.auth.interfaces.main.url }}

Test database

name: hello-world
services:
  api:
    build:
      context: ./
    environment:
      DB_ADDR: ${{ databases.nonsense.url }}
databases:
  top-db:
    type: postgres:12
    connection_string: ${{ secrets.db_connection_string }}
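The failing refs in the three manifests above share a pattern: the first path segment of the interpolation must be a known top-level context. As a rough sketch of that check (a hypothetical helper, not this PR's actual implementation), refs with an unknown root can be flagged with a single pass over the manifest text:

```typescript
// Hypothetical sketch (not the CLI's actual validator): pull every
// ${{ ... }} ref out of a manifest string and flag refs whose first
// path segment isn't a known top-level context.
const KNOWN_ROOTS = new Set(['secrets', 'services', 'dependencies', 'databases']);

function findBadRefs(manifest: string): string[] {
  const refPattern = /\$\{\{\s*([^}\s]+)\s*\}\}/g;
  const bad: string[] = [];
  for (const match of manifest.matchAll(refPattern)) {
    const root = match[1].split('.')[0];
    if (!KNOWN_ROOTS.has(root)) {
      bad.push(match[0]);
    }
  }
  return bad;
}

// ${{ nonsense.world_text }} is flagged; ${{ secrets.foo }} is not.
console.log(findBadRefs('A: ${{ nonsense.world_text }}\nB: ${{ secrets.foo }}'));
```

This only validates the root segment, which is exactly why the deeper-path cases discussed below (e.g. `dependencies.api2.banana.salad`) need more than a root check.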
mtgoncalves1 commented 1 year ago

I've been playing around with this and it seems to help, but it's not quite fully validating interpolation as part of the register.

Using this as my component:

name: helow
description: A simple hello-world component that returns "Hello World!" on every request.
homepage: https://github.com/architect-team/architect-cli/tree/master/examples/hello-world

secrets:
  foo:
    required: false

dependencies:
  api2: latest

services:
  api:
    build:
      context: .
    interfaces:
      main:
        port: 8080
        ingress:
          subdomain: hello
    environment:
      FOO: ${{ secrets.foo }}
      BAR: ${{ dependencies.api2.banana.salad }}
      BAZ: ${{ dependencies.api3.interfaces.main.post }}

and running register, I get:

  24 |     environment:
  25 |       FOO: ${{ secrets.foo }}
  26 |       BAR: ${{ dependencies.api2.banana.salad }}
› 27 |       BAZ: ${{ dependencies.api3.interfaces.main.post }}
     |                ﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋ Invalid interpolation ref: ${{ dependencies.api3.interfaces.main.post }}
  28 |

which isn't quite correct - dependencies.api2 is correct, but banana.salad is nonsense.

If instead we changed this:

const graph = await dependency_manager.getGraph([instanceToInstance(component_spec)], undefined, { interpolate: false, validate: false });

to

const graph = await dependency_manager.getGraph([instanceToInstance(component_spec)], undefined, { interpolate: true, validate: true });

(on rc, without these changes) we do get a more accurate error:

  23 |           subdomain: hello
  24 |     environment:
  25 |       FOO: ${{ secrets.foo }}
› 26 |       BAR: ${{ dependencies.api2.banana.salad }}
     |                ﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋ Invalid interpolation ref: ${{ dependencies.api2.banana.salad }}
› 27 |       BAZ: ${{ dependencies.api3.interfaces.main.post }}
     |                ﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋ Invalid interpolation ref: ${{ dependencies.api3.interfaces.main.post }} - Did you mean ${{ services.api.interfaces.main.port }}?
  28 |

and api2.banana.salad is recognized as invalid.

The problem with just doing this, though, is secrets - if I change the foo secret to be required, suddenly we get:

   5 | secrets:
   6 |   memory:
   7 |     default: .25GB
›  8 |   foo:
     |   ﹋﹋ Required secret 'foo' was not provided
   9 |     required: true
  10 |
  11 | dependencies:

which isn't great, because register doesn't accept secrets (yet; it will for some scenarios with #903).

What I think we actually want is to do the validation that happens within getGraph, without the part that cares about whether secrets are actually defined. I don't know the history of register, but it's kind of weird that we have a validateInterpolation function yet purposely skip the real interpolation validation in getGraph... curious if @ryan-cahill maybe knows more about that and whether there's a better way to implement the more in-depth check without necessarily making changes to getGraph 🤔
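One way to picture that idea, as a hedged sketch (the error shape and helper name are hypothetical, not the CLI's real types): run the full getGraph validation, then discard only the missing-required-secret errors before reporting:

```typescript
// Hypothetical sketch: keep interpolation errors at register time but
// discard "required secret not provided" errors, since register can't
// accept secret values. ValidationError here is illustrative only.
interface ValidationError {
  path: string;
  message: string;
}

function dropRequiredSecretErrors(errors: ValidationError[]): ValidationError[] {
  return errors.filter(
    (e) => !/^Required secret '.+' was not provided$/.test(e.message),
  );
}

const errors: ValidationError[] = [
  { path: 'secrets.foo', message: "Required secret 'foo' was not provided" },
  { path: 'services.api.environment.BAR', message: 'Invalid interpolation ref: ${{ dependencies.api2.banana.salad }}' },
];
// Only the interpolation error survives the filter.
console.log(dropRequiredSecretErrors(errors));
```

The message pattern mirrors the "Required secret 'foo' was not provided" output shown above; whether filtering after the fact is cleaner than a getGraph option is exactly the open question here.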

Dependencies are a bit complicated to handle. The reason I handled it that way is that a component is allowed to be registered before its dependencies, so even though the URL may look correct, the program still yields an error.

  21 |     environment:
  22 |       FOO: ${{ secrets.foo }}
› 23 |       BAZ: ${{ dependencies.api2.services.api.interfaces.main.url }}
     |                ﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋﹋ Invalid interpolation ref: ${{ dependencies.api2.services.api.interfaces.main.url }}
  24 | 

I agree the current validateInterpolation function doesn't do what the name says.

TylerAldrich commented 1 year ago

One bug I found as well is with this YAML:

services:
  api:
    build:
      context: .
    interfaces:
      main:
        port: 8080
        ingress:
          subdomain: hello
    environment:
      FOO: ${{ secrets.foo }}
      BAR: ${{ services.api2.interfaces.banana.port }}
      BAZ: ${{ dependencies.api3.interfaces.main.post }}

  api2:
    build:
      context: .
    interfaces:
      main:
        port: 8000

I'm referencing a dependency with no dependencies block, but the error is:

Cannot use 'in' operator to search for 'api3' in undefined
TypeError: Cannot use 'in' operator to search for 'api3' in undefined
    at findInterpolationErrors (/Users/tyler/code/architect/architect-cli/src/dependency-manager/spec/utils/spec-validator.ts:373:28)
    at validateInterpolation (/Users/tyler/code/architect/architect-cli/src/dependency-manager/spec/utils/spec-validator.ts:410:26)
    at ComponentRegister.registerComponent (/Users/tyler/code/architect/architect-cli/src/commands/register.ts:177:26)
    at async ComponentRegister.run (/Users/tyler/code/architect/architect-cli/src/commands/register.ts:130:7)
    at async ComponentRegister._run (/Users/tyler/code/architect/architect-cli/node_modules/@oclif/core/lib/command.js:80:22)
    at async Config.runCommand (/Users/tyler/code/architect/architect-cli/node_modules/@oclif/core/lib/config/config.js:272:25)
    at async Object.run (/Users/tyler/code/architect/architect-cli/node_modules/@oclif/core/lib/main.js:76:5)

which isn't ideal 😛.

That's a good point about dependencies, though - it looks like service references work properly and it's just dependencies that don't, so it may be totally fine to do things this way.

mtgoncalves1 commented 1 year ago

Thank you for catching that bug! I'll look into it.

mueschm commented 1 year ago

I worry that this approach is kind of an odd band-aid for the situation. It relies very heavily on understanding the spec rather than validating the actual connections.

If we could create mock nodes for dependencies in the graph, with a concept of loose validation, I think most of the current logic would just work. This could reduce the number of regular expressions this system needs and let us think of the validation purely in terms of a graph rather than a spec.

For loose validation, I'm thinking nodes could be marked as being in an unknown state, and then we can dynamically use the rest of the spec to validate. The hard part is the context map. What we could do is dynamically parse the spec and use * for any value, so the context map might include:

dependencies.valid.services.*.interfaces.*.host

These values would be generated from the spec TS files. I believe the rest of the validation logic would then hold, without having to worry about regular-expression parsing. The other nice part is that we know when things are dictionaries, so pretty much all of those context-map values can be generated by reading the spec at runtime.

My main concern is that, based on our current roadmap, our graph structure is likely going to change a lot, and this feels too brittle to me at the moment.
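As a rough sketch of the wildcard context-map matching described above (hypothetical helper and key names, not the CLI's real context map), a ref can be matched against wildcard keys segment by segment:

```typescript
// Hypothetical sketch of loose validation against wildcard context keys,
// where '*' stands in for any dictionary key (a service name, an interface
// name, ...). Illustrative only, not the CLI's real context map.
function refMatchesKey(ref: string, key: string): boolean {
  const refParts = ref.split('.');
  const keyParts = key.split('.');
  return (
    refParts.length === keyParts.length &&
    keyParts.every((part, i) => part === '*' || part === refParts[i])
  );
}

const contextKeys = [
  'dependencies.*.services.*.interfaces.*.host',
  'dependencies.*.services.*.interfaces.*.url',
];

const ref = 'dependencies.valid.services.auth.interfaces.main.url';
console.log(contextKeys.some((key) => refMatchesKey(ref, key))); // true
```

With keys like these generated from the spec files at runtime, a ref such as `dependencies.api2.banana.salad` would match no key and be rejected without any hand-written regular expressions.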

mtgoncalves1 commented 1 year ago

> I worry that this approach is kind of an odd band-aid for the situation. It relies very heavily on understanding the spec rather than validating the actual connections.
>
> If we could create mock nodes for dependencies in the graph, with a concept of loose validation, I think most of the current logic would just work. This could reduce the number of regular expressions this system needs and let us think of the validation purely in terms of a graph rather than a spec.
>
> For loose validation, I'm thinking nodes could be marked as being in an unknown state, and then we can dynamically use the rest of the spec to validate. The hard part is the context map. What we could do is dynamically parse the spec and use * for any value, so the context map might include:
>
> dependencies.valid.services.*.interfaces.*.host
>
> These values would be generated from the spec TS files. I believe the rest of the validation logic would then hold, without having to worry about regular-expression parsing. The other nice part is that we know when things are dictionaries, so pretty much all of those context-map values can be generated by reading the spec at runtime.
>
> My main concern is that, based on our current roadmap, our graph structure is likely going to change a lot, and this feels too brittle to me at the moment.

@TylerAldrich I spoke with Michael and this seems to be a good approach. What do you think?

TylerAldrich commented 1 year ago

> I worry that this approach is kind of an odd band-aid for the situation. It relies very heavily on understanding the spec rather than validating the actual connections. If we could create mock nodes for dependencies in the graph, with a concept of loose validation, I think most of the current logic would just work. This could reduce the number of regular expressions this system needs and let us think of the validation purely in terms of a graph rather than a spec. For loose validation, I'm thinking nodes could be marked as being in an unknown state, and then we can dynamically use the rest of the spec to validate. The hard part is the context map. What we could do is dynamically parse the spec and use * for any value, so the context map might include:
>
> dependencies.valid.services.*.interfaces.*.host
>
> These values would be generated from the spec TS files. I believe the rest of the validation logic would then hold, without having to worry about regular-expression parsing. The other nice part is that we know when things are dictionaries, so pretty much all of those context-map values can be generated by reading the spec at runtime. My main concern is that, based on our current roadmap, our graph structure is likely going to change a lot, and this feels too brittle to me at the moment.
>
> @TylerAldrich I spoke with Michael and this seems to be a good approach. What do you think?

That sounds reasonable to me; it seems to address my concerns about the looseness of validateInterpolation 👍

mtgoncalves1 commented 1 year ago

> I worry that this approach is kind of an odd band-aid for the situation. It relies very heavily on understanding the spec rather than validating the actual connections. If we could create mock nodes for dependencies in the graph, with a concept of loose validation, I think most of the current logic would just work. This could reduce the number of regular expressions this system needs and let us think of the validation purely in terms of a graph rather than a spec. For loose validation, I'm thinking nodes could be marked as being in an unknown state, and then we can dynamically use the rest of the spec to validate. The hard part is the context map. What we could do is dynamically parse the spec and use * for any value, so the context map might include:
>
> dependencies.valid.services.*.interfaces.*.host
>
> These values would be generated from the spec TS files. I believe the rest of the validation logic would then hold, without having to worry about regular-expression parsing. The other nice part is that we know when things are dictionaries, so pretty much all of those context-map values can be generated by reading the spec at runtime. My main concern is that, based on our current roadmap, our graph structure is likely going to change a lot, and this feels too brittle to me at the moment.
>
> @TylerAldrich I spoke with Michael and this seems to be a good approach. What do you think?
>
> That sounds reasonable to me; it seems to address my concerns about the looseness of validateInterpolation 👍

Cool cool. Thank you!

mtgoncalves1 commented 1 year ago

The symbol ø is being used instead of * (dependencies.valid.services.ø.interfaces.ø.host) because * is already used as a binary operator in the parseExpression logic. Let me know if another symbol would be more suitable for this wildcard, but the following do not work: **, !, ?, ~.
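For illustration, a minimal sketch of the operator clash (the tokenizer check here is hypothetical, not the actual parseExpression logic):

```typescript
// Hypothetical illustration of why '*' can't double as the wildcard:
// in expression contexts it already means multiplication, so a context key
// like 'services.*.port' would be ambiguous to the tokenizer. A character
// with no operator meaning, such as 'ø', avoids the clash.
const BINARY_OPERATORS = new Set(['+', '-', '*', '/']);

// A segment is usable as the wildcard only if the expression tokenizer
// would never treat it as an operator.
function isSafeWildcard(symbol: string): boolean {
  return !BINARY_OPERATORS.has(symbol);
}

console.log(isSafeWildcard('*')); // false - claimed by arithmetic
console.log(isSafeWildcard('ø')); // true  - free to mean "any key"
```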