Closed britton-from-notion closed 1 month ago
Thanks @britton-from-notion for the detailed write-up! It shouldn't take much effort to get this working because the necessary parts already exist.
> Where my idea might deviate from this is that I would hope to see this functionality built into a single substation binary. I would prefer to be able to write substation test *.jsonnet directly in my CLI and also be able to write substation build *.jsonnet to build a substation app or substation run *.json to run it (this could be opened as a separate issue, since I know this deviates from how substation works. But a central, vended, entrypoint for my substation workflow including unit testing mentally clicks a lot better in my head than a dedicated unit testing CLI.)
Agreed that this is a separate issue. I think what you're describing is a utility application for managing source code and configurations. Most of the apps under cmd/development could fit within that, with some changes. For example:
- substation build -- builds config files (wraps jsonnet)
- substation fmt -- formats config files (wraps jsonnetfmt)
- substation test -- runs config unit tests
- substation version -- pull local release version (from github metadata)

I'm not sure what the behavior of substation run would be, but it might encapsulate the existing file and Kinesis development apps. Since deployments are decentralized and managed by Terraform, it's unlikely that it would ever support a command like substation deploy.
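As a rough illustration of the single-entrypoint idea, here is a minimal Go sketch of a subcommand dispatcher covering the four commands listed above. Everything in it is an assumption made for illustration: the `command` type, the `stub` handlers, and the exit codes are hypothetical and do not reflect how Substation is actually structured. `run` and `deploy` are left out on purpose, since their behavior is undecided or unlikely.

```go
package main

import (
	"fmt"
	"os"
	"sort"
)

// command is a hypothetical subcommand handler; it returns a process exit code.
type command func(args []string) int

// commands mirrors the subcommands listed above. The handlers are stubs:
// a real tool would wrap jsonnet, jsonnetfmt, and a config test runner.
var commands = map[string]command{
	"build":   stub("build config files (would wrap jsonnet)"),
	"fmt":     stub("format config files (would wrap jsonnetfmt)"),
	"test":    stub("run config unit tests"),
	"version": stub("print the local release version"),
}

// stub returns a placeholder handler that only reports what it would do.
func stub(desc string) command {
	return func(args []string) int {
		fmt.Printf("%s: %v\n", desc, args)
		return 0
	}
}

// usage prints the available subcommands in alphabetical order.
func usage() {
	names := make([]string, 0, len(commands))
	for name := range commands {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println("usage: substation <command> [args]")
	for _, name := range names {
		fmt.Println("  " + name)
	}
}

// dispatch routes the first argument to its handler. It returns 0 after
// printing usage when no command is given, and 2 for an unknown command.
func dispatch(args []string) int {
	if len(args) == 0 {
		usage()
		return 0
	}
	cmd, ok := commands[args[0]]
	if !ok {
		fmt.Printf("unknown command %q\n", args[0])
		usage()
		return 2
	}
	return cmd(args[1:])
}

func main() {
	os.Exit(dispatch(os.Args[1:]))
}
```

Keeping each subcommand behind the same function signature would let the existing apps under cmd/development be moved in one at a time.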
Is your feature request related to a problem? Please describe.
When looking into adopting Substation as a potential solution, one of the pieces that made it difficult to commit fully was the absence of built-in unit testing to ensure that transforms function as expected prior to being deployed.
I was hoping to see unit tests that would allow me to write a transform for a given data source and on PR or as part of some sort of command line utility, execute that unit test to verify that my transform is in a working state. This test would ideally give a success or failure output right in the console and an exit code. This would help me to trust my changes as well as expedite the collaboration process with teammates.
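To make the "success or failure output and an exit code" behavior concrete, here is a minimal Go sketch of a test runner. It is not Substation code: the `testCase` and `transform` types and the `runTests` function are hypothetical, and the transform is reduced to a plain string function to keep the example short.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// testCase pairs an input record with the output the transform should produce.
type testCase struct {
	name  string
	input string
	want  string
}

// transform is a hypothetical stand-in for a real Substation transform.
type transform func(string) (string, error)

// runTests prints one PASS or FAIL line per case and returns an exit code:
// 0 if every case passed, 1 otherwise.
func runTests(tf transform, cases []testCase) int {
	failed := 0
	for _, tc := range cases {
		got, err := tf(tc.input)
		switch {
		case err != nil:
			failed++
			fmt.Printf("FAIL %s: %v\n", tc.name, err)
		case got != tc.want:
			failed++
			fmt.Printf("FAIL %s: got %q, want %q\n", tc.name, got, tc.want)
		default:
			fmt.Printf("PASS %s\n", tc.name)
		}
	}
	if failed > 0 {
		return 1
	}
	return 0
}

func main() {
	// A toy transform used only for demonstration.
	upper := func(s string) (string, error) { return strings.ToUpper(s), nil }
	cases := []testCase{
		{name: "uppercases ascii", input: "foo", want: "FOO"},
		{name: "keeps digits", input: "a1", want: "A1"},
	}
	os.Exit(runTests(upper, cases))
}
```

Because the result is surfaced as the process exit code, the same command could gate a PR in CI without any extra tooling.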
Describe the solution you'd like
One of the components that I think could really enhance the adoption and utility of Substation is integrated support for unit testing transforms.
What I would hope to see is the ability to specify a unit test very close to the transform itself, whether in the same file or a neighboring file, where you could assert things like field foo matches string bar, or field foo is number. Any of the other conditional statements that are baked into the substation transform library could be used as validation checks to guarantee the output field matches the assertion.
Some ideas for what this could look like, from chatting about this on Slack (100% @jshlbrd's cool idea here):
I like that the data, tests, and transforms are all baked into a single file, making the connection between tests and their business logic very transparent.
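Separately from the idea above, the two assertions mentioned earlier (field foo matches string bar, field foo is number) could be checked against a transform's JSON output roughly as in the Go sketch below. The `assertion` type, the `equals` and `is_number` kinds, and the `check` function are all hypothetical names chosen for illustration; a real implementation would presumably reuse Substation's existing conditions rather than define new ones, and "matches" is simplified here to exact string equality.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// assertion is a hypothetical check on one top-level field of a JSON object.
type assertion struct {
	field string
	kind  string // "equals" or "is_number"
	value string // used only by "equals"
}

// check decodes output and returns nil if every assertion holds,
// otherwise an error describing the first failure.
func check(output []byte, asserts []assertion) error {
	var obj map[string]interface{}
	if err := json.Unmarshal(output, &obj); err != nil {
		return fmt.Errorf("decode output: %w", err)
	}
	for _, a := range asserts {
		v, ok := obj[a.field]
		if !ok {
			return fmt.Errorf("field %q is missing", a.field)
		}
		switch a.kind {
		case "equals":
			if s, isStr := v.(string); !isStr || s != a.value {
				return fmt.Errorf("field %q: got %v, want %q", a.field, v, a.value)
			}
		case "is_number":
			// encoding/json decodes JSON numbers into float64 by default.
			if _, isNum := v.(float64); !isNum {
				return fmt.Errorf("field %q is not a number", a.field)
			}
		default:
			return fmt.Errorf("unknown assertion kind %q", a.kind)
		}
	}
	return nil
}

func main() {
	// Sample transform output, made up for this example.
	output := []byte(`{"foo": "bar", "count": 3}`)
	err := check(output, []assertion{
		{field: "foo", kind: "equals", value: "bar"},
		{field: "count", kind: "is_number"},
	})
	fmt.Println("assertions passed:", err == nil)
}
```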
Where my idea might deviate from this is that I would hope to see this functionality built into a single substation binary. I would prefer to be able to write substation test *.jsonnet directly in my CLI and also be able to write substation build *.jsonnet to build a substation app or substation run *.json to run it (this could be opened as a separate issue, since I know this deviates from how substation works. But a central, vended, entrypoint for my substation workflow including unit testing mentally clicks a lot better in my head than a dedicated unit testing CLI.)
Describe alternatives you've considered
The alternatives I've considered are primarily built into other tools. A great example is Vector's built-in support for unit testing with their VRL language. In Vector, you can easily map an input source, a transform, and an output destination for your unit test. If the input data matches your output assertions, the unit test passes.
This feature allows you to create detailed, comprehensive unit tests that validate the presence and structure of each expected field. It is also run through the same executable that is used to run the pipelines, just a different CLI entrypoint - making the workflow super easy.
Additional context
https://vector.dev/docs/reference/configuration/unit-tests/
Example of a unit test in Vector that validates the existence of fields from the incoming data.