Closed benjdlambert closed 3 years ago
@spotify/backstage-core @spotify/techdocs-core I'd be interested in your thoughts. This is what I have so far from various meetings and some thinking. The API is not final, and definitely going to change with some of the recommendations here.
Really interesting direction for the scaffolder and I think it would solve most of our use-cases. Some comments/questions from me:
But as I said this would make it quite easy for us to add our company specific actions as steps in the templates. I had a question about events in the previous thread and if that doesn't become a core thing for Backstage we could add that ourselves using this flow.
Ok - a lot to run through here @mfrinnstrom thanks for the comments!
What about the input schema that we could specify before and then have the UI prompt the user for?
So off the bat, I think we would head down the route of being able to define the json-schema inline for each step, and then collect them all into the main form, condensing the duplicate fields. Maybe we'd also have a value map so that you can map input values from other parts of the form to different values in the steps.
Something like this:
apiVersion: backstage.io/v1alpha1
kind: Template
metadata:
name: react-ssr-template
title: React SSR Template
description: Next.js application skeleton for creating isomorphic web applications.
spec:
steps:
- name: Fetch React SSR Template Source
fetch: "github:https://github.com/spotify/backstage/path/to/template/source"
ref: $SSR_SOURCE
- name: Fetch mk-docs Template Source
fetch: "github:https://github.com/spotify/techdocs-core-template"
ref: $TECHDOCS_SOURCE
- name: Template core template
invoke: cookiecutter
schema:
required:
- component_id
- description
properties:
component_id:
title: Name
type: string
description: Unique name of the component
description:
title: Description
type: string
description: Help others understand what this website is for.
args:
input: $SSR_SOURCE
output: .
- name: Add techdocs
invoke: cookiecutter
schema:
required:
- docs_name
properties:
docs_name:
title: Name for Documentation
type: string
description: Name for the documentation
description:
title: Description
type: string
description: Help others understand what this website is for.
args:
input: $TECHDOCS_SOURCE
output: .
- name: Publish to Github
invoke: "publish:github"
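A minimal sketch of how the per-step schemas above could be collected into one form schema, condensing duplicate fields (like the shared description property) so the user is only prompted once. The names (StepSchema, collectFormSchema) are illustrative, not an existing scaffolder API:

```typescript
type Prop = { title: string; type: string; description?: string };
type StepSchema = {
  required?: string[];
  properties?: Record<string, Prop>;
};

// Merge each step's inline json-schema into one schema for the main form.
function collectFormSchema(steps: StepSchema[]): {
  required: string[];
  properties: Record<string, Prop>;
} {
  const properties: Record<string, Prop> = {};
  const required = new Set<string>();
  for (const step of steps) {
    for (const [key, prop] of Object.entries(step.properties ?? {})) {
      // The first definition of a duplicated field wins; later ones are condensed
      if (!(key in properties)) {
        properties[key] = prop;
      }
    }
    for (const key of step.required ?? []) {
      required.add(key);
    }
  }
  return { required: [...required], properties };
}
```

The open question would then be the value map: how to fan a single form value back out to differently-named inputs in each step.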
Or alternatively, we could define them at the root level like they are today, but that causes some problems: when we want to actually split the templating parts apart into re-usable steps, it's hard to do that when everything is defined at the top level.
If we only want to fetch files from a repo and not invoke a templater on them. Would that mean invoking a "move function" in this case or could we perhaps specify a path in the fetch step?
So I was thinking that the ref variable would be available in the context. You could either call out to a mv function in a node function, or maybe we could add another step method called sh and you could do something like this?
steps:
- name: Fetch React SSR Template Source
fetch: "github:https://github.com/spotify/backstage/path/to/template/source"
ref: $SSR_SOURCE
- name: Move to current directory
sh: 'mv $SSR_SOURCE ./something'
Also not sure if just providing another option to the fetch step would work too, so that you could specify the directory to move it into rather than tmp, for example?
Could there be a way for an invoke step to fetch information from an external source and make that available in the context as well to have that available to the following steps? This would allow us to fetch information (AWS account ids perhaps) and use them later on in the template.
Yeah, I still think that we need the ability to return things from these functions that also get added to the context. Not sure yet how that would change from the current usage, but I'm thinking it could work as it does today in the jobProcessor.
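One way to picture this: each invoked function may return a JSON object that gets merged into a shared context, so later steps can read it. This is a hypothetical sketch (StepFn and runSteps are illustrative names, not the current jobProcessor API):

```typescript
type Ctx = Record<string, unknown>;
type StepFn = (ctx: Ctx) => Promise<Ctx | void>;

// Run steps in order, merging each step's returned object into the context.
async function runSteps(steps: StepFn[]): Promise<Ctx> {
  let ctx: Ctx = {};
  for (const step of steps) {
    const out = await step(ctx);
    if (out) {
      ctx = { ...ctx, ...out }; // outputs become available to following steps
    }
  }
  return ctx;
}
```

This would cover the AWS-account-lookup use case above: an early step fetches the ids and returns them, and later steps pick them up from the context.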
While I appreciate the flexibility of the proposed model, I'm wondering if we could get away with something a bit more constrained if we split creating repositories from composable software templates? As I haven't dug into the use-cases or the current implementation too much, I might also be missing some obvious thing but was thinking something along those lines:
This would essentially flatten every template to just have one step. It would also mean that the templates can't refer to the working directory or to other templates running at the same time, but that might be a feature? Otherwise it might be hard to tell which templates are applicable where, in what order, with what inputs etc.
Of course, those two steps could still be combined in the UX. Step 1: Select an existing repo or create a new one. Step 2: Select one or several templates to scaffold into the repo. Step 3: Click the link to go to the generated PR, review and merge it.
@benjdlambert thanks for the clarifications! One thing I'm not sure about though.
Or alternatively, we could define them at the root level like they are today, but that causes some problems: when we want to actually split the templating parts apart into re-usable steps, it's hard to do that when everything is defined at the top level.
Would this mean you can reference one template from another one or how would this look? My thinking was that I didn't have to register all the repos that I use in a template in Backstage but I guess I'm missing something here.
Regarding the fetching of things without running a templater on them I meant something like this.
steps:
- name: Fetch React SSR Template Source
fetch: "github:https://github.com/spotify/backstage/path/to/template/source"
path: ./something
I probably won't need to reference that then. Maybe fetching could just be a function to invoke as well?
@mfrinnstrom I think it's better practice that the steps context only includes the form values that are presented in the schema, but maybe special keys like refs could be passed through.
fetch will go and grab the contents and extract it somewhere, then you could set the ref as $SOME_SOURCE, and the context would have some way to reference this path somehow. The $SOME_SOURCE variable can then be used in the yaml.
fetch could be just an invoke, but it could be a pain to manage the authorization in these functions too, whereas we could take care of it as a first class citizen like the Service Catalog does.
I probably won't need to reference that then. Maybe fetching could just be a function to invoke as well?
I can still see a use case where you might want to reference that path later on for some reason. I'm saying it's better to have the choice?
@mbruggmann I'm thinking that the current implementation that we have is very similar to your proposed solution. You can define a template which has some skeleton and a templater, and then the publish step is statically defined right now from the frontend. It works pretty well, but there's been an increase in requests for flexibility and more customisation, so I think inverting the control back to the end-user rather than trying to be opinionated is a better option.
Purely because I think there is no one size fits all for every company.
I am firmly in the camp that it should be at least somewhat opinionated, and it shouldn't just become a new Github Actions even though it looks similar; it should just be for creating or updating a repository at the end, but where that lies is up to the implementer.
This would essentially flatten every template to just have one step.
This has also come up that it can be really hard to see what failed and why, and it would be a better user experience to break down these things into clearer sections in the frontend. I think it could be pretty hard to do that without having these explicit steps.
Running on existing repositories, or code that exists and creating PR's for the changes
This is one of the more compelling ideas for us in this RFC. We already have a large established set of repos so making changes across them would be very useful. Especially if it was possible to do as a bulk change across multiple repos.
As an example, if we wanted to increase the entity schema version across all of the catalog-info.yaml files in each repo. Another might be adding a docs-like-code setup for existing repos.
@andrewthauer A constraint we put in place for now is to only consider user-initiated flows, i.e. no dependabot-type things.
Still very much possible that we can support what you suggest though, but bulk changes across repos will require a bit of feature creep.
@mbruggmann Splitting things out into a PR-driven flow makes it really flexible, but I do worry a bit about the DX of having to sit and click through 5 different templates and creation steps. My big concern is where that puts the flexibility though. I think we're searching for a solution that provides flexibility for the integrators and template creators, i.e. the people that are running an organization's instance of Backstage. We're not looking for flexibility for the end user, with the exception of some knobs that are deliberately put in place by the template creators.
@benjdlambert Regarding the schema, I'm kinda leaning towards having the yaml structure decide how to run different steps with different input/output. Probably a recursive schema with some limitations. What are your thoughts on something like this?
spec:
schema:
properties:
monitoring:
title: Monitoring Bundle
type: boolean
description: Checking this will also create a Grafana dashboard for your service
steps:
- name: React SSR Template Source
# The output of a template step is merged into the outer working directory
template:
- name: Fetch React SSR Template
fetch: github.com/spotify/backstage/path/to/template/source
- name: Run Cookiecutter
invoke: cookiecutter
schema:
properties:
name:
title: Name
type: string
required: true
description: Unique name of the component
description:
title: Description
type: string
required: true
description: Help others understand what this website is for.
# Adds TechDocs with a suggested documentation structure, but no additional templating
- name: TechDocs Template Source
template:
- name: Fetch TechDocs Template Source
fetch: github.com/spotify/techdocs-core-template
- name: Setup Monitoring
invoke: "monitoring" # Custom invoke added by the org
if: '.monitoring' # Some basic conditionals mapped from user input
- name: Publish to Github
invoke: "publish:github"
schema:
properties:
repoSlug:
title: Repo Slug
type: string
required: true
component: GithubRepoSlug # Mapped to a custom input component on the frontend
description: The GitHub <org>/<repo> for this component.
Top-level properties of each step in this format:
@Rugvip I like this.
From what I remember, the required part of json schema is a little different from what you have listed here, however. It might look something like the following:
spec:
schema:
properties:
monitoring:
title: Monitoring Bundle
type: boolean
description: Checking this will also create a Grafana dashboard for your service
steps:
- name: React SSR Template Source
# The output of a template step is merged into the outer working directory
template:
- name: Fetch React SSR Template
fetch: github.com/spotify/backstage/path/to/template/source
- name: Run Cookiecutter
invoke: cookiecutter
schema:
required: ['name', 'description']
properties:
name:
title: Name
type: string
description: Unique name of the component
description:
title: Description
type: string
description: Help others understand what this website is for.
# Adds TechDocs with a suggested documentation structure, but no additional templating
- name: TechDocs Template Source
template:
- name: Fetch TechDocs Template Source
fetch: github.com/spotify/techdocs-core-template
- name: Setup Monitoring
invoke: "monitoring" # Custom invoke added by the org
if: '.monitoring' # Some basic conditionals mapped from user input
- name: Publish to Github
invoke: "publish:github"
schema:
required: ['repoSlug']
properties:
repoSlug:
title: Repo Slug
type: string
component: GithubRepoSlug # Mapped to a custom input component
@Rugvip how would we pass variables into the different invoke parts?
Some comments/questions from me.

- We have a name property in many places, but for some reason they end up with different titles or descriptions.
- The repoSlug property is decided by the publish:github function and we can't really change that, right? What if another function decides to call it repo_slug? We can't change either of them, and we end up with both of them prompted to the user. I guess this ties back to @benjdlambert's question about passing variables into the invoke parts.
- Should the conditional be $.monitoring instead of just .monitoring (or perhaps just monitoring)?
- The component part of the publish to GitHub step made it clear to me where we would extend this to support our custom selections and pre-filled values. This is something I imagine won't change much between templates, so I thought maybe this could be configured when registering the publish:github function in the backend, but then I don't see a clear way to reference that in the templates. It would also be good if the component part gets handled up front, so that the user can actually see the value before proceeding.

Our Backstage app will be running using ECS Fargate on AWS. I'm now looking into running the templater step of the scaffolder using ECS Fargate as well, instead of a local Docker instance. To make it easy to share the prepared files (that are downloaded in the Backstage container) with the templater step that will run in a separate container, we intend to use EFS and mount a shared filesystem between the containers.
@mfrinnstrom it would be really interesting if you helped write down how to get Backstage up and running on Fargate, so that others can follow the same model. Maybe in the form of a tutorial? Example: https://backstage.io/docs/tutorials/quickstart-app-auth
First, @stefanalund I'm sorry for missing this. I will see if I can extract at least the CloudFormation template that we have for testing right now (it won't be production ready) and a short description of how to use it.
@benjdlambert & @Rugvip I guess you are discussing this internally but we had some discussions around this on our end yesterday and I thought I should at least try and document our thoughts.
I still like the idea of a clear "contract" for this specific template at the top, but then you will have to map those values to each function invocation. I think that's something that needs to be supported anyway.
In the example below everything is a function that you invoke, there are no built-ins. I, as an operator, would have to configure all the functions that I would like to have available in my instance. This is similar to how it is today where we have to wire up the scaffolder with the templaters, preparers and publishers that we want to have.
Each function would have the same interface: it takes a JSON object as input and returns a JSON object as output. There will of course be different required values for each function. To make it known at runtime what functions (and their inputs and outputs) are available, I'm thinking of something similar to what has been done with the config schemas. This could perhaps be shown on a Help page for the scaffolder. I guess there could be some sort of validation functionality for the templates against this as well. If possible, this is something we would want to run during our CI/CD pipeline for the templates.
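The CI/CD validation could be as simple as checking each step's invoke key and parameters against the declared functions. A hypothetical sketch, assuming a registry that declares each function's required parameters (none of these names are an existing Backstage API):

```typescript
type FunctionDecl = { requiredParameters: string[] };
type TemplateStepDef = {
  id: string;
  invoke: string;
  parameters?: Record<string, unknown>;
};

// Validate template steps against the functions registered in this instance,
// returning a list of human-readable errors (empty = valid).
function validateTemplate(
  steps: TemplateStepDef[],
  registry: Record<string, FunctionDecl>,
): string[] {
  const errors: string[] = [];
  for (const step of steps) {
    const decl = registry[step.invoke];
    if (!decl) {
      errors.push(`Step '${step.id}' invokes unknown function '${step.invoke}'`);
      continue;
    }
    for (const param of decl.requiredParameters) {
      if (!(param in (step.parameters ?? {}))) {
        errors.push(`Step '${step.id}' is missing required parameter '${param}'`);
      }
    }
  }
  return errors;
}
```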
We also introduced the id for each step to be able to reference the output of a previous step and use that as parameters for another step.
spec:
parameters:
dataSchema:
type: object
required: ['name', 'description', 'repoSlug']
properties:
name:
title: Name
description: Unique name of the component
type: string
description:
title: Description
description: Help others understand what this website is for.
type: string
repoSlug:
title: Repo Slug
type: string
component: GithubRepoSlug # Mapped to a custom input component
monitoring:
title: Monitoring Bundle
description: Checking this will also create a Grafana dashboard for your service
type: boolean
default: true
uiSchema: # If there is a need to do some custom config of the UI
steps:
- id: network
name: Lookup network config
invoke: custom:networkLookup # Should we namespace functions? Maybe not if there should be no built-ins
- id: template
name: Some infrastructure template
invoke: template:cookiecutter
parameters:
source: github.com/spotify/backstage/path/to/template/source
destinationPath: ./
variables:
name: $.parameters.name # Maybe we could use JMESPath or something else
description: $.parameters.description
cidrBlock: $.network.cidr # References the previous step using its id
# Adds TechDocs with a suggested documentation structure, but no additional templating
- id: fetch-docs
name: TechDocs Template Source
invoke: backstage:fetch
parameters:
url: github.com/spotify/techdocs-core-template
destinationPath: docs/
- id: monitoring
name: Setup Monitoring
invoke: custom:monitoring # Custom invoke added by the organization
if: '$.parameters.monitoring' # Some basic conditionals mapped from user input. Maybe name this `condition`?
- id: publish
name: Publish to Github
invoke: publish:github
parameters:
source: ./ # If you want to publish only a part of the working directory
repo_slug: $.parameters.repoSlug
So that more or less sums up our thoughts right now!
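The `$.parameters.name` / `$.network.cidr` references in the example above could be resolved with something like JMESPath, as the comment suggests; a minimal hypothetical stand-in that only handles dotted paths might look like:

```typescript
// Resolve a `$.<id>.<field>` reference against a scope that contains the user
// parameters plus each previous step's output, keyed by step id.
// Values that don't start with `$.` are treated as literals and passed through.
function resolveRef(ref: string, scope: Record<string, any>): unknown {
  if (!ref.startsWith('$.')) {
    return ref;
  }
  return ref
    .slice(2)
    .split('.')
    .reduce<any>((acc, key) => (acc == null ? undefined : acc[key]), scope);
}
```

A real implementation would want JMESPath's richer expressions (filters, defaults), but this shows the shape of the lookup.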
Observed issue: When we have two instances of the backend, frontend requests to check the status of a running scaffolder job (via the GET /v1/jobs/ endpoint) can be routed to an instance that doesn't know about the job, since job state is only held in memory.
Potential solution: This could be solved by persisting the state of a job in the DB.
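One possible shape for that fix: put job state behind a store interface so that any backend replica can answer status requests. The interface and in-memory stand-in below are assumptions for illustration, not the actual scaffolder-backend schema; a real store would be backed by the plugin database.

```typescript
type JobStatus = 'PENDING' | 'RUNNING' | 'COMPLETED' | 'FAILED';
type JobRecord = { id: string; status: JobStatus; log: string[] };

interface JobStore {
  create(id: string): Promise<void>;
  setStatus(id: string, status: JobStatus, logLine?: string): Promise<void>;
  get(id: string): Promise<JobRecord | undefined>;
}

// In-memory stand-in for the sketch; swap the Map for DB queries in practice.
class InMemoryJobStore implements JobStore {
  private jobs = new Map<string, JobRecord>();

  async create(id: string): Promise<void> {
    this.jobs.set(id, { id, status: 'PENDING', log: [] });
  }

  async setStatus(id: string, status: JobStatus, logLine?: string): Promise<void> {
    const job = this.jobs.get(id);
    if (!job) throw new Error(`Unknown job ${id}`);
    job.status = status;
    if (logLine) job.log.push(logLine);
  }

  async get(id: string): Promise<JobRecord | undefined> {
    return this.jobs.get(id);
  }
}
```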
@errolpais Yep, definitely needs a fix and is in scope!
@mfrinnstrom Interesting, thanks for writing it up so thoroughly. I wonder if template "functions" could be catalogued too, registered as a TemplateStep kind with an according schema for their inputs and outputs, as you mention.
One drawback with the explicit parameter passing approach, combined with detaching template "workflows" from cookiecutter template repos, is that there will be a ton of mechanical parameter passing. If the cookiecutter template needs 20 parameters, you'll have to declare and pass them all in every workflow that uses it.
I also wonder how to best solve development and versioning. If you want to add or remove an input parameter in your cookiecutter template repo, how do you test and deploy that change together with updating and re-registering the different workflows that make use of it? π€
@freben Interesting idea with the TemplateStep. I guess there will be some sort of source code connected to some of them that needs to be executed. Maybe you only mean to have them registered for the schema, though?
I see your point about the parameters and I agree that it could be lots of parameter passing. I can see the beauty in having the schema for a step defined with the cookiecutter template and then not having to specify them for every template.
My concern though is when you have multiple of these steps in one template, and one calls it component_id, another component-id, and the last one repo-name. In practice we want the same value for all of them, but due to their different names the user will be prompted for three different values. I would at least be a little confused/irritated by that. Not sure if JSONForms has a solution for this.
One possible solution for it could be aliasing of parameter names for a specific step, but then we are almost back to parameter passing. I'm not sure what would happen if different steps declare that a parameter should be handled by different components, either.
Good point about versioning. My immediate thought is that a template could point to a git tag (or branch?) to have a stable reference. Then you can continue to develop in your master branch and when a new version is ready you tag it and then the templates can be updated when ready. I'm not sure if I'm totally sold on that solution though.
No, you are right. I was thinking that the TemplateStep could define (relevant parts of / references to) the actual implementation of the step in addition to the schema, and could be of several types. If you look at how GitHub actions are defined, they have three main types of action: docker, javascript and composite. I think for our scaffolder, we may want to initially support just one type: one that calls a function which the backend itself has to supply in a map that's given to the engine.
// in the backend package
import { promises as fs } from 'fs';
import os from 'os';
import path from 'path';

const tmpdirV1: NativeFunction = async ({ outputs }) => {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'scaffolder-'));
  outputs.set('path', dir);
  return async () => { await fs.rmdir(dir); }; // support for cleanup?
};

const readTreeV1: NativeFunction = async ({ context, inputs, outputs }) => {
  const { workDir } = context;
  const { sourceUrl, targetPath = '' } = inputs;
  const targetDir = path.join(workDir, targetPath);
  // mkdirp, then use the existing UrlReader for readTree etc
};

const engine = new ScaffoldingEngine({ nativeFunctions: { tmpdirV1, readTreeV1 } });
const router = createRouter({ engine });
apiVersion: backstage.io/v1alpha1
kind: TemplateStep
metadata:
name: tmpdir-v1
spec:
type: native # Other types could be envisioned here - docker for example, or bash (run locally)
uses: tmpdirV1 # A function name, as given to the engine
# (or a docker image name, if this were a docker type step, etc)
outputs:
path:
description: The full path to a newly generated (unique, empty) temporary directory
apiVersion: backstage.io/v1alpha1
kind: TemplateStep
metadata:
name: readTree-v1
spec:
type: native
uses: readTreeV1
inputs:
sourceUrl:
required: true
description: The full URL of the root of the tree to read
targetPath:
required: true
description: The path to store the resulting tree in - either absolute (if a temp dir) or relative (to the workdir)
outputs:
path:
description: The full path that the resulting tree was written to
apiVersion: backstage.io/v1alpha1
kind: Template
metadata:
name: default-fetch-cookiecutter-repo-v1
spec:
inputs:
# ...
steps:
- name: temp
uses: tmpdir-v1 # This is actually an entity ref, amounts to templatestep:default/tmpdir-v1
- name: get
uses: readTree-v1
params:
sourceUrl: https://github.com/my/templates/the-template
targetPath: ${steps.temp.path}
# ... etc
This is not a fully formed example, but it's what I had time to type out now :) I kinda like that it ends up using entity references.
When will the work on this be started, or should we have another RFC where we collect the findings from this RFC into one (or multiple) proposed solutions?
Apologies for the delay for an update on this, but we've split out the work into a milestone here https://github.com/backstage/backstage/issues?q=is%3Aopen+is%3Aissue+milestone%3A%22Scaffolder+out+of+Alpha%22.
It's very high level implementation wise, and they've been broken out into some form of epic or focus area.
Evaluating this RFC and the ideas that come with it is one of our higher priority tasks for the new year, as it's something we're focusing on in Q1, so we will shortly move the discussion around the different areas from here into the tickets in the milestone.
I'm going to close this RFC for now, and thank you all for the feedback so far. We'll be updating the tickets in the milestone with a little more detail over the coming weeks with our suggested path, would love to hear feedback there too.
I think the workflow approach is pretty interesting and powerful. Just for your info, the JHipster studio is capable of doing a lot of what you are asking for (lifecycle, hooks, composable steps, extensions, git repo creation, etc), so maybe you can get a few tips from them.
On another note, I am one of the creators of the JHipster IDE plugin, so I would be willing to work on IDE tooling (editor, code highlighting, code completion) once the grammar is stable.
@jbadeau jhipster studio looks super interesting. Wondering if you've played around with creating a scaffolder backend module like the yeoman module? https://github.com/backstage/backstage/tree/master/plugins/scaffolder-backend-module-yeoman
I have not had the time to look into the new backstage generator, but the syntax looks pretty similar to workflows like Tekton, Argo, etc.: reusable steps composed into a DAG where each step provides a schema for UI generation and validation. Pretty cool.
@jbadeau Any more thoughts on integrating JHipster Lite and Backstage? The combination of selecting an architecture and visually selecting the deployment infra would be powerful.
In short, it seems like there's a ton of overlap in the problem to be solved across Backstage Templates and JHipster Lite (e.g., auto-generation and auto-provisioning). IMHO, JHL has a better UI because it presents the dependency chain visually. The ideal would be to create a JHL toolkit that allows us to define an architecture template and its dependencies, then auto-generate a JHL-like UI for the teams to use.
Status: Open for comments
Background
Hey :wave:
So as you might be aware, we released our MVP for the Scaffolder a few moons ago, and internally have been overwhelmed with the amount of contributions and interest it has received.
Naturally, we've seen a lot of requests for features, and have slowly been seeing an increase in use-cases, some that might not have been considered in the original architecture of the Scaffolder.
If you haven't already seen the existing RFC about Extending the Scaffolder I'd recommend you go and do that before commenting on this RFC.
Problem
Looking at the previous RFC, I read through and gathered some key features and behaviour that the current Scaffolder implementation might not be able to support.
Although not our original plan to gather feedback around the scaffolder, thank you so much to the contributors that reached out in this RFC to list their requirements, ideas, and what they thought was missing. Your feedback is always welcome and helps us all on the journey to make a better product!
After I grouped the feedback into the above list, I think there's a way forward.
Opening up the ability to add more steps to the jobProcessor (3) would enable users to write custom steps that run after repository creation and pushing. This solves, in turn, the last three points of adding integrations, webhooks and setting things up after the repo has been created (4, 5, 6).

Another idea for improvement to the scaffolder is the composition of code from one step to another. If we treat the jobProcessor as something similar to a Github Actions workflow, where it assumes nothing about the steps, just that you start with an empty work directory, and that same directory is shared throughout each step, we can in turn solve the other points (1, 2).

Imagine that Templates now become the definition of these steps, and that you can then describe the steps explicitly in the .yaml, similar again to Github Actions. These steps can decide to pull data from other sources, like existing repositories, or even run additional logic for templating cookiecutter templates as part of the workflow. Then you can start to compose things into the current directory, and what you're left with is the end result.
<tldr />
Here's what we want to fix with the scaffolder: open up the jobProcessor and remove the fixed steps (https://github.com/spotify/backstage/blob/55542797a9cbb156792774c502b44155c985fa2a/plugins/scaffolder-backend/src/service/router.ts#L97-L139).
Solution
I think that the solution has many steps. I'm cautious of proposing an API change too early that we haven't even proven yet, so this proposal is twofold.
First off, I think the first step is to decouple the template definition from the skeleton or source of the template. It's become clear that sometimes templates have more than one source, and that templates are not re-usable when the source or skeleton is treated as the same thing.
I'm thinking that the templates that you have available to pick from, could be different variations of workflows rather than being tied to one source, one transform, one publish.
Some will just be simple, taking a source and republishing; some will template some source with cookiecutter; some might take data from multiple sources and, using both cookiecutter and handlebars, create one end result. It doesn't make sense that the Template provides only one source of data. You should have a single source of truth for source files (skeleton) that you can re-use between templates.

Right now, the skeleton for something like a cookiecutter template must be co-located with the initial definition of the Template Definition using the path key.

So we could add a new field which would reference the source required for the cookiecutter template that we're currently going to define, something like skeleton: github:https://github.com/spotify/some-template-here. However, I also think that this is going to become an anti-pattern. I think that separating the source and the type from the initial definition is something that would become its own entity at a later stage.

I'm proposing that we change the Template kind to become something that can solve our original 2 problems:

- Define the steps for the jobProcessor in each Template definition.
- fetch becomes a wrapper around some source somewhere, with maybe the authorization built in so you don't have to deal with that.
- Templaters and Publishers just become executable functions which you can invoke using the invoke key, referencing something that has been defined with the scaffolder at startup.
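The last point above can be sketched as a registry that the scaffolder is wired up with at startup, where the invoke key in a template step looks up the function to run. The names (InvokeFn, InvokeRegistry) are hypothetical, not the final scaffolder API:

```typescript
type InvokeFn = (
  args: Record<string, unknown>,
) => Promise<Record<string, unknown> | void>;

// Templaters and Publishers are registered as plain functions at startup;
// a template step's `invoke` key selects which one to run.
class InvokeRegistry {
  private fns = new Map<string, InvokeFn>();

  register(key: string, fn: InvokeFn): void {
    this.fns.set(key, fn);
  }

  async invoke(key: string, args: Record<string, unknown>) {
    const fn = this.fns.get(key);
    if (!fn) {
      throw new Error(`No function registered for invoke: ${key}`);
    }
    return fn(args);
  }
}
```

An integrator would then register something like registry.register('publish:github', ...) when wiring up the backend, much like templaters, preparers and publishers are wired up today.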