prerequisites - Githubissues

Prerequisites have been on the backlog for IE for some time. I am consolidating ideas for how prereqs can be solved in the short, medium, and long term, each involving a more nuanced but much-needed feature update to IE.

Note: This is for the CLI. Contract changes between IE and Portal would come after that.

Short Term

Ask on IE: Have the ability to transfer variables from one environment (this could be a doc) to another

Setup: parent doc references prereq doc 1, which references prereq doc 2
We include ie execute <remote URL to the raw markdown file for doc 1> as a code block within the prerequisites section of the parent doc. And we include ie execute <remote URL to the raw markdown file for doc 2> as a code block within the prerequisites section of doc 1.
During runtime, Innovation Engine will execute the parent doc, encounter the execution of doc 1 and go to that doc, encounter the execution of doc 2 and go to that doc, run doc 2 E2E, then run doc 1 E2E, and finally run the parent doc E2E.
Whenever IE executes an ie execute command inside any of these docs, its output would be shown in the CLI without any formatting. Furthermore, the variable names would need to be inferred from that IE output so that they can be carried forward into the parent doc.
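
As a rough illustration of the variable hand-off described above, here is a minimal bash sketch. It assumes the prereq run prints `export NAME=value` lines somewhere in its raw output; that output format, the function name `import_exported_vars`, and the sample values are all hypothetical and not something IE is confirmed to do today.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: pull `export NAME=value` lines out of raw prereq output
# and re-export them in the current (parent) shell.

import_exported_vars() {
  local output="$1" line name value
  while IFS= read -r line; do
    # Only accept lines that look like: export NAME=value
    if [[ "$line" =~ ^export[[:space:]]+([A-Za-z_][A-Za-z0-9_]*)=(.*)$ ]]; then
      name="${BASH_REMATCH[1]}"
      value="${BASH_REMATCH[2]}"
      # Strip one pair of surrounding double quotes, if present.
      value="${value%\"}"
      value="${value#\"}"
      export "$name=$value"
    fi
  done <<< "$output"
}

# Stand-in for the unformatted output of a nested `ie execute` run.
prereq_output='Creating resource group...
export RESOURCE_GROUP_NAME="myResourceGroup"
export REGION=eastus
Done.'

import_exported_vars "$prereq_output"
echo "$RESOURCE_GROUP_NAME in $REGION"
```

Scraping variables from free-form output is fragile, which is part of why the medium-term and long-term asks move toward a dedicated section and structured output.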

Medium Term

Ask on IE (in addition to short term ask): Have the ability to execute the bulleted list in the Prerequisites section in the back-end and provide a custom output with information from that run

Setup is the same as the short-term scenario
We include a designated section that should be consistent across all exec docs and call it, say, Prerequisites. That section would (optionally) contain a description and (mandatorily) contain a bulleted list of prereq docs (with their names and URLs).
IE would look for this section in the doc and run the prereq docs from it in sequence.
IE would provide a custom output in the CLI showing the progress of running the docs, the variables they output, the state of those variables, etc.
During runtime, Innovation Engine will execute the parent doc, encounter the execution of doc 1 and go to that doc, encounter the execution of doc 2 and go to that doc, run doc 2 E2E, then run doc 1 E2E, and finally run the parent doc E2E.
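
To make the designated section concrete, the sketch below extracts the prereq names and URLs from a doc. It assumes the section is a markdown heading named Prerequisites and that each bullet is a markdown link of the form `- [Name](URL)`; the thread does not fix either format, and `list_prereqs` and the example URLs are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: print "name<TAB>url" for each bulleted link under a
# "Prerequisites" heading. Heading name and bullet format are assumptions.

list_prereqs() {
  awk '
    # Any heading line either opens or closes the Prerequisites section.
    /^#+[ \t]/ { in_section = ($0 ~ /^#+[ \t]+Prerequisites[ \t]*$/); next }
    # Inside the section, only bullet lines that contain a markdown link count.
    in_section && /^[ \t]*[-*][ \t]/ && match($0, /\[[^]]*\]\([^)]*\)/) {
      link = substr($0, RSTART, RLENGTH)        # e.g. [Name](URL)
      split_at = index(link, "](")
      name = substr(link, 2, split_at - 2)
      url  = substr(link, split_at + 2, length(link) - split_at - 2)
      print name "\t" url
    }
  ' "$1"
}

doc="$(mktemp)"
cat > "$doc" <<'EOF'
# Parent doc

## Prerequisites

These docs must be run first.

- [Doc 1](https://example.com/doc1.md)
- [Doc 2](https://example.com/doc2.md)

## Steps

- [Not a prereq](https://example.com/other.md)
EOF

list_prereqs "$doc"
rm -f "$doc"
```

The bullets are emitted in document order, which matches the "run the prereq docs in sequence" requirement.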

Long Term

Ask on IE (in addition to short and medium term asks): Have the ability to validate if a prereq doc has already been run and transfer the necessary information from that to the parent doc if so

Setup is the same as the short-term scenario
The designated section and its functioning are the same as the medium-term scenario
There would be a validation section at the end of every doc that wants to be a prerequisite. This section should contain commands for IE to run to check if the infra promised in the doc's execution was actually deployed
In all such docs, IE would first check for the presence of the validation section. If it exists, IE would run it first, i.e. check whether the doc has already been run and the resources deployed. If so, it would store the details of the infra in environment variables and export them to the parent doc. Only if validation fails for some reason would IE run the prereq doc E2E.
IE will convey the information of the variables to the user such that they can use them in the parent doc

Given that prerequisites are high on IE's roadmap and multi-part exec docs need prerequisites to fully work, I would love to get the ball rolling on this one and implement some working solution while we scope out an ideal one. What does the group think?

From @vmarcella

I agree with most of the high-level structure, but I think there are a lot of gaps that need to be addressed for the PoC. Let's imagine how the process should go for a doc that has an arbitrary number of pre-requisites, where those pre-reqs also have an arbitrary number of nested pre-reqs:

1. IE checks for a pre-requisite section in the document.
2. IE finds pre-requisite documents and loads them in the order that they're specified in the bullet list.
3. IE checks each pre-requisite doc for a Validation section that contains code blocks (Also by header)
- If no validation section exists and there are no further pre-reqs in the pre-req doc we're currently checking, we just execute the pre-req doc as is.
- If no validation section exists and there are further pre-reqs in the pre-req doc we're currently checking, we load those pre-reqs and go back to step 2
- If validation section exists and there are no further pre-reqs in the pre-req doc we're currently checking, we run the validation section.
- - If validation fails, we execute the whole document.
- - If validation succeeds, we can skip executing the pre-req document.
- If validation section exists and there are further pre-reqs in the pre-req doc we're currently checking, we run the validation section
- - If validation fails, we start the process over at step 2 for this pre-requisite doc.
- - If validation succeeds, we can skip executing the pre-req document.
4. IE finishes running all of the pre-requisites
- If pre-reqs are satisfied, we begin the execution of the current document as normal.
- If pre-reqs are not satisfied, i.e., validation steps constantly fail or the pre-req doc steps constantly fail, we need to communicate the error
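
The branching above collapses to one rule: skip a pre-req whose validation passes, otherwise satisfy its own pre-reqs first and then execute it. A minimal bash sketch of that rule follows. The doc names, the `PREREQS` and `VALIDATION` tables, and `satisfy_prereqs` are hypothetical stand-ins for what IE would parse and run; failure reporting and retries are left out.

```bash
#!/usr/bin/env bash
# Hypothetical model of the steps above. Docs are plain names; PREREQS and
# VALIDATION stand in for what IE would parse from each document.
# Requires bash 4+ for associative arrays. Assumes no cyclic pre-reqs.

declare -A PREREQS=( [parent]="doc1 doc3" [doc1]="doc2" )
declare -A VALIDATION=( [doc1]="fail" [doc3]="pass" )   # doc2 has no section
EXECUTED=()

satisfy_prereqs() {
  local doc="$1" prereq
  for prereq in ${PREREQS[$doc]:-}; do
    # A passing validation section means the pre-req can be skipped.
    if [[ "${VALIDATION[$prereq]:-none}" == "pass" ]]; then
      echo "skip $prereq (validation passed)"
      continue
    fi
    # No validation section, or it failed: satisfy nested pre-reqs first,
    # then execute the whole pre-req doc.
    satisfy_prereqs "$prereq"
    echo "execute $prereq"
    EXECUTED+=("$prereq")
  done
}

satisfy_prereqs parent
echo "execute parent"
```

With the sample tables, doc2 runs before doc1 and doc3 is skipped, so the deepest unsatisfied pre-req always executes first.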

This logic makes the following assumptions:

Validation is responsible for exporting any variables that dependent documents need from the document being validated.
Validation is somewhat fast, taking no longer than 30-60 seconds per validation section on each doc.
There are no cyclic pre-requisites (i.e., a parent doc has a pre-req doc which specifies the parent doc as a pre-req, creating an infinite loop). IE will catch this in a production-ready version of the feature.
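
For reference, catching a cyclic pre-requisite is a standard depth-first search that tracks which docs are on the current path. The sketch below is one way to do it in bash, not a description of IE's implementation; `PREREQS`, `STATE`, and `has_cycle` are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: depth-first search over a pre-req graph that reports a
# cycle. PREREQS stands in for links parsed from each doc (bash 4+).

declare -A PREREQS=( [parent]="doc1" [doc1]="doc2" [doc2]="parent" )
declare -A STATE=()   # unset = unvisited, "visiting" = on current path, "explored"

has_cycle() {
  local doc="$1" prereq
  case "${STATE[$doc]:-}" in
    visiting) return 0 ;;   # reached a doc that is already on the current path
    explored) return 1 ;;   # fully explored earlier, no cycle through it
  esac
  STATE[$doc]="visiting"
  for prereq in ${PREREQS[$doc]:-}; do
    if has_cycle "$prereq"; then
      return 0
    fi
  done
  STATE[$doc]="explored"
  return 1
}

if has_cycle parent; then
  echo "cyclic pre-requisites detected"
else
  echo "no cycles"
fi
```

Marking docs as explored keeps shared pre-reqs (two docs depending on the same third doc) from being flagged as cycles.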

With this in mind, here are my questions:

If a pre-requisite doc was run multiple times, how should we choose which resources or values to use for the current document? Or if it's up to the user, how do we let them choose?
Following up on the last question, how should validation optimally find resources from previous executions of a document? What guidance should we provide to doc authors writing validation sections?
How do we prevent breakage that can occur from pre-req documents being updated? If a pre-req doc gets updated, and let's say no longer exports a variable in the validation section which was depended upon by the parent doc, how should this be handled? Can we prevent it?
If we have to execute multiple pre-requisite docs before executing the current document, that can potentially take 30+ minutes. What do we communicate to the user while that is happening? Would it be better to just have them execute those docs first and then come back to the current document if there are too many unsatisfied pre-requisites?
What should be communicated in the CLI while executing pre-requisite documents? If we include the entire output from pre-req runs, that will result in a ton of output.

From @rgardler-msft

For a first iteration it may make sense to say no nested pre-reqs. Not saying we should, just saying that taking an iterative approach may make it easier to get something out the door. Similarly, things like changes in the pre-req causing problems "upstream" are (probably) best left until they are an actual problem. I want to see all docs tested on a regular cadence; this means that such breakages would be automatically caught by the test framework, so IE need only report the break; it shouldn't worry about detecting it.

Remember we want to move forward at a pace, even if that means needing to accept that sometimes we will stumble. Of course, I'm not saying we don't need to consider all possibilities, only that we should feel comfortable saying some things are currently out of scope but need to be considered in a future iteration. It should be SWE that have final say on what can safely be left out of scope.

From @naman-msft

I went through all the points mentioned below. Given that, I agree with @rgardler-msft and would love to get on with a first iteration of this. TL;DR we pursue what you outlined @vmarcella with one change - assume there are no nested pre-reqs.

Here are my answers to your questions, @vmarcella, and my take on the path forward:

If a pre-requisite doc was run multiple times, how should we choose which resources or values to use for the current document? Or if it's up to the user, how do we let them choose?

If the prereq doc has a validation section that passes, then the resource values from running that validation section will be transferred. If the validation section fails, the values from that run of the prereq doc will be transferred. Allowing users to choose values (in the CLI experience) should happen via a parameter or equivalent in IE that allows you to filter for resource values or pass them as parameters for the prereq doc if it needs any. However, allowing users to choose parameters in the Portal experience would happen through the configurability parameters effort i.e. in the prerequisites section, for every doc requiring parameters, they would be displayed in Portal and parsed accordingly along with that doc.

Following up on the last question, how should validation optimally find resources from previous executions of a document? What guidance should we provide to doc authors writing validation sections?

How can we implement this from a dev standpoint? This was one of my asks on IE. I thought of this implementation from my end and here is one way we could do it:

This uses the command to query the RG and check whether the list of resources matches the expected result. Maybe we raise the expected similarity threshold so the match is stricter, or introduce something in IE that also checks whether the number of elements in the expected results list is the same as in the actual list. How would you solve it as an MVP?
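
As a sketch of the stricter check floated above, the function below passes only when the actual resource list has exactly the same entries as the expected one, which also implies the same count. This is plain bash rather than IE's expected-similarity mechanism, and `lists_match` and the resource names are hypothetical; in a real validation section the actual list would come from querying the RG.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: validation passes only if the actual resource list has
# the same names (and therefore the same number of entries) as the expected
# list, regardless of order.

lists_match() {
  local expected="$1" actual="$2"
  [[ "$(sort <<< "$expected")" == "$(sort <<< "$actual")" ]]
}

# Newline-separated resource names; stand-ins for real query results.
expected=$'myVnet\nmyVM\nmyDisk'
actual=$'myVM\nmyDisk\nmyVnet'

if lists_match "$expected" "$actual"; then
  echo "validation passed"
else
  echo "validation failed"
fi
```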

How do we prevent breakage that can occur from pre-req documents being updated? If a pre-req doc gets updated, and let's say no longer exports a variable in the validation section which was depended upon by the parent doc, how should this be handled? Can we prevent it?

This would just be an error while trying to execute the updated prereq doc and should be caught by the testing process and flagged to the author for remediation. That is the closest we can get to preventing it in the short term. However, why would the prereq doc no longer export a variable in the validation section without it being caught in testing? Also, are there more efficient ways of doing this? Is this even an issue to worry about right now? I don't know, let's experiment and find out.

If we have to execute multiple pre-requisite docs before executing the current document, that can potentially take 30+ minutes. What do we communicate to the user while that is happening? Would it be better to just have them execute those docs first and then come back to the current document if there are too many unsatisfied pre-requisites?

We would need the content folks to give their $0.02 on the verbiage since long wait times are not ideal. If the prerequisite docs are exec docs that have been deeplinked into Portal, we can prompt the users to run those docs first. But then how would state (the data at the end of the doc) be carried over to the parent doc at the end of its execution?

What should be communicated in the CLI while executing pre-requisite documents? If we include the entire output from pre-req runs, that will result in a ton of output.

This was captured in my ask for the medium term: "IE would provide a custom output in the CLI showing the progress of running the docs, the variables they output, the state of those variables, etc." Can IE detect that it's running a prereq doc and, based on that flag, modify the display in the shell? On the Portal side, we can mock up something quickly for the same.

From @mbifeld

I agree that the first iteration does not need to support a pre-req in a pre-req, for simplicity.
Regarding what IE displays when executing a pre-req, something like:
- When it finds a pre-req, it outputs "Checking pre-requisites..."
- - If validation steps pass, skip to the last step ("Pre-requisites have been met").
- - If validation fails, it outputs "A pre-requisite Exec Doc needs to be executed. Starting execution..." A later experiment can determine if it's better to ask the user whether it should execute the pre-req doc first or just do it.
- During execution of the pre-req, IE outputs the same as if it were a regular Exec Doc. If we're not doing pre-reqs in a pre-req, there shouldn't be too much output.
- After all pre-reqs have been met, we output "Pre-requisites have been met"
In regard to choosing the resources to validate against (such as in the case where the pre-req was run multiple times), we should allow for the top-level Exec Doc to have environment variables that are initialized prior to the pre-req validation being run. Essentially a "do this first no matter what". The author should ensure the variable names line up with those in the pre-req. In the screenshot you shared @Naman, $RESOURCE_GROUP_NAME would have to be defined before the validation step happens. Something like:

```bash
RESOURCE_GROUP_NAME=abc
```

We could alternatively have this initial declaration section occur in the pre-req doc. In this case, we execute 'initial' and then test 'validation' for the pre-req. But I feel that putting the variable declaration in the top-level doc allows for more flexibility for the author.

If a pre-req gets updated and breaks, the top-level doc will also break. It's up to the authors to fix, and us to report.

Azure / InnovationEngine

prerequisites #236