cmd, ci, docs: Add test script for get-started.md

patzielinski commented 1 month ago

This PR adds a test script to test the commands in docs/get-started.md, as suggested in #394. (Thanks @lukpueh!)

At the moment, this checks that the return codes of the commands tested are all 0's, and raises an error if a non-zero code is returned. I can rework this to match parts of the output if we think that's better/needed (parts since output is variable in places).

neilnaveen commented 1 month ago

Don't we already have somthing like this in the gittuf demo repo? Would we be able to update that demo repo with something like this, since currently, the demo repo is not in sync with the latest version of gittuf, and many things are failing if we do run it with the latest version of gittuf?

patzielinski commented 1 month ago

Yeah, the demo might not be up to date w.r.t. how to invoke gittuf so I'll be checking that out as well as writing a tester for the demo itself. This PR is to test the commands in the getting started doc, which more people will likely see compared to the demo.

adityasaky commented 1 month ago

cc @lukpueh

patzielinski commented 1 month ago

@lukpueh The output for the commands is variable due to the --verbose flag here which invokes the logger output (although that might go to stderr instead...) I can update it to match on part of the text but any change in commands or command output requires updating the script with the expected text anyways.

Maybe we can add an expected output text file that the script loads and compares to the terminal output so we can just update that file instead and have the script extract commands from the markdown file.

lukpueh commented 1 month ago

any change in commands or command output requires updating the script with the expected text anyways.

Sure, but CI will fail if the actual output doesn't match the expected output anymore.

lukpueh commented 1 month ago

Maybe we can add an expected output text file that the script loads and compares to the terminal output so we can just update that file instead and have the script extract commands from the markdown file.

Yes, that's what the script referenced in the issue does.

adityasaky commented 1 month ago

timestamps in the debug messages may be a small blocker for that? Perhaps this is a good time to pretty that up as well, though I think it's low priority...

patzielinski commented 1 month ago

timestamps in the debug messages may be a small blocker for that? Perhaps this is a good time to pretty that up as well, though I think it's low priority...

This is what I meant about variable output, as the timestamps prevent checking output equivalence. I can just disable the verbose output and match on the small amount of output we have anyways...

Maybe we can add an expected output text file that the script loads and compares to the terminal output so we can just update that file instead and have the script extract commands from the markdown file.

Yes, that's what the script referenced in the issue does.

A question though, is it better to have the output embedded in the script or as a separate file? The script referenced appears to do the former.

lukpueh commented 1 month ago

This is what I meant about variable output, as the timestamps prevent checking output equivalence. I can just disable the verbose output and match on the small amount of output we have anyways...

Ah right, it's harder if timestamps are in the output. There are several ways to go about this:

change gittuf to not print timestamps
run the test with something like faketime
strip timestamps from the output in the tests

A question though, is it better to have the output embedded in the script or as a separate file? The script referenced appears to do the former.

Whatever you prefer. If it's short enough, like in the in-toto demo, I think it's fine to just add it to the script. If it's a lot longer a separate file is probably better for readability of the script.

patzielinski commented 1 month ago

I tinkered a bit and I've transplanted much of the in-toto script to this one. It scrapes the commands properly, but there's quite a couple of issues:

In addition to timestamps causing variable output, hashes also present an issue as they vary with each time a repository is created / committed to.
- Removing timestamps from gittuf's logging removed the time variable output.
- Hashes in gittuf's log output were worked around by outright disabling verbose output.
- Commands that had variable output (i.e. git displaying commit hashes or ssh-keygen displaying randomart were silenced with -q)
A good place to run this script (imo) would be in the testing CI so that every time a change is made we see if we break the guide.
- The testing CI runs on 3 OSes and the command format appears to be problematic given that each one has its own way of doing things (see most recent CI run - it passes locally on Ubuntu but fails on GH's Mac instance)

I don't think diffing the output is a great idea as we have too many variable outputs that are useful for the end-user but make testing complicated.

Additionally, get-started.md has a good amount of content that had to be ignored in this script due to being problematic (installing "old" gittuf versions or incomplete commands that need users to complete them)

If we want to run this as a part of CI to keep checking whether the guide commands work or not, I think manually keeping this tester in sync is the better option.

adityasaky commented 1 month ago

Can we make hashes deterministic by controlling the author/committer details (including the time) and signing with an ed25519 key? You can pin those details using env vars.

patzielinski commented 1 month ago

Can we make hashes deterministic by controlling the author/committer details (including the time) and signing with an ed25519 key? You can pin those details using env vars.

I can try that, but do we want to update the commands in the markdown to match? This should be to test the commands as the user would run them with only minor changes imo...

lukpueh commented 1 month ago

Can we make hashes deterministic by controlling the author/committer details (including the time) and signing with an ed25519 key? You can pin those details using env vars.

I can try that, but do we want to update the commands in the markdown to match? This should be to test the commands as the user would run them with only minor changes imo...

Setting up custom author/committer details and a fake time shouldn't affect the testable snippets. Key type maybe, but I guess that's okay.

Btw. after #414, rsa keys will produce reproducible signatures too, because we'll no longer sign ourselves (with PSS), but ask ssh-keygen to sign, which uses (pkcsv1.5).

patzielinski commented 2 weeks ago

I've fixed up the testing code here and marked this as ready for review - it now scrapes output correctly as I've managed to make git's hashes reproducible. I couldn't do the same for gittuf's hashes, as faketime does not support golang programs. In this case, I've just disabled verbose output for the verification - gittuf prints out an error if it can't verify something which will get picked up by the script and I don't think it's worthwhile to add time hackery code to gittuf just for this. The tests are in their own .yml file and run on Ubuntu and macOS.

@lukpueh let me know if you have any thoughts / suggestions on how this looks at the moment.

adityasaky commented 1 week ago

@lukpueh could you take a look and approve if this is good to go? thanks!

adityasaky commented 1 week ago

Thanks @patzielinski!

gittuf / gittuf

cmd, ci, docs: Add test script for get-started.md #405