Closed abejgonzalez closed 1 year ago
Currently, there are issues with running a screen session. See "Must be connected to a terminal" in https://github.com/firesim/FireMarshal/actions/runs/3298904128/jobs/5441742036#step:3:70
For more context:
This PR is adding CI to run FireMarshal's fullTest.py script, which should run the entire default FireMarshal test suite. Under the hood, this fullTest script runs marshal test <...> for a variety of workloads found in test/. The current issue is that this marshal test <...> command creates a screen session (see ref 1) that can't run in the current GH-A shell. This is the error that is seen: https://github.com/firesim/FireMarshal/pull/254#issuecomment-1287239154. I don't know if this issue is due to needing to call subprocess.Popen with the shell=True flag, or if I need to somehow change the default shell of GH-A to allow for screen sessions. If this issue is resolved (i.e. screen can be called within marshal test ... within a GH-A shell), then I expect fullTest.py to work and for us to have a functioning CI workflow for this repo!
Any help on this would be appreciated. Otherwise I can look into it later (in a month or so).
TLDR: screen doesn't seem to work either within a Python script (using subprocess.Popen) or when called directly from the cmdline. If someone can figure out what this issue is, then this CI should work.
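One possible angle, as a sketch rather than a confirmed fix: screen prints "Must be connected to a terminal" when it is asked to attach to a session but no TTY is present, which is typically the case in a CI runner. Passing shell=True to subprocess.Popen only changes how the command is parsed; it does not allocate a terminal. Starting screen detached with -d -m avoids the attach step. The helper name screen_command, the session name, and the echo placeholder command below are all hypothetical and are not taken from FireMarshal's code.

```python
import sys
from typing import List, Optional


def screen_command(session: str, cmd: List[str],
                   has_tty: Optional[bool] = None) -> List[str]:
    """Build a screen invocation that also works without a terminal.

    Hypothetical helper: if no TTY is attached (e.g. in a CI job),
    start the session detached with `-d -m` so screen never tries
    to attach to a terminal.
    """
    if has_tty is None:
        # Detect whether stdin is connected to a terminal.
        has_tty = sys.stdin.isatty()
    if has_tty:
        # Interactive case: create a named session and attach to it.
        return ["screen", "-S", session] + cmd
    # Non-interactive case: create the named session detached.
    return ["screen", "-d", "-m", "-S", session] + cmd


if __name__ == "__main__":
    # Only print the commands; nothing is launched here.
    print(screen_command("marshal-test", ["echo", "hello"], has_tty=True))
    print(screen_command("marshal-test", ["echo", "hello"], has_tty=False))
```

The resulting list can be passed directly to subprocess.Popen or subprocess.run without shell=True. Whether a detached session fits the way marshal test collects output is an open question that this sketch does not address.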
@abejgonzalez The PR should be working now, thanks for the patches
If you have the time, parallelizing the fullTests.py script would be nice
From what I can see at the moment, the parallelization of fulltest.py fails because of the way FireMarshal handles workdir. Because there's only one workdir, it's very easy to have conflicts between different jobs if they're run in parallel
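To illustrate the kind of isolation that would avoid these conflicts, here is a minimal sketch in which every parallel job gets its own temporary directory. It is an assumption-laden example, not FireMarshal's mechanism: run_isolated, fake_job, and the job names are hypothetical, and how marshal would actually be pointed at a per-job directory is not covered in this thread.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List


def run_isolated(jobs: List[str],
                 run_job: Callable[[str, Path], int],
                 max_workers: int = 4) -> Dict[str, int]:
    """Run each job in parallel, giving each its own scratch directory.

    `run_job` is a hypothetical callback that receives the job name and
    a private directory, and returns an exit code.
    """
    def _one(job: str) -> int:
        # Each job gets a fresh directory that is removed afterwards,
        # so two jobs can never write to the same path.
        with tempfile.TemporaryDirectory(prefix="job-") as scratch:
            return run_job(job, Path(scratch))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = list(pool.map(_one, jobs))
    return dict(zip(jobs, codes))


def fake_job(name: str, workdir: Path) -> int:
    """Stand-in for a real test run: writes one file into its workdir."""
    out = workdir / "out.txt"
    out.write_text(name)
    # Succeeds only if the file holds this job's own name.
    return 0 if out.read_text() == name else 1


if __name__ == "__main__":
    print(run_isolated(["job-a", "job-b", "job-c"], fake_job))
```

Every job writes to the same file name, out.txt, yet none can clobber another because the parent directories differ. A real version would replace fake_job with a call that launches the test for one workload.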
You can see the changes at https://github.com/firesim/FireMarshal/tree/use-conda-fix
Are we sure that this doesn't rebuild br-base for every single job (that shares the same br-base spec) (ditto for fedora)? I think that the current repo setup overbuilds things. Instead we should do the following:
Depending on the time I have in the next week I can take a stab at this / look into if things are properly cached.
In the interim, I'll just run the bare-metal tests (as a sanity check) while we figure out the situation w/ parallelization/extra-builds.
This adds basic CI testing for FireMarshal using the existing fullTests.py script. This uses the same CI machine as FireSim local FPGA support for now. This can be parallelized in the future.