errtrace shell option, 3rd attempt

momiji commented 5 months ago

Let's continue discussion here... There are still some tests and probably code fix regarding return and exit in err trap.

momiji commented 5 months ago

I think a summary of the state of this PR can be interesting, although I understand there are still some open topics you've raised and that are not closed yet.

Analysing the CI tests results, there's a very few issues remaining, most of them being not real, or related to other fixes we can work on.

test/runtime-errors.sh test-errexit-multiple-processes

Is due to the errors display, this can be fixed in another specific branch, I have an idea on how to fix it to have only one single error displayed, because I guess the idea is to use verbose_errexit to show only one single error, and not many of them. Currently 3 errors are displayed instead of only 1 :

  ls | false | wc -l
       ^~~~~
[ -c flag ]:1: errexit PID 354396: command.Simple failed with status 1
  ls | false | wc -l
  ^~
[ -c flag ]:1: errexit PID 354395: command.Simple failed with status 141
0
  ls | false | wc -l
       ^~~~~
[ -c flag ]:1: errexit PID 354392: command.Pipeline failed with status 1

The first because it's the original error (and the first one), the second should be hidden as it is a SIGPIPE, so we might know we're in sigpipe to prevent, or maybe always prevent sigpipe errors, don't know. The third because the result of the pipeline is in error.

The idea is while parsing the AST, to identify which parts are allowed to display errexit errors and which should not, and once the first error is displayed,, just stop displaying anything. This is the idea... Maybe add a field to the mem object, or in mutable_opts?

test/spec-py.sh ysh-all-serial

The NoForkLast change impacted the way the xtrace is displaying things, this is just the test to be updated to reflect the change.

test/spec.sh builtin-trap-bash -r 19 -v

This one is expected failing, but I think I got the reason why now : the "dbg 3" is printed on stdout, which is eaten by the pipeline of false | false | false. To verify this, just do trap 'echo dbg $LINENO > /dev/stderr' DEBUG and here comes the 3 dbg 3.

EDIT: in fact, to have the 3 dbg 3, we also need to keep it in the TrapState.ClearForSubProgram(), obviously :-) EDIT2: so we might want to identify among the 4 hooks, which ones are inherited like DEBUG, inherited if configured like errtrace, and which one survive shadowing, and then refactor the state.ctx_HideErrTrap to something more general for all hooks. EDIT3: I guess that they all behave the same as DEBUG, ERR being the only having an inherit flag. As it is also needed here, should we fix this issue (more simple issue) here too?

I guess bash and other shells are doing tricks to make the debug to go the the stdout of the pipeline and not the sub commands. Probably this another hidden issue we should also fix, as it may have other side effects not yet identified. A dedicated branch for this one too?

test/stateful.sh job-control-all -r 10

This one is the only remaining issue I think there is. I didn't started yet, as I was looking to other subjects today.

andychu commented 4 months ago

OK looks like test case 12 is failing because our line numbers are different

case: 12

[osh stdout]
Expected 'line=14\nnow with errtrace\nline=4\nline=10\nline=20\nok\n'
Got      'line=10\nnow with errtrace\nline=4\nline=10\nline=10\nok\n'

osh stdout:
line=10
now with errtrace
line=4
line=10
line=10
ok

Let me see if I can make the test insensitive to that detail

andychu commented 4 months ago

test case 10 is like this

Expected 'line=6\nsubshell\nline=12\nasync\nline=14\nok\n'
Got      'line=6\nsubshell\nline=10\nline=12\nasync\nline=14\nok\n'

osh stdout:
line=6
subshell
line=10
line=12
async
line=14
ok

I'm not sure if that's OK ... I might have to play with it

andychu commented 4 months ago

I split up spec/builtin-trap-err cases and merged master ... let's see the results

andychu commented 4 months ago

Hm I think we should do something like this -- it's a lot simpler -- just a single if statement

https://github.com/oilshell/oil/pull/2014/commits/9fe9fe1f1264718e5f630fbd54024bcbf6d7175f

But it made one of the new tests fail ... it was printing ERR7 2 times instead of 3 . So I reverted it

I think that test case could be split up -- since it's a bit hard to follow.

I guess is tickling something that my test case isn't, because mine still passes ... That's in spec/builtin-trap-err

andychu commented 4 months ago

Another reason to split it up is that we can add busybox ash, and see which ones match, like here:

http://op.oils.pub/github-jobs/7385/ovm-tarball.wwz/_tmp/spec/osh-py/builtin-trap-err.html

In this case you can only see the result vs. bash, and it's not clear which ones agree

http://op.oils.pub/github-jobs/7385/ovm-tarball.wwz/_tmp/spec/osh-py/builtin-trap-bash.html

there is significant disagreement among shells, so we try not to "just implement bash bugs"

I am not sure but I think the thing that failed when I simplified the code could be a bash quirk, not present in busybox ash?

We should try to use the simple if statement if possible, instead of another ctx_HideErrTrap

andychu commented 4 months ago

Hm I also think to avoid getting stuck, and to make things easier to review, there are least 2 changes here

add errtrace option, and fix the "function" behavior
fix the "inherit" behavior

or it could be the other way around:

add errtrace options, and fix the "inherit" behavior
and fix the "function" behavior

This are is VERY tricky, so it also helps to have separate commits , in case we need to bisect

I think it's good to do "everything at once" to explore -- that is what I often do, to make sure I have an understanding of the problem

But then I try to break up any "risky" changes separately, which is often useful for testing regressions, bisecting, etc.

_NoLastFork is also somewhat of a crosscutting issue -- I cc'd you on this thraed

https://oilshell.zulipchat.com/#narrow/stream/384942-language-design/topic/shopt.20--set.20noforklast

It affects not just the ERR trap, but also the RETURN trap. There are new bash bugs related to fork optimizations.

So that is one reason we don't "just implement bash" -- because bash has bugs and it changes substantially from version to version

We try to make an "executable spec", and in this case we can base it off bash AND busybox ash

andychu commented 4 months ago

So basically I expect that after this change we will have to look at any other issues caused by noforklast

running traps
shopt -s verbose_errexit -- sometimes we lose error messages because of the fork optimizations
I ran into a problem with Oils instrumentation - we are trying to track all commands run, and maybe get GC stats per process, etc.
We might want to implement set +m to disable job control (a POSIX feature) which came up the thread on Thorsten's blog post

there were some TODOs there

https://oilshell.zulipchat.com/#narrow/stream/266575-blog-ideas/topic/Hidden.20Optimizations.20in.20Unix.20Shell.2C.20Job.20Control.2C.20zed.2Fzsh.20bug

andychu commented 4 months ago

Merged this as

https://github.com/oils-for-unix/oils/commit/a2f0ae0f1ac0acff89f8ef2016dd9cfea0681765

and

https://github.com/oils-for-unix/oils/commit/d8572e6616641955b7adb8c0cb44f28485baec90

Any more manual testing and feedback is welcome - you can download a tarball here - http://travis-ci.oilshell.org/github-jobs/7588/

Thanks for diving deep into the code!

oils-for-unix / oils

errtrace shell option, 3rd attempt #2014