oilshell / oil

Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
http://www.oilshell.org/

Run ble.sh (2022) #1069

Open andychu opened 2 years ago

andychu commented 2 years ago

Continuation of #653

At the very least we should see if we can reproduce the results of running ble.sh unit tests in #762 on our new container-based continuous build

I'm also curious what functionality is missing

andychu commented 2 years ago

Just ran it again: http://travis-ci.oilshell.org/github-jobs/2022-01-07__05-34-01.wwz/_tmp/soil/logs/ble-test.txt

[section] util: 895/1011 (116 fail, 0 crash, 0 skip)
osh-0.9.6$ ^D
DONE

andychu commented 2 years ago

Hm from #762 we were at

[section] util: 913/1011 (98 fail, 0 crash, 0 skip)

Hm maybe we're on the wrong branch? I think there is a branch that is maybe not maintained

But still there is a regression

@akinomyoga Can you tell me if this osh branch is still maintained? Or should I try running the tests on master?

https://github.com/oilshell/oil/blob/master/test/ble.sh#L18

I revived this build we did in 2020! I haven't touched it since then but it still kinda works. Just curious if there have been any changes

andychu commented 2 years ago

Also continuing the thread from #257 , here is a page I drafted about help I would like to raise money to pay for

https://github.com/oilshell/oil/wiki/Compiler-Engineer-Job

Although now that I read it over it looks too detailed. The short summary is: Write a 10K line Python compiler in Python, and 10K line C++ runtime, and get a free shell :-)

(and get paid, although I'm still working on raising the funds)

I think this is a very fun project for the right person, but I think it will take full-time effort. (And remember it is over half done, but the code is messy because of the way we used MyPy, and probably needs a rewrite.)


I made this web page with line counts, although for some reason it makes things look bigger than they are ... it is a small amount of code, at least compared with GNU bash itself:

https://www.oilshell.org/release/0.9.6/pub/metrics.wwz/line-counts/for-translation.html

akinomyoga commented 2 years ago

https://github.com/oilshell/oil/issues/257#issuecomment-1007147087 by @andychu

So, actually, I guess, after a few hooks for the user input are provided by Oil, it is not so hard for me to adjust ble.sh (and probably to submit additional small fixes to Oil) so that it works as a line editor of Oil.

Oh really that would be very interesting -- do you have an idea of the missing features? Maybe we can discuss more on #653 or start a new issue for "ble.sh 2022 :)"

I don't remember the details after two years, but basically

  1. some alternative mechanism for bind -x, which is also related to the signal handling,
  2. read -t 0 or poll/select
  3. some mechanism of eval -g (#704) or some API for the save/restore the state of options, signal handling, etc.

I think there can be several approaches to 1 and 2:


oil-native is making slow progress -- as of the latest release there are 1131 tests passing, out of 1900 or so.

Hmm, OK. If we use ble.sh with the Python osh/oil, it might be a good demonstration of osh/oil, but I have to say that the response is too slow to be used as a line editor of daily use. Also, in my vague memory, the footprint of osh with ble.sh was of the order of a few or several hundred megabytes (but it might be different).

akinomyoga commented 2 years ago

@akinomyoga Can you tell me if this osh branch is still maintained? Or should I try running the tests on master?

Not maintained, meaning that it doesn't track the changes in the latest master. I think the osh branch is basically based on ble.sh from about the middle of 2020. The latest ble.sh has changed a lot since May 2020 and relies on additional Bash behavior that ble.sh didn't use in May 2020 and that Oil doesn't implement. For example, saved_func=$(declare -f func) / eval "$saved_func" (#647) are now used in important places, but I guess it will require much effort to support declare -f func in osh.
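
For readers unfamiliar with the pattern, here is a minimal sketch of saving and restoring a function through declare -f in Bash. The function name greet is made up for the demo, and this is not code taken from ble.sh:

```bash
#!/usr/bin/env bash
# "greet" is a hypothetical function used only for this demo.
greet() { echo "hello"; }

saved_func=$(declare -f greet)    # capture the definition as source text
greet() { echo "overridden"; }    # temporarily replace the function
overridden=$(greet)

eval "$saved_func"                # restore the saved definition
restored=$(greet)

echo "$overridden -> $restored"
```

Bash regenerates the text printed by declare -f from its parsed form, so comments in the original body are not kept; a shell that prints the original source instead would behave differently in that respect.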

andychu commented 2 years ago

Ah OK thanks for the info. Off the top of my head I think we have read -t 0 already, which does select().
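
As a rough illustration of what read -t 0 is used for, the sketch below polls a file descriptor without consuming any data. The helper name is_input_ready and the FIFO setup are assumptions for the demo, and it is written against Bash's documented behavior, not tested against OSH:

```bash
#!/usr/bin/env bash
# is_input_ready FD: hypothetical helper; succeeds if FD has data waiting.
# With a timeout of 0, read returns immediately without reading anything.
is_input_ready() {
  read -t 0 -u "$1"
}

fifo=$(mktemp -u)       # temporary path for the demo FIFO
mkfifo "$fifo"
exec 3<>"$fifo"         # open read-write so that opening does not block
rm -f "$fifo"

if is_input_ready 3; then before=ready; else before=empty; fi
echo hello >&3          # make some input available
if is_input_ready 3; then after=ready; else after=empty; fi

echo "before=$before after=$after"
exec 3>&-
```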

I don't think declare -f is too hard but I would have to think about it. (We might just print the source code rather than pretty-printing the AST like bash does.)

oil-native should make things faster and smaller! It should be almost as fast as bash to start, and it will use less memory, although I don't know exactly how much. And as mentioned I think we can easily optimize speed by 2x.

There is a rough idea of memory usage here:

https://www.oilshell.org/release/0.9.6/benchmarks.wwz/mycpp-examples/

e.g. on small examples the Python version uses 7.3 MB while the C++ version uses 3.4 MB. That seems like it's going in the right direction but probably not good enough. (hence why I am looking for help!)

How much memory does ble.sh use when run under bash?

I agree oil-native is the important part, and that's why I'm seeking help for it. In the meantime, if you have any feedback about Oil let me know!

andychu commented 2 years ago

Hm actually now I realize a good memory benchmark would be source ble.sh in bash and OSH, and compare memory usage. This will even work in oil-native now I think, since it just has to parse and load all the function bodies.

I suspect it will be a lot higher because Oil has a very detailed representation of the code, and it does eager parsing inside $(( )) and ${}, unlike bash. But it will be interesting to see how much. Also I'm sure it can be reduced 2x with optimization, although maybe it's 10x higher already ...

akinomyoga commented 2 years ago

Ah OK thanks for the info. Off the top of my head I think we have read -t 0 already, which does select().

Oh, OK, I have now confirmed that. I think I need to update the wiki page Running ble.sh With Oil · oilshell/oil Wiki.

I don't think declare -f is too hard but I would have to think about it.

That's nice to hear.

(We might just print the source code rather than pretty-printing the AST like bash does.)

Yeah, that is also one option.

I remember that the source of the functions in JavaScript which is obtained by .toString() is also the actual source code (instead of the one reconstructed from AST), where the code comments are also preserved. There was an interesting trick that one writes the documentation of JavaScript functions in the first comment in the function body so that it can be extracted from func.toString(). If one writes sufficient information in the comments in the function body, that could be used for some kind of reflection.

How much memory does ble.sh use when run under bash?

Actually, it depends on the Bash version. The memory use of ble.sh is larger for recent Bash. For example, the latest ble.sh uses about 50 MB in the latest Bash. Actually, this is one of the recent problems of ble.sh.

Hm actually now I realize a good memory benchmark would be source ble.sh in bash and OSH, and compare memory usage.

I have tried it.

Bash version   plain     ble.sh (2020-07)               ble.sh (2022-01)
                         only source / +module+attach   only source / +module+attach
Bash-3.2       3.6 MB    14 MB / 17 MB                  15 MB / 19 MB
Bash-4.3       3.8 MB    17 MB / 29 MB                  23 MB / 37 MB
Bash-5.1       4.3 MB    21 MB / 37 MB                  27 MB / 47 MB
osh-0.9.6      15 MB     76 MB / n/a                    90 MB / n/a

In the previous reply, I wrote from memory that osh used a few hundred megabytes, but that was wrong. (Actually, I had slightly modified ble.sh for osh to reduce the memory use; I think the footprint I remembered was from before that modification.) With ble.sh (2020-07) and only source, the memory increase of osh is roughly (76 - 15) / (21 - 4.3) = 3.653x that of Bash-5.1. If ble.sh is fully loaded and activated, I think the footprint of osh will surpass 100 MB: a simple extrapolation gives 15 + 3.653 * (47 - 4.3) = 171 MB.
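
The extrapolation can be reproduced with a few lines of shell. This is only a sketch of the arithmetic, using the osh-0.9.6 and Bash-5.1 figures quoted above; the function name estimate is made up:

```bash
#!/usr/bin/env bash
# estimate: hypothetical helper that redoes the arithmetic from the comment.
estimate() {
  awk 'BEGIN {
    # growth of osh relative to Bash-5.1, ble.sh (2020-07), only source
    ratio = (76 - 15) / (21 - 4.3)
    printf "ratio: %.3fx\n", ratio
    # apply the same ratio to the Bash-5.1 growth with +module+attach (2022-01)
    printf "extrapolated: %.0f MB\n", 15 + ratio * (47 - 4.3)
  }'
}

estimate
```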

andychu commented 2 years ago

Great, thank you for trying it! This is very helpful. I filed #1070 to automate this -- it will help with the C++ translation, which is the focus of the upcoming year.

Actually it's better than I thought -- osh-0.9.6 creates a huge number of Python classes, which you can see with osh -n myscript.

Python objects are very large:

$ python2
>>> import sys
>>> class C(object):
...   pass
...
>>> x = C()
>>> sys.getsizeof(x)
64
>>> sys.getsizeof(x.__dict__)
280

So the translation to C++ should reduce that by a lot -- a pointer will be 8 bytes, an integer will be 4 bytes, etc. It's statically laid out rather than dynamic with a __dict__.

I think we can get within 2x of bash on the first try, and then optimize it with enough effort.

andychu commented 2 years ago

Also Oil has something like the JavaScript feature you're talking about. It uses ### for doc comments:

http://www.oilshell.org/blog/2021/09/multiline.html#reminder-doc-comments-with

This works with traditional shell functions as well as Oil proc:

f() {
  ### my comment
  echo
}

pp proc f  # pretty print a table

You could use these in ble.sh right now because the syntax is compatible with bash. That is, you can use Oil as a documentation extractor if that proves useful.

Although now I see you use a ## as a prefix, and it has multiple lines. Hm let me think about this.

https://github.com/akinomyoga/ble.sh/blob/master/lib/core-complete.sh

What kind of metadata would you want to put in there?


Anyway, I welcome more feedback on Oil. I hope it will converge with ble.sh at some point, and we both agree C++ translation is important.

Also, I realized that compat_array which you implemented, to make ${a} equal to ${a[0]} should be the default (we should change it to strict_array and opt in). Even though that behavior is confusing, multiple shells agree on it (I think), and Oil has a separate array syntax that behaves more conventionally (const x = a[i], etc.).

This is #967

Also I remember you know quite a bit about C++ too, and I think the translation task could be pretty fun in that respect. Here is a listing of the generated code:

https://www.oilshell.org/release/0.9.6/pub/metrics.wwz/line-counts/oil-cpp.txt

   1366 _tmp/native-tar-test/oil-native-0.9.6/mycpp/gc_heap.h
   2321 _tmp/native-tar-test/oil-native-0.9.6/_build/cpp/runtime_asdl.cc
   4221 _tmp/native-tar-test/oil-native-0.9.6/_build/cpp/syntax_asdl.h
   9999 _tmp/native-tar-test/oil-native-0.9.6/_build/cpp/syntax_asdl.cc
  29060 _tmp/native-tar-test/oil-native-0.9.6/_devbuild/gen/osh-lex.h
  32805 _tmp/native-tar-test/oil-native-0.9.6/_build/cpp/osh_eval.cc
  93228 total

And this already passes 1131 out of 1900 spec tests! It can run those fibonacci and bubble_sort examples, etc. I think it should be able to source ble.sh, or at least the old copy. I will try that out.


Actually what is the reason for the n/a in the table? Is ble.sh now using a feature that OSH cannot source? Or was there a regression?

(I didn't look at why the unit test numbers went down, but I was a little surprised by that, since I don't know of any OSH regressions. It is very highly tested. I may dig into it more)

andychu commented 2 years ago

Thinking about the doc comments a little more, I think it wouldn't be that hard to support ## as well as ### and then add another column for pre_doc_comment in addition to post_doc_comment

osh$ f() {
> ### hi
> echo
> }
osh$ pp proc f
proc_name       doc_comment
f       hi

The strings are encoded in QSN format, so this can be parsed and used by a doc generator. I'm not sure if this is important for ble.sh, but just mentioning it in case. I guess it is probably not that hard to make a custom parser but I think it is nice if the language can do it for you.
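
To give an idea of what a custom parser might look like, here is a small sketch in Bash. It is not how Oil implements pp proc; the helper name extract_doc_comments is made up, it only recognizes plain identifiers (not ble.sh-style names containing / or #), and it only looks at the line directly after "name() {":

```bash
#!/usr/bin/env bash
# extract_doc_comments FILE: hypothetical helper, not part of Oil or ble.sh.
# Prints "name<TAB>comment" when the first body line of a function is "### ...".
extract_doc_comments() {
  local re_func='^([A-Za-z_][A-Za-z0-9_]*)\(\)[[:space:]]*\{[[:space:]]*$'
  local re_doc='^[[:space:]]*###[[:space:]]?(.*)$'
  local line name=
  while IFS= read -r line; do
    if [[ -n $name && $line =~ $re_doc ]]; then
      printf '%s\t%s\n' "$name" "${BASH_REMATCH[1]}"
    fi
    # Remember the function name only for the very next line.
    if [[ $line =~ $re_func ]]; then
      name=${BASH_REMATCH[1]}
    else
      name=
    fi
  done < "$1"
}

demo=$(mktemp)
cat > "$demo" <<'EOF'
f() {
  ### my comment
  echo
}
EOF
result=$(extract_doc_comments "$demo")
rm -f "$demo"
echo "$result"
```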

akinomyoga commented 2 years ago

Also Oil has something like the JavaScript feature you're talking about. It uses ### for doc comments:

Oh, pp proc f looks very useful although I currently don't have a plan to use them from ble.sh itself. I just remember some use cases in JavaScript. The purpose of document comments in ble.sh (starting from ##) is purely notes for me (at least currently).

What kind of metadata would you want to put in there?

I usually put 1) what arguments the shell function accepts, 2) what variables the shell function accesses through dynamic scoping, and 3) the description of the exit status, along with the details of each data format, etc. I actually don't think the short description of the function is so important because I feel that that information should usually be represented by the function name.

Actually what is the reason for the n/a in the table? Is ble.sh now using a feature that OSH cannot source? Or was there a regression?

"n/a" for +module+attach: Yes, there are still some syntaxes that OSH fails to parse in modules. It's not a regression because I was already aware of it back then. These were related to or relying on the module for the syntax analysis that is designed for Bash syntax but not for Osh syntax, so I didn't make much effort to find out the causes and to report it here at that time.

"n/a" for ble.sh (2022-01): Just because I haven't tried it yet. First of all, ble.sh cannot be loaded in osh without modifications and there were already many modifications to the osh branch of ble.sh in 2020-07. I first need to rebase these changes to the latest version of ble.sh. In addition, the codebase is largely changed so I guess there will be again many new problems (just as in #653). I did know that there are still many errors in parsing other modules of ble.sh (including syntax analysis), so I don't think all the incompatibility of osh is solved by the discussion in #653. I haven't tried it, but I anticipate that we need efforts to make the latest version of ble.sh loadable in osh.

(I didn't look at why the unit test numbers went down, but I was a little surprised by that, since I don't know of any OSH regressions. It is very highly tested. I may dig into it more)

I'm not sure what you mean by regression, but any behavioral changes could affect the results. It's not surprising to me that the changes to osh behavior affect the result of tests.

Thinking about the doc comments a little more, I think it wouldn't be that hard to support ## as well as ### and then add another column for pre_doc_comment in addition to post_doc_comment

I actually don't use the document comments from inside the script. Maybe I do some static analysis in the future, but I don't think I will do it by the script but would rather write a program in C++ or other languages.

akinomyoga commented 2 years ago

ble.sh (2022-01) for osh

I have finished the rebasing of the branch osh on the latest master.

Test results

Test summary in osh-0.9.6. The full log is here: blesh-test.txt [ Note: this text file contains escape sequences for color codes, so I recommend opening it with less -r in a terminal ].

[section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
[section] ble/util: 985/1193 (208 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (relative:confine:measure-bbox): 0/0 (0 fail, 0 crash, 17 skip)
[section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)

For comparison, this is the expected result generated by Bash:

[section] ble/main: 19/19 (0 fail, 0 crash, 0 skip)
[section] ble/util: 1193/1193 (0 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (relative:confine:measure-bbox): 17/17 (0 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (cfuncs): 18/18 (0 fail, 0 crash, 0 skip)
[section] ble/canvas/trace (justify): 24/24 (0 fail, 0 crash, 0 skip)
[section] ble/canvas/trace-text: 11/11 (0 fail, 0 crash, 0 skip)
[section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
[section] ble/unicode/GraphemeCluster/c2break: 72/72 (0 fail, 0 crash, 0 skip)
[section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 3251/3251 (0 fail, 0 crash, 0 skip)
[section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)

andychu commented 2 years ago

Great thank you for testing it! It's good to see some progress with 985/1193 vs. 895/1011.

I'm not sure what happened with read -e, but I think we can easily implement it.

90 MB doesn't seem too bad considering how large the Python objects are. I'm looking forward to testing out oil-native in #1070. I'll let you know when we do that.


Although I also wonder what method you use to get 3.8 MB "plain" for bash 4.3 ?

I'm trying to compare plain bash vs. OSH here too. The way I do it is $sh -c 'sleep 0.001; cat /proc/$$/status' and then look at VmRSS and VmPeak.

https://github.com/oilshell/oil/blob/master/benchmarks/vm-baseline.sh#L39

Actually bash's peak seems to be higher than OSH which seems wrong because of the Python issue, but I can't see any bug in the benchmark. (This chart doesn't include oil-native.)

https://www.oilshell.org/release/0.9.0/benchmarks.wwz/vm-baseline/

This seems reasonable but maybe there is a better way, and a way to make our measurements more consistent.

andychu commented 2 years ago

Wow, I found the bug in the benchmark !!!

https://github.com/oilshell/benchmark-data/tree/master/vm-baseline/spring.2021-12-29__22-16-29

Notice that the process name is actually cat for OSH and zsh. But not for bash. This is because I optimized OSH to exec the last process, instead of fork + exec.

(Shells do this to varying degrees; I noticed yash is very optimal in this regard)

If I run this, it shows that bash creates 3 different processes for that command.

sh=bash; strace -o _tmp/$sh -ff -- $sh -c 'sleep 0.1; cat /proc/$$/status'; ls _tmp/$sh.*

But if you run that with sh=zsh or sh=osh, then only 2 processes are created. So we "lose" the shell process image and are instead measuring the cat process image, which has the same PID as the shell!

OK I will fix this ...
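
One possible way around this (a sketch only, not necessarily the fix that went into benchmarks/vm-baseline.sh) is to read /proc/$$/status with builtins, so the shell never execs another program in its place. The helper name shell_mem_status is made up, and this is Linux-only:

```bash
#!/usr/bin/env bash
# shell_mem_status SHELL: hypothetical helper (Linux only).
# The inner script uses only builtins, so the process we inspect stays the
# shell itself; the Name field lets us double-check that.
shell_mem_status() {
  "$1" -c '
    while read -r key value unit; do
      case $key in
        Name:|VmPeak:|VmRSS:) echo "$key $value${unit:+ $unit}" ;;
      esac
    done < /proc/$$/status
  '
}

status=$(shell_mem_status bash)
echo "$status"
```

If the Name line shows cat (or anything other than the shell), the measurement is looking at the wrong process image.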

andychu commented 2 years ago

Yup in this old release before OSH was optimized, you can see the memory is correctly measured ... (but zsh is still incorrectly measured, since it is optimized)

https://www.oilshell.org/release/0.7.0/benchmarks.wwz/vm-baseline/

Compared to the latest one:

https://www.oilshell.org/release/0.9.6/benchmarks.wwz/vm-baseline/

akinomyoga commented 2 years ago

38. Regression ${var@a} with compat_array

I'm not sure what happened with read -e, but I think we can easily implement it.

Oh, sorry for my confusing writing. read -e is actually not about builtin read but a function read implemented by ble.sh which emulates the behavior of read -e of Readline.

In osh-0.8.3 0.8.pre5, the emulation of read -e by ble.sh (2020-07) worked. In osh-0.9.6, the emulation of read -e by ble.sh (both 2020-07 and 2022-01 versions) fails in its initialization stage.

I've checked what causes the regression. I'm not sure if this is the only regression, but at least ${var@a} introduced in #690 now doesn't work when compat_array is turned on. See the following example

$ osh
osh-0.9.6$ declare -A alpha=(['1']=2)
osh-0.9.6$ echo ${alpha@a}
A
osh-0.9.6$ shopt -s compat_array
osh-0.9.6$ echo ${alpha@a}

osh-0.9.6$

The way that I used to measure the footprint

Although I also wonder what method you use to get 3.8 MB "plain" for bash 4.3 ?

I'm measuring it for an interactive Bash process, which means that 3.8 MB also includes the memory used by GNU Readline which is not loaded in non-interactive Bash processes. Since the history size also affects the footprint, I specify an empty file for the history file. I'm looking at RSS because that is what I care about when I have many interactive Bash processes in a terminal multiplexer. In this regard, I haven't cared about an instantaneous peak of the memory use.

$ HISTFILE=empty.txt bash-4.3 --norc
$ ps -o rss $$
  RSS
 3760

andychu commented 2 years ago

Ah thank you, I just fixed that bug! It was introduced during a refactoring.

Thanks for your help -- I think having the ble.sh tests run will be useful, and the memory benchmark as mentioned.

I'm working on publicizing this "compiler engineer" position to accelerate the project:

https://github.com/oilshell/oil/wiki/Compiler-Engineer-Job

I spent a lot of time debugging the C++ runtime, and it's about half done, but needs more hands on it !

andychu commented 2 years ago

Just making a note about running ble tests in our CI:

ble.sh: Insane environment: $USER is empty.
ble.sh: modified USER=uke
  ble/function#suppress-stderr:ble/util/is-stdin-ready
  ^~~~~~~~~~~~
[ eval word at line 4022 of 'out/ble.osh' ]:1: 'ble/function#suppress-stderr:ble/util/is-stdin-ready' not found
  exec 30>&- 30< /dev/tty
             ^~~
[ eval word at line 4403 of 'out/ble.osh' ]:1: Can't open '/dev/tty': No such device or address
Error closing descriptor 30: Bad file descriptor
osh I/O error: Bad file descriptor

It seems like there is a problem with our container; it doesn't like accessing /dev/tty as a non-root user? Not sure how that is usually set up.

$ soil/images.sh cmd ovm-tarball bash -c 'exec 30>&- 30< /dev/tty; echo hi'
bash: /dev/tty: No such device or address
hi

On my host machine, it's not in the root group? It's in the tty group?

$ soil/images.sh cmd ovm-tarball bash -c 'ls -l /dev/tty'
crw-rw-rw- 1 root root 5, 0 Jan 20 07:21 /dev/tty
andy@lenny ~/git/oilshell/oil (dev/andy-31)$ ls -l /dev/tty
crw-rw-rw- 1 root tty 5, 0 Jan 20 02:18 /dev/tty

akinomyoga commented 2 years ago

  ble/function#suppress-stderr:ble/util/is-stdin-ready
  ^~~~~~~~~~~~
[ eval word at line 4022 of 'out/ble.osh' ]:1: 'ble/function#suppress-stderr:ble/util/is-stdin-ready' not found

This error is caused by declare -f not outputting the definition of the function. There are many other places in the current ble.sh that use declare -f. I don't want to write workarounds for every place that uses declare -f. Actually, I'm currently waiting for declare -f to be implemented.


  exec 30>&- 30< /dev/tty
             ^~~
[ eval word at line 4403 of 'out/ble.osh' ]:1: Can't open '/dev/tty': No such device or address
Error closing descriptor 30: Bad file descriptor
osh I/O error: Bad file descriptor

It seems like there is a problem with our container; it doesn't like accessing /dev/tty as a non-root user? Not sure how that is usually set up.

I don't think the permission of /dev/tty is related. /dev/tty is only available when a TTY/PTY is allocated to the process-group session. I don't know how the CI tests of oil are run, but if it is performed in a Docker container, it seems there is an option -t for docker run or docker exec.

Name, shorthand    Default    Description
--tty, -t                     Allocate a pseudo-TTY
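
A test script can also check up front whether a controlling terminal is available before it tries something like exec 30< /dev/tty. The following is a minimal sketch; the helper name have_tty is made up and is not part of ble.sh or Oil:

```bash
#!/usr/bin/env bash
# have_tty: hypothetical helper; succeeds only if /dev/tty can be opened.
# The open is attempted in a subshell so that a failure cannot affect the caller.
have_tty() {
  ( exec < /dev/tty ) 2>/dev/null
}

if have_tty; then
  tty_state=available
else
  tty_state=missing    # e.g. inside a container started without -t
fi
echo "controlling TTY: $tty_state"
```
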
andychu commented 2 years ago

Oh great point, thank you! I am just starting to use Docker/podman.

With the -t flag, the tests run, and the result looks like this, which looks about right (although I wonder why ble/decode says -1 skip):

    [section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
    [section] ble/util: 999/1193 (194 fail, 0 crash, 0 skip)
    [section] ble/canvas/trace (relative:confine:measure-bbox): 0/5 (5 fail, 0 crash, 12 skip)
    [section] ble/canvas/trace (cfuncs): 0/0 (0 fail, 0 crash, 18 skip)
    [section] ble/decode: 33/33 (0 fail, 0 crash, -1 skip)

akinomyoga commented 1 year ago

I decided to assign a sequential number continuing from #653 (I renumbered the above regression for ${var@a}). I think I might be going to raise many issues again, but they can easily be forgotten if I directly write them in the threads or new conversations of Zulip. It is easier for me to manage the list here. If you would like to discuss them on Zulip, we can leave the link to the Zulip conversation in this issue.

39. NYI: EPOCHREALTIME, SECONDS, and other dynamic variables

ble.sh uses them. I believe they are not so difficult to support. We can add a new if branch in Mem.GetValue (core/state.py). There are also EPOCHSECONDS, RANDOM, and SRANDOM.

An implementation of RANDOM seems to have been attempted once in #253 (which also mentions SECONDS), but I don't think that approach is correct. The approach in #253 seems to add a new grammar for $RANDOM just like $*, $#, etc., but the parsing rule for these special variable names is exactly the same as for normal variables. There are already existing dynamic variables such as FUNCNAME, so we should consider implementing them the same way as FUNCNAME, etc.
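
For reference, this is how two of these variables behave in Bash: they are parsed like ordinary variables, but their value is computed at each expansion, and assigning to them has a special effect. This is a small sketch of Bash's behavior, not of what OSH currently does:

```bash
#!/usr/bin/env bash
# RANDOM: every expansion yields a new number; assigning to it reseeds the
# generator, so the same seed gives the same next value.
RANDOM=42; first=$RANDOM
RANDOM=42; second=$RANDOM
echo "same seed, same value: $first $second"

# SECONDS: counts up from the value last assigned to it.
SECONDS=0
sleep 1
elapsed=$SECONDS
echo "elapsed: ${elapsed}s"
```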

akinomyoga commented 1 year ago

40. BUG: a=(declare v); "${a[@]}" fails

An associative array cannot be initialized by "${a[@]}", where a contains the command to initialize it.

$ bash -c 'a=(declare -A assoc); "${a[@]}"; declare -p assoc'
declare -A assoc
$ osh -c 'a=(declare -A assoc); "${a[@]}"; declare -p assoc'
  a=(declare -A assoc); "${a[@]}"; declare -p assoc
                        ^
[ -c flag ]:1: Can't run assignment builtin recursively

It doesn't even work for a scalar declaration.

$ bash -c 'a=(declare v); "${a[@]}"; declare -p v'
declare -- v
$ osh -c 'a=(declare v); "${a[@]}"; declare -p v'
  a=(declare v); "${a[@]}"; declare -p v
                 ^
[ -c flag ]:1: Can't run assignment builtin recursively

Preferably, it should also work for array assignments, etc.

$ bash -c 'a=(declare -a "arr=(1 2 3)"); "${a[@]}"; declare -p arr'
declare -a arr=([0]="1" [1]="2" [2]="3")

akinomyoga commented 1 year ago

41. BUG: [[ -c /dev/null ]] fails in C++ version

As I have already written in the Zulip conversation at the following link, it fails with an assertion error. https://oilshell.zulipchat.com/#narrow/stream/121539-oil-dev/topic/ble.2Esh.20Testing

$ osh -c '[[ -c /dev/zero ]]'
osh: cpp/osh.cc:129: bool bool_stat::DoUnaryOp(id_kind_asdl::Id_t, Str*): Assertion `false' failed.

With some workarounds, I could run the tests. I attach the full log: test-util.txt. The test summary reads

 84.2% [section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
 85.9% [section] ble/util: 1056/1228 (172 fail, 0 crash, 6 skip)
100.0% [section] ble/canvas/trace (relative:confine:measure-bbox): 5/5 (0 fail, 0 crash, 12 skip)
100.0% [section] ble/canvas/trace (cfuncs): 13/13 (0 fail, 0 crash, 5 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)

The expected results with Bash are as follows. Some of the tests for ble/canvas are not even run above:

$ bash --rcfile bashrc.test-util
100.0% [section] ble/main: 19/19 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/util: 1228/1228 (0 fail, 0 crash, 6 skip)
100.0% [section] ble/canvas/trace (relative:confine:measure-bbox): 5/5 (0 fail, 0 crash, 12 skip)
100.0% [section] ble/canvas/trace (cfuncs): 17/17 (0 fail, 0 crash, 1 skip)
100.0% [section] ble/canvas/trace (justify): 2/2 (0 fail, 0 crash, 28 skip)
100.0% [section] ble/canvas/trace-text: 11/11 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/unicode/GraphemeCluster/c2break: 77/77 (0 fail, 0 crash, 0 skip)
---.-% [section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 0/0 (0 fail, 0 crash, 3251 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)

akinomyoga commented 1 year ago

42. BUG: osh -c 'read -d :' fails in the C++ osh (not in the Python osh)

# With the Python version

$ osh -c 'read -d :'
^C        <--- starts reading stdin. I killed it with C-c (SIGINT)

# With the C++ version

$ osh -c 'read -d :'
osh: /home/murase/.mwg/git/oilshell/oil/cpp/core.h:100: pyos::TermState::TermState(int, int): Assertion `0' failed.

This is likely to be another translation issue.

43. BUG: shopt -u expand_aliases fails in scripts sourced with arguments

The problem doesn't arise when the script is sourced without arguments. It happens only when the script is sourced with at least one argument.

$ cat test-t4-s1.sh
#!/usr/bin/env bash

shopt -u expand_aliases
$ cat test-t4.sh
#!/usr/bin/env bash

echo '<A>'
source test-t4-s1.sh
echo '</A>'

echo '<B>'
source test-t4-s1.sh
echo '</B>'

echo '<C>'
source test-t4-s1.sh a
echo '</C>'

echo '<D>'
shopt -u expand_aliases
echo '</D>'

echo '<E>'
set -- a
shopt -u expand_aliases
echo '</E>'

$ osh test-t4.sh
<A>
</A>
<B>
</B>
<C>
  shopt -u expand_aliases
  ^~~~~
test-t4-s1.sh:3: fatal: Syntax options must be set at the top level (outside any function)
</C>
<D>
</D>
<E>
</E>

I'm not sure what is the expected behavior since osh seems to try to change the behavior of Bash here, but at least the current behavior doesn't match with the error message, and it seems also inconsistent that the behavior changes depending on the number of arguments.


The ble.sh implementation of read -e (i.e. ble/builtin/read -e) still doesn't work.

Full tests running

Instead, the full tests are now running. The full tests include the tests for the Bash-parser module and the completion module. Actually, three years ago, osh couldn't parse the Bash-parser module, but I didn't report the problems because the "Bash"-parser module doesn't seem to be useful in Osh as not being fully compatible with the "Osh/Oil" syntax. The related tests were also excluded then. Now I tried to run the full tests and realized that osh can now parse the two modules (17k LoC in total) without any problems and workarounds! This is the test summary.

# Result of osh out/ble.osh --test

 84.2% [section] ble/main: 16/19 (3 fail, 0 crash, 0 skip)
 89.3% [section] ble/util: 1078/1206 (128 fail, 0 crash, 28 skip)
 94.1% [section] ble/canvas/trace (relative:confine:measure-bbox): 16/17 (1 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (cfuncs): 18/18 (0 fail, 0 crash, 0 skip)
 93.5% [section] ble/canvas/trace (justify): 29/31 (2 fail, 0 crash, -1 skip)
 36.3% [section] ble/canvas/trace-text: 4/11 (7 fail, 0 crash, 0 skip)
100.0% [section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
 71.4% [section] ble/unicode/GraphemeCluster/c2break: 55/77 (22 fail, 0 crash, 0 skip)
 76.3% [section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 2481/3251 (770 fail, 0 crash, 0 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/edit: 2/2 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/syntax: 22/22 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/complete: 7/7 (0 fail, 0 crash, 0 skip)

For comparison, here are the results in Bash:

# Result of bash out/ble.sh --test

100.0% [section] ble/main: 19/19 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/util: 1228/1228 (0 fail, 0 crash, 6 skip)
100.0% [section] ble/canvas/trace (relative:confine:measure-bbox): 17/17 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (cfuncs): 18/18 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace (justify): 30/30 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/canvas/trace-text: 11/11 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/textmap#update: 5/5 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/unicode/GraphemeCluster/c2break: 77/77 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/unicode/GraphemeCluster/c2break (GraphemeBreakTest.txt): 3251/3251 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/decode: 33/33 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/edit: 2/2 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/syntax: 22/22 (0 fail, 0 crash, 0 skip)
100.0% [section] ble/complete: 7/7 (0 fail, 0 crash, 0 skip)

Speed: 30-40x slower than Bash

I ran those tests on an Ubuntu 20.04 LTS virtual machine with 2 GB of RAM assigned. I noticed that the C++ version of osh is still slower than Bash by an order of magnitude. While the tests take 12 sec in Bash, the same tests take 8 min in the C++ version of osh, which is forty times slower. The time roughly doubles in the Python version of osh. This means that even though the C++ version halves the execution time compared to the Python version, it is still much slower than Bash.

The speed of the C++ version of osh seems to be much closer to that of the Python version of osh than to that of Bash. Though optimization of the GC seems to be underway, is the cause of this large performance gap known? I'm not sure how the translation to C++ is designed, nor have I checked the generated C++ code, but I hardly think it is just an issue of the GC. In other words, it is unlikely that the GC occupies 97.5% (= 39/40) of the total execution time. This is just my random thinking without looking at any code, but I suspect that the translated version inherits some aspects of the object/execution model of Python, e.g., an object is generated for every single operation, etc.
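
The figures above follow from simple arithmetic, sketched here in shell. The variable names are made up, and the only inputs are the two timings quoted above (12 sec and 8 min):

```bash
#!/usr/bin/env bash
bash_sec=12                  # test run time in Bash
osh_sec=$((8 * 60))          # test run time in the C++ osh
slowdown=$((osh_sec / bash_sec))

# If the "Bash-equivalent" work takes 1/slowdown of the time, the rest is
# overhead that the GC alone would have to account for.
overhead_pct=$(awk -v s="$slowdown" 'BEGIN { printf "%.1f", 100 * (s - 1) / s }')

echo "slowdown: ${slowdown}x, overhead share: ${overhead_pct}%"
```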

Footprint: 10x larger than Bash

I also checked the memory footprint with oshrc.test-util, though I haven't measured it for the full tests. Bash uses about 21MB, but the C++ version of osh uses about 230MB (with some data cached on disk). When there is no cache on disk, ble.sh runs an additional initialization during which osh consumes an extra 90MB (i.e., about 320MB in total). Bash's footprint, on the other hand, doesn't seem to change with the additional initialization.

Since osh seems to use a GC, the footprint difference could be understandable: I can imagine that the GC wouldn't collect garbage at all as long as there is sufficient free space.
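One way to take such a footprint measurement on Linux is to read the peak resident set size from /proc. This is a sketch only, not necessarily how the numbers above were obtained; the function name `peak_rss_kb` is made up, and it assumes /proc is mounted and exposes the `VmHWM` field.

```shell
# Print the peak resident set size (VmHWM, in kB) of a process on Linux.
# Defaults to the current shell when no PID is given.
peak_rss_kb() {
  local pid=${1:-$$}
  awk '/^VmHWM:/ { print $2 }' "/proc/$pid/status"
}

peak_rss_kb
```

Calling it at the end of an rcfile under each shell would give comparable numbers for the same workload.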

andychu commented 1 year ago

OK wow, this is very useful, thanks for all the testing! I will file separate bugs for many of these and post updates here


On performance, it doesn't match what we're seeing, but I guess it might not be surprising if ble.sh hits some pathological case in OSH

One issue is that we don't have real dict lookups yet -- dictionaries do linear searches. These are dictionaries at the OSH INTERPRETER level, not just bash assoc arrays
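To illustrate the difference in lookup strategy, here is an analogy written in bash itself. It is not the OSH runtime code, which lives at the interpreter level; the array and function names are invented for the example, and bash 4 or later is assumed for associative arrays.

```shell
#!/usr/bin/env bash
# Analogy only: the real dictionaries in question are inside the interpreter.

# Linear search: keys and values in parallel indexed arrays, O(n) per lookup.
keys=(alpha beta gamma)
vals=(1 2 3)
lookup_linear() {
  local want=$1 i
  for i in "${!keys[@]}"; do
    if [[ ${keys[i]} == "$want" ]]; then
      echo "${vals[i]}"
      return 0
    fi
  done
  return 1
}

# Hashed lookup: a bash associative array, roughly constant time per lookup.
declare -A table=([alpha]=1 [beta]=2 [gamma]=3)
lookup_hashed() {
  local want=$1
  [[ ${table[$want]+set} ]] || return 1
  echo "${table[$want]}"
}

lookup_linear beta
lookup_hashed gamma
```

With three keys the difference is invisible; it only matters once the number of keys grows into the thousands.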

It seems like we should be able to handle the equivalent of a 21 MB heap in bash with our GC, so something is wrong


Here are some benchmarks:

On I/O-bound workloads like configure, we take 1.0x to 1.5x the time and 1.0x to 2.2x the memory

https://www.oilshell.org/release/0.15.0/benchmarks.wwz/osh-runtime/

On CPU-bound workloads we're seeing 2-3x slower, e.g. Fibonacci runs in 109 ms with bash and 234 ms with OSH

https://www.oilshell.org/release/0.15.0/benchmarks.wwz/compute/
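For a sense of what a CPU-bound shell workload looks like, here is a small iterative Fibonacci. It is a sketch in the spirit of the benchmark mentioned above, not the actual benchmark script, and the function name `fib` is an assumption.

```shell
# Iterative Fibonacci: pure arithmetic and looping, no external processes,
# so the run time is dominated by the interpreter itself.
fib() {
  local n=$1 a=0 b=1 t i=0
  while [ "$i" -lt "$n" ]; do
    t=$(( a + b ))
    a=$b
    b=$t
    i=$(( i + 1 ))
  done
  echo "$a"
}

fib 10
```

Saving this to a file and running it under `time` with each shell gives the kind of comparison quoted above.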


So I don't know why there is a big discrepancy, but we have many tools for profiling, so we can certainly figure it out

I'll follow up on each of these issues, any help is appreciated :) Again this is very useful, thank you!

andychu commented 1 year ago

FYI I ran ble.sh with OSH at head, with some optimizations I mentioned, following the instructions on the wiki

https://github.com/oilshell/oil/wiki/Running-ble.sh-With-Oil#try-read--e-by-blesh

And I get the 33/33 decode successes, although many other failures.

Let me try with the other commands you posted

$ time ../oil/_bin/cxx-opt/osh --rcfile oshrc.test-util 

...

[ var ? at line 1 of [ array place in [ eval word at line 2833 of 'out/ble.osh' ] ] ]:1: Unexpected token while parsing arithmetic: '.'
  x/home/andy/git/oilshell/ble.osh/out/contrib
                              ^
[ var ? at line 1 of [ array place in [ eval word at line 2833 of 'out/ble.osh' ] ] ]:1: fatal: Parse error in recursive arithmetic
  x/home/andy/git/oilshell/ble.osh/out/lib
                              ^
[ var ? at line 1 of [ array place in [ eval word at line 2833 of 'out/ble.osh' ] ] ]:1: Unexpected token while parsing arithmetic: '.'
  x/home/andy/git/oilshell/ble.osh/out/lib
                              ^
[ var ? at line 1 of [ array place in [ eval word at line 2833 of 'out/ble.osh' ] ] ]:1: fatal: Parse error in recursive arithmetic
[section] ble/canvas/trace (cfuncs): 0/0 (0 fail, 0 crash, 18 skip)
[section] ble/decode: 33/33 (0 fail, 0 crash, -1 skip)

real    0m34.964s
user    0m10.590s
sys     0m4.266s
andychu commented 1 year ago

While the tests take 12 sec in Bash, the same tests take 8 min in the C++ version of osh. It is forty times slower

Actually I ran this same command

time ../oil/_bin/cxx-opt/osh out/ble.osh --test

in 36 seconds?

This is with HEAD, where I did optimize something.

But I highly doubt that Oils 0.15.0 was that much slower. Let me try it and see

One suspicion I had was that the tests were sometimes running the Python version of OSH? Sometimes if you exec the shell again, things get confused. We had that bug in our benchmarks once
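A quick way to rule that out on Linux is to print the binary backing the running shell from inside the test run. This is only a sketch: the function name is made up, and it assumes /proc is mounted.

```shell
# Print the path of the executable behind the current shell process.
# Linux only: relies on the /proc/<pid>/exe symlink.
current_shell_binary() {
  readlink "/proc/$$/exe"
}

current_shell_binary
```

Anything other than the expected `_bin/cxx-opt/osh` path would suggest that a different interpreter took over after a re-exec.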



andy@lenny:~/git/oilshell/ble.osh$ time ../oil/_bin/cxx-opt/osh out/ble.osh --test

[ var ? at line 1 of [ array place in [ eval word at line 2833 of 'out/ble.osh' ] ] ]:1: Unexpected token while parsing arithmetic: '.'
  x/home/andy/git/oilshell/ble.osh/out/contrib
                              ^
[ var ? at line 1 of [ array place in [ eval word at line 2833 of 'out/ble.osh' ] ] ]:1: fatal: Parse error in recursive arithmetic
  x/home/andy/git/oilshell/ble.osh/out/lib
                              ^
[ var ? at line 1 of [ array place in [ eval word at line 2833 of 'out/ble.osh' ] ] ]:1: Unexpected token while parsing arithmetic: '.'
  x/home/andy/git/oilshell/ble.osh/out/lib
                              ^
[ var ? at line 1 of [ array place in [ eval word at line 2833 of 'out/ble.osh' ] ] ]:1: fatal: Parse error in recursive arithmetic
[section] ble/canvas/trace (cfuncs): 0/0 (0 fail, 0 crash, 18 skip)
[section] ble/decode: 33/33 (0 fail, 0 crash, -1 skip)

real    0m36.721s
user    0m12.457s
sys     0m4.121s
akinomyoga commented 1 year ago

One issue is that we don't have real dict lookups yet -- dictionaries do linear searches. These are dictionaries at the OSH INTERPRETER level, not just bash assoc arrays

Ah, this could be the reason. If I understand the above correctly, are they the dictionaries for the variable namespace and the function namespace? If so, note that ble.sh defines about 5000 global variables and 3000 functions. If name lookup for variables and functions is performed by linear search, it would slow down the execution extremely.
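A micro-benchmark along these lines could test the hypothesis. It is a sketch under assumptions: the function names and the repetition count are arbitrary, and whether it reproduces the slowdown is not known.

```shell
# Define N do-nothing functions named f_1 .. f_N (names are illustrative).
define_functions() {
  local n=$1 i=1
  while [ "$i" -le "$n" ]; do
    eval "f_$i() { :; }"
    i=$(( i + 1 ))
  done
}

# Call one function repeatedly so that name lookup dominates the run time.
call_repeatedly() {
  local name=$1 reps=$2 i=0
  while [ "$i" -lt "$reps" ]; do
    "$name"
    i=$(( i + 1 ))
  done
}

define_functions 3000          # roughly the number of functions in ble.sh
call_repeatedly f_3000 10000
```

If the wall-clock time under a given shell grows with the argument to `define_functions` while the number of calls stays fixed, lookup cost depends on the size of the namespace.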

For the footprint, I initially thought about the memory use of the runtime objects, but another possibility is the size of the program in memory. I think you were talking about something called a lossless syntax tree. If that retains the full information about the original source code, it might also require a larger footprint.

FYI I ran ble.sh with OSH at head, with some optimizations I mentioned, following the instructions on the wiki

https://github.com/oilshell/oil/wiki/Running-ble.sh-With-Oil#try-read--e-by-blesh

And I get the 33/33 decode successes, although many other failures.

Let me try with the other commands you posted

Ah, sorry, I haven't yet pushed the patched version. I'll push it.

andychu commented 1 year ago

Oh yes I forgot that we can't run the tests with 0.15.0

$ time ~/src/oils-for-unix-0.15.0/_bin/cxx-opt-sh/osh  out/ble.osh --test
  ble/function#suppress-stderr:ble/util/is-stdin-ready
  ^~~~~~~~~~~~
[ eval word at line 4022 of 'out/ble.osh' ]:1: 'ble/function#suppress-stderr:ble/util/is-stdin-ready' not found
osh: cpp/osh.cc:129: bool bool_stat::DoUnaryOp(id_kind_asdl::Id_t, Str*): Assertion `false' failed.
Aborted (core dumped)

In any case I hope you can get all the runs in ~36 seconds or so now, regardless of what happened in the past :) I am not sure what happened



I mentioned on Zulip how to get a tarball from every commit

This is a good one:

http://travis-ci.oilshell.org/github-jobs/4014/

http://travis-ci.oilshell.org/github-jobs/4014/cpp-small.wwz/_release/oils-for-unix.tar

akinomyoga commented 1 year ago

I have now force-pushed the patched version of ble.sh (ble/builtin/read -e still doesn't work, which was the reason I hadn't pushed it yet). Since I force-pushed, you need to forcibly reset the branch as described in Update ble.osh.

andychu commented 1 year ago

OK, I ran that one in 1 minute and 3 seconds? Maybe we can move the performance discussion to Zulip, because I don't want to "drown out" all the great issues you posted above

https://oilshell.zulipchat.com/#narrow/stream/121539-oil-dev/topic/ble.2Esh.20Performance

So we can use this thread for bugs, and then use Zulip for figuring out why we are getting a performance difference

I couldn't get the same command to run with bash. It would be helpful if I could run the ble.sh suite with both bash and OSH, and then put it in the Oil CI so we can monitor the results
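A minimal timing wrapper for such a side-by-side run might look like the following. It is a sketch: the function name `elapsed_seconds` and the `BLE_SCRIPT` placeholder are assumptions, and the exact invocation that works under bash is still an open question here.

```shell
# Run a command, discard its output, and print elapsed wall-clock seconds.
elapsed_seconds() {
  local start end
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $(( end - start ))
}

# Hypothetical usage; BLE_SCRIPT stands for whichever built script
# each shell is able to run:
#   elapsed_seconds bash "$BLE_SCRIPT" --test
#   elapsed_seconds ../oil/_bin/cxx-opt/osh out/ble.osh --test
```

Whole-second resolution is coarse, but it is enough to tell a run of tens of seconds from one of several minutes.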