pharo-project / pharo

Pharo is a dynamic reflective pure object-oriented language supporting live programming inspired by Smalltalk.
http://pharo.org

Pharo 9.0 Killed itself on an Array new: request #8968

Open doug719 opened 3 years ago

doug719 commented 3 years ago

Bug description: Running the program below a second time causes an abort/kill initiated by Pharo 9 (development), NOT a crash. If a memory request cannot be honored, Pharo should not kill itself. Like other programming languages, it should report an error or return a failure status, decline the request, and return control to the user. As was pointed out, the implicit local variables in the playground inhibit garbage collection (is this documented?). The error message could certainly be improved (the cause was a simple Array new: memory request that could not be satisfied).

Pharo 9.0 - 64bit (development version, latest) (downloaded yesterday). Pharo 9.0.0 Build information: Pharo-9.0.0+build.1282.sha.8795ae30a9211c261e7e09cfe3b08de4d0a8a526 (64 Bit)

using Linux Mint 20 Ryzen 5600 CPU, 32 GiB memory.

To reproduce: (reproduces every time) Did a print it on the following in the playground:

" following takes about 12 seconds and uses 18.4 GiB "
len := 2000000000.
ar1 := Array new: len.
1 to: len do: [ :k | ar1 at: k put: k + 1000000 ].
ar1 at: len.

It ran ok and supplied the correct answer.

If I immediately run it a second time I get:

pthread_setschedparam failed: Operation not permitted This VM uses a separate heartbeat thread to update its internal clock and handle events. For best operation, this thread should run at a higher priority, however the VM was unable to change the priority. The effect is that heavily loaded systems may experience some latency issues. If this occurs, please create the appropriate configuration file in /etc/security/limits.d/ as shown below:

cat <<END | sudo tee /etc/security/limits.d/pharo.conf

and report to the pharo mailing list whether this improves behaviour.

You will need to log out and log back in for the limits to take effect. For more information please see https://github.com/OpenSmalltalk/opensmalltalk-vm/releases/tag/r3732#linux

Killed


I would have supposed that on the second run it would garbage collect and run ok.

welcome[bot] commented 3 years ago

Thanks for opening your first issue! Please check the CONTRIBUTING documents for some tips about which information should be provided. You can find information of how to do a Pull Request here: https://github.com/pharo-project/pharo/wiki/Contribute-a-fix-to-Pharo

svenvc commented 3 years ago

Each array slot takes 8 bytes on a 64-bit system, so your 2e9-slot array takes 16e9 bytes. Computing the sums generates additional garbage in the form of temporary integers.

The real problem is that you are using a playground feature: undeclared variables become locally bound automatically inside the playground, and those bindings are kept for you.

So the second time you run it, the first array is still bound to ar1, which keeps it from becoming garbage, while you need the extra memory for the new allocation, before the assignment.

Only after the assignment does the first array become garbage.
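Given that explanation, one workaround sketch (an assumption on my part, not an official recommendation) is to explicitly clear the playground binding and force a full garbage collection before re-running, so the first array can be reclaimed before the second allocation is attempted:

```smalltalk
"Hypothetical workaround: break the playground binding to the old
 array and collect it before allocating again."
ar1 := nil.
Smalltalk garbageCollect.

len := 2000000000.
ar1 := Array new: len.
1 to: len do: [ :k | ar1 at: k put: k + 1000000 ].
ar1 at: len.
```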

the following uses real temporary variables:

| len ar1 |
len := 2000000000.
ar1 := Array new: len.
1 to: len do: [ :k |
ar1 at: k put: k + 1000000 ].
ar1 at: len.

I would expect that to work better.

doug719 commented 3 years ago

I tried your suggestion, explicit temporary variables, and it worked ok. But it seems like:

  1. implicit local variables in the playground should act like explicit local variables, or the difference should be documented.
  2. Pharo 9 should not kill itself. Surely there is a more graceful way to give an error message and abort the current operation, rather than shut down Pharo completely. The error message does not seem very helpful. I don't know of other programming languages that abort/kill themselves if too much memory is requested. I think the bug report should stay open and Pharo not kill itself. I am going to revise the bug report.
Ducasse commented 3 years ago

What you should consider is that on your OS you do not have the latest Pharo VM: you are using a Pharo 8.0 VM, because we did not release a Pharo 9.0 VM for your OS yet. We will release a new one. Now this is fun, because I have many core dumps of C or C++ applications. I even got LLVM generating code from valid C that cannot be debugged or disassembled. So I would not be that categorical about other programming languages :).

We will have a look at your tests taking into account sven's remarks. Thanks Sven!

Now, when doing a test we should control all the parameters, and this is what Sven is telling you. We really got a LOT of improvements on the Pharo VM. That does not mean it cannot crash, but I fixed terribly vicious bugs related to assembly management or GC. In the future we will also probably start to work on a three-color GC. Finally, we are REALLY concerned about the robustness of our system. Just for the record, we wrote more than 3000 tests for the Pharo VM starting from near zero, and I can tell you that such tests are not the kind you write easily.

doug719 commented 3 years ago

I stated in the report that I was using:

Pharo 9.0 - 64bit (development version, latest) (downloaded yesterday). Pharo 9.0.0 Build information: Pharo-9.0.0+build.1282.sha.8795ae30a9211c261e7e09cfe3b08de4d0a8a526 (64 Bit)

That does not look like Pharo 8.0, as you state.

So I had the latest Pharo 9 development version. I would think that you would want bug reports on development versions, since you provide them.

I am not saying that you should eliminate all crashes (impossible). I am saying that Pharo 9 (development) decided to kill itself because it could not handle the memory request. Sure, C and C++ crash, but if you request too much memory from almost any programming language (including C and C++), they do not crash or kill themselves; instead they give you a status return.

Yes, my test was simple, but it pointed out a simple situation that Pharo 9 could not handle, and it decided to kill itself.

svenvc commented 3 years ago

I think that modern 64-bit VMs consider memory as being infinite, which is why the VM cannot know when memory is full.

What probably happened is that the OOM killer intervened, see https://docs.memset.com/other/linux-s-oom-process-killer

There is probably some log file where this is noted.

I am pretty sure that if you allocate that much memory in a C program, malloc 16GB twice, it will end.

But I do agree that the error message could be clearer.

doug719 commented 3 years ago

Regarding C and malloc: C uses virtual memory for dynamic memory requests. I can do 3 mallocs of 35 GB each to different char pointers with no problem. If I try a single 36 GB request, malloc returns failure (a NULL pointer). Using malloc, there was no abort or crash.
If the Pharo 9 development system that I have does not have the latest VM, which distribution does? Have you tried the test case on a Pharo 9 with the latest VM? I can try it on Windows 10, Ubuntu, or openSUSE.