sagemath / sage-windows

Build tools for the Sagemath Windows installer

Plotting kills Sagemath (on some machines) #57

Open holgerurbanek opened 3 years ago

holgerurbanek commented 3 years ago

Update: (@embray) before anyone else comments on this issue please see this comment: https://github.com/sagemath/sage-windows/issues/57#issuecomment-841135674

On 1 of my 3 Windows computers, plotting with e.g. plot(sin(x),x) results in the sudden death of SageMath (on the command line as well as in a Jupyter notebook). I'm using the same, current release of SageMath for Windows (installer 0.6.2): SageMath version 9.2, Release Date: 2020-10-24. Using Python 3.7.7. Type "help()" for help.

The computer on which it fails is a rather new Microsoft Surface Pro 7 (10th gen Intel i7); the others, where it works, are a Surface Go (Pentium blah-something) and a Lenovo P50 (6th or 7th gen i7).

Could it be that this is due to some CPU-specific optimisations?

embray commented 3 years ago

Could it be that this is due to some CPU-specific optimisations?

Quite possibly. I doubt the problem is particular to plotting either. What happens if you do something like:

>>> import numpy as np
>>> np.dot([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [10, 11, 12])

Please also provide the output of cat /proc/cpuinfo on the affected machine.

holgerurbanek commented 3 years ago

Numpy works as expected.

The output on the console for the failing plot-command:

$ /bin/bash --login -c '/opt/sagemath-9.2/sage'
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 9.2, Release Date: 2020-10-24                     │
│ Using Python 3.7.7. Type "help()" for help.                        │
└────────────────────────────────────────────────────────────────────┘
sage: plot(sin(x))
------------------------------------------------------------------------
Unhandled SIGSEGV: A segmentation fault occurred.
This probably occurred because a *compiled* module has a bug
in it and is not properly wrapped with sig_on(), sig_off().
Python will now terminate.
------------------------------------------------------------------------
/opt/sagemath-9.2/src/bin/sage-python: line 2:    76 Segmentation fault      (core dumped) sage -python "$@"

Sorry that I did not post it earlier, but my son was using the affected machine for home schooling oO

So, the content of cpuinfo (only for one core; there are 8 in total):

$ cat cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 126
model name      : Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz
stepping        : 5
microcode       : 0xA0
cpu MHz         : 1500.000
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc art arch_perfmon rep_good nopl xtopology cpuid aperfmperf tsc_known_freq pni pclmuldq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt intel_pt avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req avx512vbmi umip pku avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid fsrm md_clear flush_l1d arch_capabilities
bogomips        : 3000.00
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

holgerurbanek commented 3 years ago

However, matplotlib seems to have an issue: When doing

import matplotlib.pyplot as plt
import numpy as np
x=np.arange(-3,3,.1)
y=np.sin(x)
plt.plot(x,y)

(in a notebook) the kernel dies again, as soon as I issue the plot command.

embray commented 3 years ago

Right, I suspected the problem was not specific to Sage. When there have been crashes with plotting before it's usually had to do with Numpy and/or OpenBLAS. Maybe the example I gave with np.dot wasn't quite good enough to reproduce the problem (maybe the arrays should be floating point). But usually it's occurred in plotting during some linear algebra code.

It's weird that you're getting a SIGSEGV though and not a SIGILL.

Do you have Docker on this machine, and if so can you see if sage on Docker has the same problem?
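To tell a SIGSEGV from a SIGILL without losing the session or notebook kernel, one option is to run the failing snippet in a child process and inspect how it exited. The sketch below is only an illustration: the helper name run_isolated and the off-screen Agg backend are assumptions, not something used in this thread, and the signal reporting relies on POSIX-style return codes (as in a Linux or Cygwin-like Python).

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=120):
    """Run `code` in a child Python process and describe how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed
        # by the signal with that number (e.g. -11 for SIGSEGV).
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = "signal {}".format(-proc.returncode)
        return "killed by " + name
    return "exited with status {}".format(proc.returncode)


if __name__ == "__main__":
    # The matplotlib calls from earlier in the thread, run out of process.
    snippet = "\n".join([
        "import matplotlib",
        "matplotlib.use('Agg')  # assumption: render off-screen, no GUI needed",
        "import matplotlib.pyplot as plt",
        "import numpy as np",
        "x = np.arange(-3, 3, .1)",
        "plt.plot(x, np.sin(x))",
    ])
    print(run_isolated(snippet))
```

Because the crash happens in the child, the parent survives and can report which signal ended it; the same helper could be pointed at smaller and smaller snippets to narrow down the failing call.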

embray commented 3 years ago

Oh, and do you see if this is producing a cysignals_crash_logs/ directory anywhere?

holgerurbanek commented 3 years ago

So, more info: I installed Docker and SageMath on Docker. There it runs. (But WSL is a memory hog, and it is hard to convince it to give the memory back in the end ... I had to issue an additional wsl --shutdown at the end.) Aside from that, it runs well in there. Even plotting in a Jupyter notebook works.

Also, python and jupyter-notebook from an Anaconda installation have no issues with plotting.

Then I tried to make sure it is not numpy by issuing np.dot(np.random.rand(1000,1000), np.random.rand(1000)), which also works completely fine.

Where would I find the crash logs? An updatedb followed by a locate cysignals_crash_logs in the sage-shell found nothing. Searching manually I could not find anything either, neither in the .sage directory nor in the install dir. Atm Windows Explorer is searching my whole hdd for cysignals_crash_logs, yet that may take some time -- after ~15 minutes it found -- NOTHING.

I'm afraid this is not very helpful.

The next step, as a workaround on that machine, will be to set up Ubuntu in VirtualBox and install it there.

But, all in all, I'd really like to say a big THANK YOU for your work. On the other 2 computers it is running flawlessly. It runs even better when the install dir and the user/.sage dir are excluded from Windows Defender.

slel commented 3 years ago

Similar reports:

embray commented 3 years ago

To be clear this has nothing specifically to do with plotting. It is more likely a bug in OpenBLAS or some other linear algebra code used by numpy+matplotlib. It just happens that plotting is the first place most users will encounter this. It would help on those Ask Sage questions if they could provide their CPU info.

embray commented 3 years ago

@holgerurbanek Thanks for trying. I wasn't sure if a cysignals crash log would even be produced for this, but it was worth a look, so thanks for checking. Does it say anything like "Saved trace to /path/to/.sage/crash_logs/crash_woj4behv.log"?

embray commented 3 years ago

Possibly related, but I'm not sure: https://trac.sagemath.org/ticket/29537#comment:25

embray commented 3 years ago

Perhaps also related: https://trac.sagemath.org/ticket/31007 and the corresponding OpenBLAS fix: https://github.com/xianyi/OpenBLAS/pull/2960

Though I'm not sure why since your CPU does have AVX2 support. (I previously wrote something along the lines of this being even more likely related, but now I have my doubts.)
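For anyone comparing machines, the SIMD-related entries can be pulled out of the long flags line of /proc/cpuinfo with a few lines of Python. This is a minimal sketch; the function names and the particular list of flags to report are assumptions chosen for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def summarize_simd(flags):
    """Report whether a few SIMD-related flags are present."""
    # Assumption: this shortlist is illustrative, not exhaustive.
    wanted = ["sse4_2", "avx", "avx2", "fma", "avx512f", "avx512bw", "avx512vl"]
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print(summarize_simd(cpu_flags(handle.read())))
    except FileNotFoundError:
        print("/proc/cpuinfo is not available on this system")
```

Posting this short summary alongside a crash report makes it easier to see at a glance which machines have AVX2 or AVX-512 and which do not.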

embray commented 3 years ago

For what it's worth there have been several other reports of an issue like this, and it's also showing up on some of our CI builds. I suspect the issue has to do with Numpy but I still haven't found a way to reproduce the issue myself, as it's probably hardware dependent: https://trac.sagemath.org/ticket/29537#comment:43

embray commented 3 years ago

@holgerurbanek By any chance does running

>>> import numpy as np
>>> np.linalg.inv(np.random.rand(24, 24))

crash?

holgerurbanek commented 3 years ago

No luck here; inv works fine.

When it crashes, there is no indication of any logfile. I also found no --verbose option or the like to enable logfile generation with sage. Perhaps there is some hidden feature there?

My next try will probably be installing python3, matplotlib and numpy on my cygwin installation and testing there.

Just running sage -t --all and getting a crash on e.g.:

sage -t --random-seed=0 /opt/sagemath-9.2/src/sage/calculus/calculus.py
    Killed due to segmentation fault

I'm now rerunning with the output piped into a file; that just takes some time ...

holgerurbanek commented 3 years ago

Here are the outputs from sage -t --all --verbose

https://www.dropbox.com/s/5ajxlp3kaa2ccs5/log.txt?dl=0
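A log like this can be scanned for the files whose tests were killed. The sketch below assumes the two-line pattern shown above (a "sage -t ..." line, later followed by "Killed due to segmentation fault"); the function name is hypothetical, and a real verbose log may contain variations this does not handle.

```python
def segfaulting_files(log_text):
    """List tested files whose 'sage -t' run was followed by a segfault notice."""
    crashed = []
    current = None  # file path from the most recent 'sage -t' line
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("sage -t "):
            # Drop any trailing '# ...' remark, then take the last token,
            # which is assumed to be the path of the tested file.
            tokens = stripped.split("#", 1)[0].split()
            current = tokens[-1]
        elif "Killed due to segmentation fault" in stripped and current:
            crashed.append(current)
            current = None
    return crashed


if __name__ == "__main__":
    example = (
        "sage -t --random-seed=0 /opt/sagemath-9.2/src/sage/calculus/calculus.py\n"
        "    Killed due to segmentation fault\n"
    )
    print(segfaulting_files(example))
```

Comparing the resulting list of files against where plot commands occur is one way to check whether the crashes cluster around plotting.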

holgerurbanek commented 3 years ago

Current cygwin python3 install has no problem with matplotlib plotting. Investigating further ...

embray commented 3 years ago

@holgerurbanek Thanks for continuing to investigate. We're pretty sure the problem has to do with Numpy's new SIMD intrinsics, and that some of them are not disabled at compile-time making for incompatibility on certain architectures. I don't have access to my Windows machine right now so I haven't been able to do much with it though the problem isn't Windows-specific either.
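One way to see which CPU features a given NumPy build detects at runtime is to read its internal feature table. This is a hedged sketch: __cpu_features__ is a private attribute of NumPy's compiled _multiarray_umath module, so its location, and whether it exists at all, depends on the NumPy version; the helper name is hypothetical and it simply returns None when nothing is found.

```python
import importlib


def numpy_cpu_features():
    """Return NumPy's runtime CPU-feature table as a dict, or None.

    ASSUMPTION: `__cpu_features__` is private and version-dependent;
    both module paths below are tried and missing ones are skipped.
    """
    for module_name in ("numpy._core._multiarray_umath",
                        "numpy.core._multiarray_umath"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # NumPy missing, or this layout is not used
        features = getattr(module, "__cpu_features__", None)
        if features is not None:
            return dict(features)
    return None


if __name__ == "__main__":
    table = numpy_cpu_features()
    if table is None:
        print("No CPU feature table found in this NumPy build")
    else:
        print(sorted(name for name, enabled in table.items() if enabled))
```

Running this both in the Sage-bundled Python and in a Python where plotting works (Anaconda, Cygwin) would show whether the two NumPy builds see the CPU differently.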

holgerurbanek commented 3 years ago

OK, could well be. I'd also suspect something like that, but rather than numpy, could it be somewhere in matplotlib? (However, I'm not that deep into matplotlib/numpy compilation and that stuff, like you all definitely are!)

I just went through the log file I posted earlier; the segmentation faults only happen right after the plot commands, and not after any other calculations.

embray commented 3 years ago

No, almost certainly nothing directly to do with matplotlib.

slel commented 3 years ago

One user reports at

that on their Windows installation, this problem occurs with SageMath 9.2 installed with versions 0.6.1 or 0.6.2 of the Sage-Windows installer, but not with SageMath 9.1 installed with version 0.6.0 of the installer.

woodringct commented 3 years ago

I experienced the same crash when I first installed Sage 9.2 on Windows 10 Pro about a month ago. I never saw the plot and had to restart the kernel. I restarted the kernel a number of times, never saw the plot, and it crashed every time. Today I downloaded the sage manifolds tutorial and noticed the plots in it were generated with no problems. Thinking this was odd after the crashes on a simple plot, I went and opened up the included notebook from a month ago and without even running it the plot was there! It was not there when I saved the notebook last month. So thinking that was very odd, I ran the notebook included in the zip and lo and behold it crashed. I had saved the version with the plot visible and using the checkpoint I reloaded it. The plot was there. Interestingly it has input #5 as the only input to the notebook. I do not remember from a month ago why I would have deleted the cells 1-4.

I completely restarted SageMath and ran the notebook and it crashed. This time I did not see the plot when I saved the file, but I am going to reboot my computer and reload the file and see if it is there.

Maybe one of the Sage gurus can figure out what is going on with this info.

New Compressed (zipped) Folder.zip

woodringct commented 3 years ago

Further information on the plotting issue. Today, using my installation of Anaconda, I ran the notebook from the zip file I posted, the one that crashed in SageMath yesterday. Using Jupyter Lab and Jupyter Notebook, the plotting works without an issue. Maybe that is how I got the plot in the notebook a month ago, but I don't remember trying that. The Anaconda version of Jupyter Lab is 3.014, and Jupyter Notebook is 6.3. The Python version is 3.8.5. In Sage, Python is 3.7.7 and Jupyter Notebook is 6.11.

embray commented 3 years ago

@woodringct Thank you for the detailed reports, but please save yourself from worrying about or overthinking this too much, as it's not actually a deep mystery what's going on here. The issue has nothing directly to do with plotting but rather to do with the fact that Numpy (which is used by matplotlib among other things) was compiled with architecture-specific optimizations that crash on certain machines. If you didn't have problems with plots from the SageManifolds notebooks it's because most of them are 3D plots that aren't using Numpy. I just haven't had time (or consistent access to a Windows machine) to try building a new release. I think going back to SageMath 9.1 will "solve" this problem for most people who have it.

slel commented 3 years ago

Possibly related report at

slel commented 3 years ago

Reported again at

sophiasage commented 1 year ago

Any updates on this? I just had the same problem on Sage 9.3 for Windows. (plotting killed my kernel). Switching back to 9.1.