uber-archive / pyflame

🔥 Pyflame: A Ptracing Profiler For Python. This project is deprecated and not maintained.
Apache License 2.0

Segfaults with --threads in OSQP #177

Open alfa07 opened 5 years ago

alfa07 commented 5 years ago

Attaching pyflame with --threads causes segfaults in the profiled process when it runs the OSQP solver (https://osqp.org/) with polish=1.

Disassembly:

   0x00007f4db095b49c <+268>:   xor    %ebp,%ebp
   0x00007f4db095b49e <+270>:   mov    0x8(%rax,%r10,8),%rbx
   0x00007f4db095b4a3 <+275>:   mov    (%rax,%r10,8),%rcx
   0x00007f4db095b4a7 <+279>:   cmp    %rcx,%rbx
   0x00007f4db095b4aa <+282>:   jle    0x7f4db095b665 <QDLDL_factor+725>
   0x00007f4db095b4b0 <+288>:   mov    %r9,-0x28(%rsp)
   0x00007f4db095b4b5 <+293>:   mov    -0x18(%rsp),%r13
   0x00007f4db095b4ba <+298>:   mov    -0x8(%rsp),%r9
   0x00007f4db095b4bf <+303>:   nop
=> 0x00007f4db095b4c0 <+304>:   mov    (%r9,%rcx,8),%rax
   0x00007f4db095b4c4 <+308>:   movsd  0x0(%r13,%rcx,8),%xmm0
   0x00007f4db095b4cb <+315>:   cmp    %r10,%rax
   0x00007f4db095b4ce <+318>:   je     0x7f4db095b6c0 <QDLDL_factor+816>
   0x00007f4db095b4d4 <+324>:   lea    (%r11,%rax,1),%rdx
   0x00007f4db095b4d8 <+328>:   movsd  %xmm0,(%rsi,%rax,8)
   0x00007f4db095b4dd <+333>:   cmpb   $0x0,(%rdx)
   0x00007f4db095b4e0 <+336>:   jne    0x7f4db095b57d <QDLDL_factor+493>
   0x00007f4db095b4e6 <+342>:   movb   $0x1,(%rdx)
   0x00007f4db095b4e9 <+345>:   mov    %rax,(%r8)
   0x00007f4db095b4ec <+348>:   mov    (%r12,%rax,8),%rax

Registers:

(gdb) info registers
rax            0x3437050        54751312
rbx            0xb129   45353
rcx            0xae82   44674
rdx            0x3432b18        54733592
rsi            0x342e8b0        54716592
rdi            0x367a730        57124656
rbp            0x0      0x0
rsp            0x7ffebc323f68   0x7ffebc323f68
r8             0x3426118        54681880
r9             0x7f4dbc7cf002   139971851513858
r10            0x84f    2127
r11            0x32d9370        53318512
r12            0x34151d0        54612432
r13            0x34cc420        55362592
r14            0x3421d50        54664528
r15            0x340c8d8        54577368
rip            0x7f4db095b4c0   0x7f4db095b4c0 <QDLDL_factor+304>
eflags         0x10202  [ IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
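As a quick check of the dump above, the faulting instruction `mov (%r9,%rcx,8),%rax` reads from `r9 + rcx*8`. The sketch below only redoes that arithmetic with the register values shown. Reading the result as a sign that `r9` (reloaded from `-0x8(%rsp)` just before the fault) holds a damaged pointer is an assumption, not something confirmed in this report.

```python
# Register values copied from the `info registers` output above.
r9 = 0x7f4dbc7cf002   # base register of the faulting load
rcx = 0xae82          # index register of the faulting load
r13 = 0x34cc420       # base of the neighbouring movsd load, for comparison

# The AT&T operand (%r9,%rcx,8) means base + index * scale.
fault_addr = r9 + rcx * 8

print(hex(fault_addr))
print("r9 % 8 =", r9 % 8)    # non-zero: r9 is not 8-byte aligned
print("r13 % 8 =", r13 % 8)  # zero: r13 is 8-byte aligned
```

The load steps through memory with an 8-byte stride, yet its base sits 2 bytes past an 8-byte boundary, while `r13`, the base of the very next load, is aligned.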

Operating system:

CentOS Linux release 7.7.1908 (Core)

Kernel:

Linux 3.10.0-957.12.2.el7.x86_64

Repro:

# portfolio.py
# python==2.7.16
import osqp  # osqp==0.4.1
import numpy as np  # numpy==1.14.3
import scipy as sp  # scipy==1.1.0
from scipy import sparse
import threading
import time

# Generate problem data
sp.random.seed(1)
n = 1000
k = 100
F = sparse.random(n, k, density=0.7, format='csc')
D = sparse.diags(np.random.rand(n) * np.sqrt(k), format='csc')
mu = np.random.randn(n)
gamma = 1

# OSQP data
P = sparse.block_diag([D, sparse.eye(k)], format='csc')
q = np.hstack([-mu / (2*gamma), np.zeros(k)])
A = sparse.vstack([
        sparse.hstack([F.T, -sparse.eye(k)]),
        sparse.hstack([sparse.csc_matrix(np.ones((1, n))), sparse.csc_matrix((1, k))]),
        sparse.hstack((sparse.eye(n), sparse.csc_matrix((n, k))))
    ], format='csc')
l = np.hstack([np.zeros(k), 1., np.zeros(n)])
u = np.hstack([np.zeros(k), 1., np.ones(n)])

# Create an OSQP object
prob = osqp.OSQP()
# Setup workspace
prob.setup(P, q, A, l, u,
    polish=1) #  polish=1 causes segmentation faults under pyflame

# Solve problem
for i in range(1000000):
    res = prob.solve()

Run portfolio.py:

$ ulimit -c unlimited 
$ python portfolio.py &> /dev/null
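If changing the shell limit is inconvenient, the core-dump limit could also be raised from inside the Python process with the standard `resource` module. This is a minimal sketch and not part of the original repro. Unlike `ulimit -c unlimited`, it only raises the soft limit to whatever the hard limit already allows, so no core file is written if the hard limit is 0.

```python
import resource

# Read the current (soft, hard) core-file size limits for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

# Raising the soft limit up to the existing hard limit needs no privileges.
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))

new_soft, new_hard = resource.getrlimit(resource.RLIMIT_CORE)
print("core limit: soft=%s hard=%s" % (new_soft, new_hard))
```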

Run pyflame several times:

$ pyflame --pid=$(ps aux | grep python | grep portfolio | awk '{print $2}') -o pyflame.prof -s 5 --threads

Get a segmentation fault:

$ python portfolio.py &> /dev/null
[1]    1225 segmentation fault (core dumped)  python portfolio.py &> /dev/null
$ pyflame --pid=$(ps aux | grep python | grep portfolio | awk '{print $2}') -o pyflame.prof -s 5 --threads
Unexpected ptrace(2) exception: waitpid() indicated a WIFSTOPPED process, but got unexpected signal 11
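The "run pyflame several times" step above could be automated along these lines. This is a sketch under assumptions: the helper names (`pyflame_cmd`, `is_alive`, `attach_until_crash`) are hypothetical, and only the pyflame flags already used in this report (`--pid`, `-o`, `-s`, `--threads`) appear in it.

```python
import errno
import os
import subprocess


def pyflame_cmd(pid, output="pyflame.prof", seconds=5):
    # Same flags as the command in the report: --pid, -o, -s, --threads.
    return ["pyflame", "--pid=%d" % pid, "-o", output,
            "-s", str(seconds), "--threads"]


def is_alive(pid):
    # Signal 0 sends nothing; it only checks existence and permissions.
    # A crashed process that has not been reaped yet still counts as alive.
    try:
        os.kill(pid, 0)
    except OSError as exc:
        # EPERM means the process exists but belongs to another user.
        return exc.errno == errno.EPERM
    return True


def attach_until_crash(pid, max_attempts=20, run=subprocess.call):
    # Returns how many pyflame runs were started before the target went
    # away, or None if it survived max_attempts runs.
    for attempt in range(1, max_attempts + 1):
        run(pyflame_cmd(pid))
        if not is_alive(pid):
            return attempt
    return None
```

For example, `attach_until_crash(1225)` would profile the process from the log above (the PID is illustrative) until it disappears. The `run` parameter exists so the loop can be exercised without pyflame installed.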