intel / compute-runtime

Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
MIT License

Driver reset on 1804 and 6700hq #132

Closed isgursoy closed 5 years ago

isgursoy commented 5 years ago

Machine is http://www.vorke.com/project/vorke-v6/

OS is: Linux 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

I tried all releases; the result is the same.

dmesg:
[377999.628419] [drm] GPU HANG: ecode 9:0:0x85ddfffb, in luxmark.bin [20464], reason: Hang on rcs0, action: reset
[377999.628425] i915 0000:00:02.0: Resetting rcs0 after gpu hang
$ sudo cat /sys/class/drm/card0/error 
GPU HANG: ecode 9:0:0x85ddfffb, in luxmark.bin [20464], reason: Hang on rcs0, action: reset
Kernel: 4.15.0-45-generic
Time: 1550268925 s 731763 us
Boottime: 377998 s 780104 us
Uptime: 377996 s 217560 us
Active process (on ring render): luxmark.bin [20464], score 0
Reset count: 0
Suspend count: 0
Platform: SKYLAKE
PCI ID: 0x191b
PCI Revision: 0x06
PCI Subsystem: 1991:5594
IOMMU enabled?: 0
DMC loaded: yes
DMC fw version: 1.26
GT awake: yes
RPM wakelock: yes
PM suspended: no
EIR: 0x00000000
IER: 0x08000000
GTIER[0]: 0x01010101
GTIER[1]: 0x01010101
GTIER[2]: 0x00000070
GTIER[3]: 0x00000101
PGTBL_ER: 0x00000000
FORCEWAKE: 0xffff0001
DERRMR: 0x2077efef
CCID: 0x00000000
Missed interrupts: 0x00000000
  fence[0] = 00000000
  fence[1] = 00000000
  fence[2] = 00000000
  fence[3] = 00000000
  fence[4] = 00000000
  fence[5] = 00000000
  fence[6] = 00000000
  fence[7] = 00000000
  fence[8] = 00000000
  fence[9] = 00000000
  fence[10] = 00000000
  fence[11] = 00000000
  fence[12] = 00000000
  fence[13] = 00000000
  fence[14] = 00000000
  fence[15] = 00000000
  fence[16] = 00000000
  fence[17] = 00000000
  fence[18] = 00000000
  fence[19] = 00000000
  fence[20] = 00000000
  fence[21] = 00000000
  fence[22] = 00000000
  fence[23] = 00000000
  fence[24] = 00000000
  fence[25] = 00000000
  fence[26] = 00000000
  fence[27] = 00000000
  fence[28] = 00000000
  fence[29] = 00000000
  fence[30] = 00000000
  fence[31] = 00000000
ERROR: 0x00000000
FAULT_TLB_DATA: 0x00000009 0xa1f408a9
DONE_REG: 0x07ffffff
render command stream:
  START: 0x00009000
  HEAD:  0x00000a48 [0x00000a08]
  TAIL:  0x00000a80 [0x00000a60, 0x00000a80]
  CTL:   0x00003001
  MODE:  0x00000000
  HWS:   0xfffcf000
  ACTHD: 0x00007fe2 79750074
  IPEIR: 0x00000000
  IPEHR: 0x7a000004
  INSTDONE: 0xffddffff
  SC_INSTDONE: 0xffffffff
  SAMPLER_INSTDONE[0][0]: 0xffffffff
  SAMPLER_INSTDONE[0][1]: 0xffffffff
  SAMPLER_INSTDONE[0][2]: 0xffffffff
  ROW_INSTDONE[0][0]: 0xffffffff
  ROW_INSTDONE[0][1]: 0xfffdffff
  ROW_INSTDONE[0][2]: 0xffffffff
  batch: [0x00007fe2_7b7e7000, 0x00007fe2_7b7e9000]
  BBADDR: 0x00007fe2_79750075
  BB_STATE: 0x00000020
  INSTPS: 0x00009080
  INSTPM: 0x00000000
  FADDR: 0x00007fe2 79750200
  RC PSMI: 0x00000010
  FAULT_REG: 0x00000000
  SYNC_0: 0x00000000
  SYNC_1: 0x00000000
  SYNC_2: 0x00000000
  GFX_MODE: 0x00008000
  PDP0: 0x0000000463318000
  PDP1: 0x0000000000000000
  PDP2: 0x0000000000000000
  PDP3: 0x0000000000000000
  seqno: 0x0000000d
  last_seqno: 0x0000000e
  waiting: yes
  ring->head: 0x00000a00
  ring->tail: 0x00000a80
  hangcheck stall: yes
  hangcheck action: dead
  hangcheck action timestamp: 4389390496, 1791864 ms ago
  engine reset count: 0
  ELSP[0]:  pid 20464, ban score 0, seqno        4:0000000e, prio 0, emitted 1794104ms ago, head 00000a08, tail 00000a80
  Active context: luxmark.bin[20464] user_handle 1 hw_id 4, prio 0, ban score 0 guilty 0 active 0
blt command stream:
  START: 0x00000000
  HEAD:  0x00000000 [0x00000000]
  TAIL:  0x00000000 [0x00000000, 0x00000000]
  CTL:   0x00000000
  MODE:  0x00000200
  HWS:   0xfffc8000
  ACTHD: 0x00000000 00000000
  IPEIR: 0x00000000
  IPEHR: 0x00000000
  INSTDONE: 0xfffffffe
  BBADDR: 0x00000000_00000000
  BB_STATE: 0x00000000
  INSTPS: 0x00000001
  INSTPM: 0x00000000
  FADDR: 0x00000000 00000000
  RC PSMI: 0x00000010
  FAULT_REG: 0x00000000
  SYNC_0: 0x00000000
  SYNC_1: 0x00000000
  SYNC_2: 0x00000000
  GFX_MODE: 0x00008000
  PDP0: 0x0000000000000000
  PDP1: 0x0000000000000000
  PDP2: 0x0000000000000000
  PDP3: 0x0000000000000000
  seqno: 0x00000000
  last_seqno: 0x00000000
  waiting: no
  ring->head: 0x00000000
  ring->tail: 0x00000000
  hangcheck stall: no
  hangcheck action: idle
  hangcheck action timestamp: 4389391992, 1785880 ms ago
  engine reset count: 0
  Active context: [0] user_handle 0 hw_id 0, prio 0, ban score 0 guilty 0 active 0
bsd command stream:
  START: 0x00000000
  HEAD:  0x00000000 [0x00000000]
  TAIL:  0x00000000 [0x00000000, 0x00000000]
  CTL:   0x00000000
  MODE:  0x00000200
  HWS:   0xfffc1000
  ACTHD: 0x00000000 00000000
  IPEIR: 0x00000000
  IPEHR: 0x00000000
  INSTDONE: 0xfffffffe
  BBADDR: 0x00000000_00000000
  BB_STATE: 0x00000000
  INSTPS: 0x00000001
  INSTPM: 0x00000000
  FADDR: 0x00000000 00000000
  RC PSMI: 0x00000010
  FAULT_REG: 0x00000000
  SYNC_0: 0x00000000
  SYNC_1: 0x00000000
  SYNC_2: 0x00000000
  GFX_MODE: 0x00008000
  PDP0: 0x0000000000000000
  PDP1: 0x0000000000000000
  PDP2: 0x0000000000000000
  PDP3: 0x0000000000000000
  seqno: 0x00000000
  last_seqno: 0x00000000
  waiting: no
  ring->head: 0x00000000
  ring->tail: 0x00000000
  hangcheck stall: no
  hangcheck action: idle
  hangcheck action timestamp: 4389391992, 1785880 ms ago
  engine reset count: 0
  Active context: [0] user_handle 0 hw_id 0, prio 0, ban score 0 guilty 0 active 0
vebox command stream:
  START: 0x00000000
  HEAD:  0x00000000 [0x00000000]
  TAIL:  0x00000000 [0x00000000, 0x00000000]
  CTL:   0x00000000
  MODE:  0x00000200
  HWS:   0xfffba000
  ACTHD: 0x00000000 00000000
  IPEIR: 0x00000000
  IPEHR: 0x00000000
  INSTDONE: 0xfffffffe
  BBADDR: 0x00000000_00000000
  BB_STATE: 0x00000000
  INSTPS: 0x00000001
  INSTPM: 0x00000000
  FADDR: 0x00000000 00000000
  RC PSMI: 0x00000010
  FAULT_REG: 0x00000000
  SYNC_0: 0x00000000
  SYNC_1: 0x00000000
  SYNC_2: 0x00000000
  GFX_MODE: 0x00008000
  PDP0: 0x0000000000000000
  PDP1: 0x0000000000000000
  PDP2: 0x0000000000000000
  PDP3: 0x0000000000000000
  seqno: 0x00000000
  last_seqno: 0x00000000
  waiting: no
  ring->head: 0x00000000
  ring->tail: 0x00000000
  hangcheck stall: no
  hangcheck action: idle
  hangcheck action timestamp: 4389391992, 1785880 ms ago
  engine reset count: 0
  Active context: [0] user_handle 0 hw_id 0, prio 0, ban score 0 guilty 0 active 0
Active (rcs0) [9]:
    00007fe2_2dda1000     4096 3f 00 [ 0e 00 00 00 00 ] 00 userptr LLC
    00007fe2_4ad7f000 10240000 3f 00 [ 0e 00 00 00 00 ] 00 dirty userptr LLC
    00007fe3_7bffe000     4096 3f 00 [ 0e 00 00 00 00 ] 00 userptr LLC
    00007fe2_2ded9000    65536 3f 00 [ 0e 00 00 00 00 ] 00 userptr LLC
    00007fe3_7bfcb000    65536 3f 00 [ 0e 00 00 00 00 ] 00 userptr LLC
    00007fe2_48039000    65536 3f 00 [ 0e 00 00 00 00 ] 00 userptr LLC
    00000000_03587000     4096 3f 00 [ 0e 00 00 00 00 ] 00 userptr LLC
    00007fe2_79750000     8192 3f 00 [ 0e 00 00 00 00 ] 00 userptr LLC
    00007fe2_7b7e7000     8192 3f 00 [ 0e 00 00 00 00 ] 00 userptr LLC
Pinned (global) [26]:
    00000000_fffff000     4096 41 00 [ 00 00 00 00 00 ] 00 LLC
    00000000_ffffe000     4096 01 01 [ 00 00 00 00 00 ] 00 LLC
    00000000_fffe7000    94208 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00001000     4096 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_fffd0000    94208 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00002000     4096 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_fffcf000     4096 01 01 [ 00 00 00 00 00 ] 00 purgeable LLC
    00000000_fffcc000    12288 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00003000     4096 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_fffc9000    12288 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00004000     4096 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_fffc8000     4096 01 01 [ 00 00 00 00 00 ] 00 purgeable LLC
    00000000_fffc5000    12288 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00005000     4096 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_fffc2000    12288 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00006000     4096 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_fffc1000     4096 01 01 [ 00 00 00 00 00 ] 00 purgeable LLC
    00000000_fffbe000    12288 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00007000     4096 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_fffbb000    12288 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00008000     4096 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_fffba000     4096 01 01 [ 00 00 00 00 00 ] 00 purgeable LLC
    00000000_00040000 19816448 41 00 [ 00 00 00 00 00 ] 00 uncached
    00000000_02640000 19906560 40 00 [ 00 00 00 00 00 ] 00 uncached
    00000000_fffa3000    94208 01 01 [ 00 00 00 00 00 ] 00 dirty LLC
    00000000_00009000    16384 40 40 [ 00 00 00 00 00 ] 00 dirty LLC
rcs0 (submitted by luxmark.bin [20464], ctx 1 [4], score 0) --- gtt_offset = 0x00007fe2 7b7e7000
:=3R1Z5u^BjS#`+pfrcu#N4#`KQFU+l>8-f(D*]%-MEb':'U%Qnb:M%`6@1T2Jq3RAm&63U4U/8Z@;D_',q]N8e=`a'X)P#aG0r`C\[g!tqWhF2kJ6=qbkEp&bg(9t`[)Lm)'_;-ec`7a-^M_!@nSmLO"i*-N(pR:\5Fm"IfZ[3\MTjr@j4[UEe)>:0f?64[?$)J9oiK4'TQ('j*;rJ^e3q!6_jC7F#M7H"Y<9;S-^?+S;7RcD%'Sm#'hQ8OgnFIJ>J7[nU.E&W!hS%R@E+$`peDm#rt(WO7)3lU)W@q0H3"JE!UCf*!>+9S-L#9S-#2`>Qe:4;uk\Uqg\_L'>Re%F20TI9>CqNCVVI9?N;%\h8p6)\Vj+U8ku'%ZN6lLBp%&LjsBY0$BP6E#=o%9ULn\bat_QMI-WpcHPIS!8'A[&gj/\%*T@iWV6G<V;/5IjrP/Bl<aR.],2=\;`<BJk,-eUk;E-Skmb"t<0-2I$Z'S7iO3-+1=&m378PZ^mhd&)PFOX7h&JDR&]1SgLq=8I&/t7GiNt*W+6!Qk%!h7:$=)":9qE#%&5=2jDB9`%Jf\%o[=$C(7UZU4<YR^.7iGI2IqYZPD<R7a7?[KLAQ1MnN;I?MC@t#k`<j<EfI'di+#AVZdNDmbD^OsZ=RZ[=M(jt'Lq4U1QOS!*/g0V$@8H2pH`;[IG#sN-)*)aL4,CZP'S4<(!\K"AbjRr.0V+'JNp;D7U@,I>5D@t6#+hu+UVl``i6LV+L?bC<aqTTYQleV_9@2e\4IUuojpuYmPbP42Y2LUjI5?=]Xf1Yq[NTtfL]BRfC;Zb3Vpd]ZPL$%^/D>X5Kb(c6t#gRmP8Rpr+1o.7lS)pA*IgO/\FXLWuO^1m5VJYfA*LZ>D[o'eTXs\L:Tg*TcgT-PT#A?>E4;[DTmp2^Yo$cNV'MlCP4J1uZ5A&]lCA@5%qoNr(IA/O<W13SXe(Lg4mf).b0A\hTg-!gb2g]@fEHr@J\D$<Be@mHeq=.*iPri@.X[C]qpe'ei/c)'F3q)8$rDoQIbD@;2*Y6j4FE>?uGI,Tr[/@=g"0*t"-*I3dOju79$;+/Z==p86`&8'KHAqPf+4iU=9$Gu7TilSJ5F\4ua=cl3>])K+DmoVfh-.*4c<9HRN:lCi(>k\D><.0*F80%QYM#df[ePBh0,FN2?a'>_3l'djXQObXD**kcX6&(.jC]hqFlZ8.pDN1%nW2"UDQjd('s4J2-?]q_X:d^nEdbqkVsM?u2UVQp@HL'pWJt%W!!!E*=JYus@2)70+&?2Sda(P6khr4/>B$<&Qbt<=K"+X*:l6adUGt?PUpB>8-rn?iT!##g$I/pYn@1du,H)h1:ag_8i/"t%+%n%.W5t(oJ,aJJDf(upK:gIjRG!s;s-^UsIM[L+hD(TtP3rEOB%S5ZS`UlBp.FlWB$N"NX)R`Q>`ZP^_>7hb]X?(VGe\]TGf7:/:jj%Wb(l?kSXM40s1O9%Qj-'t/=42.KOD-XZ?KUEYCQ&HQO1b'T/q8)=,brukOUA+oD&bTfO&I&k=dpT7\ZfK._%*XCCaCfKPI@@([m6*c)Rr2?g1,A,k+TO00(l#V!#e?rqn#(DBDK6oVUnV\#+/oHO%J)n(TLO.ae9;hRU?]ocY8lN=)9&/3BTXG^r>6P6@m->gF'@+enK;Y0'Q&=)IZi,@g)bID&58?:V%*l5mL5WsAG*r_YV7Ag^)8QgtY%o@u,l9ir5%ZAe88.t8^iCu_&"K>gqFYkY?uB(W"l`Vrhm\+1X=hQ"cYj6'2L4569"mRaWN21>:-i-Yd/,uSDpIO0X6lhM&h,9,54`tuGVoK=Wr[gVsAGZOOe([up*Zhm-p"8rVEJ%aA2qA4kJ=4B-42Fl7TK(9<=A>Y+\eJ[c-P:(,;T-Z^1j3,QRkKT]H\H[7WpIk5gcQ'.%?e!&!%oPp6qM*)eaSLY6]QJZ-N_"$?D]#G8l&2#P:N$^%W,E!dM%'3Xq#ffHp$0:Y1'[^<IE2fg&\Aip6shZ
ThZ;>RQ&Xbe_/kOrduNN?5l'O7+:li5^Ns5hA%Z:6r?*aXeH0,d0pCADhM1m1Er"+.c^Zo__RAq_6e='lXa^+1jEO"`m#\E1lA:Z&N[o>,HjW=3&>.eYOY0qRBuh+-_^YLL!T07HO%=]m))l$6:0l.@Aka`F_sKQ0BYP6IgYs#>3?_h;^Ya_U)+aYgT;WRo)+!^42a\\@$%BrFnL*b;#dojF>n?94nO$pFYF4llT08g>X5k<OHrc<srI9E*oA]C))r\E5940EqW+3@\(5qtp(Xr3?jnPNgGPg`\WB\-Z/:I?khg0#nfI-QM47%r"=F2MaI)%CC?E[d4Q1B\o^+mXR5BtH%`:rq-*?Me9WQ2peG<^=fb5Po#d(M6=^e<U"VL@BI.9Oc@.Qrihja'jIafXU^e%Tqo>9'Q"d/$WeVntFBGg.bjO3KjTIsVKG!!!9'!!E9$%;ii]
rcs0 --- 1 requests
  pid 20464, ban score 0, seqno        4:0000000e, prio 0, emitted 1794104ms ago, head 00000a08, tail 00000a80
rcs0 --- 2 waiters
 seqno 0x0000000e for luxmark.bin [20506]
 seqno 0x0000000e for luxmark.bin [20465]
rcs0 --- ringbuffer = 0x00000000 00009000
:e*NTL;+!l-oC9j8+91rp(app^(becm'a"j[$Odl,"bh[=(gt14I2*<,_1KP@-no&5"bmF&@)5-]"q2>&_26!\M(i%>?_,Ls#Pc'Qh!ZiJGFWsJZ/,B83K[S@m9HI_N[(>A?^E6Lg-n*tn@]\a<>Wis[F1R8fukqXVCeWRlK-D8S?B*OOTq#;HgeX^bH*G4q/hD9r;EP4BPaXhjmJseh`8FkN_c7ZN6N80lH(h#5=4(S_!DYl)mJreUCXVV*&3N,i/Te8M(g8)1r>Vi:'BU;E+TgD!HD`V%V!Y1o+c4G\EemZNOW2D?,0Q(4FP-&`]PKa1W;Hi]F2bB_7XR%*T8PRs3_gSpR!JeC>C?h$[T!U+'J&.@iqsF*-$co?qaG6KeUO[$-+$"bULK#im%tj`L-K'.c@HL^C>2F'"Uj87h`WZesGW>q0h*CqE*$Lo2RPneetqZY&<adlo]3l.NXlq/Xt,cC8(Gsc</]\fB4`0Z`hYX3H&8pIWE9>W\'_#3Fq$>IWF,Vf)4Wf/C2WAr@MC%Ms\8A$A+rKWh?:?!!&'am/R(c!!r\CTR["B&c8d3!!!(X!!$BAm/R(c!!r\CTR["B&c8d3!!!(X!!$BAm/R(c!!r\CTR["B&c8d3!!!(X!!$BA"98B$CPii,s3pZI
rcs0 --- HW Status = 0x00000000 fffcf000
:c0UsF+92a''gMj0s8R:E*#o\k*Q^?r9dPsUg[?I<5=OjMV=3?-!!(Y7%Gu[9rr<$!7fWYks(;>+
rcs0 --- HW context = 0x00000000 fffa3000
:_<d\:!!!H/pRoA'#7kZI!!!!5!!$p+UAt5n67GnK4)=7tf4V2#S&t:GdK^IJ'O[\A0N!F\.)A@[@&!lB;1a`@_+"bc!!#84rr<$!5M+-Z**4(mjIbtXWXa,dW_+W`NcJmQh.]iR&HluP"9hUie8E5_)@]VS98[I(_o**lTZB556lr"Q%pEB2ON:TH*Y/t$_310H*=dfkT^ZJSBJLX+d$,Yl?Zbc&-iPV&p-\+h^#&_NLNN:j87,0J79?gA,iaeXOu%)0b%aRGBd0U;R1P_87X\@tY5*)qM,ro'#f[o`eKa!iRlKfMIClInruGGD0?A[[<Pm/;AV."eg)T'&C)Wj,6VEEdop>S`RgO97HUW*=mr=3Rf0AYu2_E)GT$'+VgCS3$Ac@\fLX-u9r9LO4^V+Dp>i`W?LrUT[`fJX@LV!EI0a8tU@#SNI@>lcPJM@BL)IZ<2@E"TiNZ>kT3"$d9KfG9I)8oY+=fubOPMg35]!p*4YP*,=IF6oQEHhp'72R6dBII?uN'W;n0I@q&@n\>.ageY]>E!uU^3QOLf]TfKo,?\0o7[iC6&D7B,8ahtP6,0+Ai.Al(4=9/pq)QkJRVsfrM*_JN3JU/EW42jQgH6,E*!or`1-^W@#`lKL2b0Cqj(H0f^niEb1/OP]iW37Zfes[CUJ'Z>/"5'I#m8.lT7.SGf(#:oJA3B^"LBJl#Q^(/[)[LFLna.C[oF$VgU,^'\OScL<CZYohA>Q4JWn(@#SNI@>lcPJM@BL)IZ<2@E"TiNZ>kT3"$d9KfG9I)8oY+9+GsK/dfU#E>K?t"`]k8p]nZc%<5G\gfO$tM7?'kr-.[l$1PD+G7A&"ERl#s\#<XMkM2ER/FQI'T*K>b7s>t^4Wg<jU<1+C8&g2<Xc5^/Ou#7$d$g>`LCq1(QkL$A0r>Z'`Z/-6(u`[N'7d@M3)bYn6(CB'KWsVY!oU.kQP"4<nF_'GWBo5+S_MFoJp=u\:em@gFiQiC;q_ido%1,?K=pJl4sN,qLOC:$1H&a(&8+jYq_)30[as.YGjO9kU,;_l628([H;dV7#35&3S&=+UEaABu^O'=5j4gu3N7b7RLhA,L*f]RagR^_*I$[@T?G+'ss*DLGg_WOHB@?5I2f&%Q]:6LsXqQfVf@K:=]n#!a4iZO1`*ojXq#BfSYs)NlU\Xr)q#?p-f)Bf'K7+o2GN[N/[H+7OP3L"3iV*AgrA)+:1Fgb"I=]Q%*N.)(=L@=rNgB!Xoeu>YI@/(4]Df(5gJ1MPRH;#@PBY&BhlZfPcA;Qa:"NAd.7N^n0ul;c3,[X'_Al.-39@3F@Tego?ZPJ&D&!DM'8tatUsMVrbt!RPT0'LEC,_KJi_1iYN\H21Fb[E,S9LFP?8n>/4uDhfismWY']Mjnei)P`:%JU/pnu(1l[`]8\/2:OpAD_?l_$JFja2t?H]<tq=2ZT?IGf&i!!!i6R-akQk7_Z]MSmIPkq!$@/LX"d>2:\`do;8`0`oKS)Qp*h.ghlLg<$$mJfAA1L_3NBW\dtMOX?6F!`DWoE?/o,D/Y&`ak5B<Up`P%f3]'5ioZar\&!q`9tg)84ur-c'(#MQWL4&t5I?*Gl1Vh<GJ3J*IHTL:lL]MmH/Ftf(iKd2f7S[m>T0Hbs!rP$r7S4LoK6`8#3>jH=1b);`+Rp9_r00q#,D"6SX[iip*B+9p65+'@t&LOnkUp%s',Kc>E/38($AemT17KU:IYj5\nZqd+bH.a]`X3J6@UnK\nC1HT2\10YYR5M?\s?Th*/:n*]O8,8^%G?'/9<R<j54JKnN`bb8MIGqSciXPY#@l\LRIMbg+E_?f?^+g,bWOgctY\em8T@ak49ukQK_oh+1Af"3cR!pXO8]bIctESTLeCe+uHD\Eu$-?5;$d&[[7R^`V(N87.f`88.&^M%OkFJ;QdGG-;$I>d6u3_6:Qg=#4X*5]R!Q[Rpi[FShnMe[]7N#H-pu+5W]FFn)FP2>8)LOV+m2gm(@[pmRo=0/6eK]V:mujI3J
=_OLAMF&uB0S`UjIQpVFbR8Ye\2mL!\T;POiVll5):_V4i!'Lq!V-hs>n-9]1SI?\BW2#sXY3#p8+!_m-SLJseo1fh6/qHOJin<hoJIDGd3e-8N2VKWYW#Ei.bmWZkaY>1(iZ^GuRre;Y>Bfm(7nMGjqr2.f$L1c^T7nM.DQ_5-Y.Lo!o47-'F"A/DKF,tG$ciGPJ)`2_-l-p4P(2..$aQMRVT5T\$k)HBi&AN^j:.Uj*nh5a9l"YB*A>lYNj%;XPtce!`CC]aQ0O0ga$tnX3LL15msofo"MW)uHM_LBE\nOe&ma9#3]RGlUK-YEN0]c<p!$)9:qVM^M_"9F)=Jo?^08l-U*?]UCJah@)baRbe)9rlYf#oigtTe[K$m,$Xpc9s@X@qGg3p*KcPZNC[)JmP<9?qo&*1%+q2Z^UWYS2p?a-[unOlAelJ(bU#4cp!fiN:lF"W6:j!(%85r.V"EZJ%#_=7mT&LRs7lG'MBUjZpNX`d$DRcrMXSmpU?*N&uF*MtHP,*^+-mC/f:5'HY!ME>H[+f1W$pKZ<JNZ6]41L#8ZVh>BtU=B5oH10F'VY$6:`U!pZiB$p6SW't!Eo%W^U3$?%7RTDhifc*Hj*8_:\L=WmHtu1Wn-Yt3r&l2jZVF[Zan*/JG;S'dVFVufp,*6ff%"%&nI@b`qj_5"SuGXE[=LI26CRIG-(\(FUA=T<"%?8q%1:P)2[&Y2^6DUiJVuXGr$I!DpolgPO+?P^1EtgV'=76N-j76o+!\CHU93*bb&2n"]DDi!Air2FE9#1XW"39l6u6!3K3\qY:+B3^L(?ROfq"7H-8q7i1CA.l8QH%7ODD!V/9]K"J1Va(;$$J+"fc=E'WN<O`/pZ4Z$H34W[m/OI(T_EYd6@cd!n._lbH$m;\-C#'h0>kREu#Ci<llH:_1IQYH@lFZ8G6=U0k]%R:)9`lDop_Uk_P(9_<rHC^bbd$CB(M/m!pYer/5YSh^cuK=OnJMPQ#c+WEY+PW5`1jUEVhg4n$jX1UE\RDV-I<i.2<TA'W4VoICierMi>c$7(i"r_W?feM@g81l]hOBGFY1=]6,5FNe8F2uaB<>CLN6C$A]e9%6u%bMso-jEdZ5!P@t_2XRrn[e$,p*kk).1\rr2]NaIlr&l0S.KVJOtQq`OV+<'S+N+_WmK#G\-2,.qEVDYKZr.ch"VgL>hOd.?q'9e6.pNQqD5PhDCsDR6Mc>R5M&<$cec3!76^BR1=;3nF8p>$m5-93q"7du#+]X0/7#/gmIooUR9B<Fh9uG:4aT<CJL8N7JUW>/*(m%aD<oiXjY4&=G3oMJSgIJH?)#(R#S-@q(#K:%'7jX&o.jr_"9&-OET%LO8@.-h)Km!:+FO-a14b@Q$>q[<3*A&f[MBXoCOk^]&j_pf1k-0\(ptKd5N^$aO6.l\7o2?Jg=pk2HWL8.22RoQa9)u_bT-!'p1sc\NP@7a]lpZBHB<RJ<"IJfB\[WMJ%JZh?hp?Sg&rOTG$VQ?@Xg3O2Yh5He[2Z5Xn_F)CZ8/c'%Uj_QtNf2T/rNS6E5O%BC2/;+@D`52,kt!/YMOQ&W<-!<96HW]mBTs@Q@0+B)P^bYBh\5Nh-M72*W=7J=HF(a&I7a$j);=YVH&c]u#W9HAK)(iqR-dOs=,pOo(]^a#:mFi0B.>BY0@kD]K&^fCg/LZ5XB7VcE?-DFNoW_r_2&aOY6mf@pKiZZtAP^YP7up\fA"'2o&3S`$#CE(MYg-MN*p8&6pi_*!)Y.fKb!rX#^C#p$i/CA90BHkV9e.0[ON[^<clITP5LYK$H.Qeh[Tk(W?u)jngn\_$XMlYa0+F`78fS&3<<8fY7q;fFlk176ak-E"sMr^k*),ekVMnk1XXQ2)d$,l[d7?]ngs:*8Pfnk).;p%]koP%/r45KPa3!<7s@s8N'!:GLVjKc!Y'6sqP(MMd:j!shMV!s8W-$jd"<e2eVK!s>6M*=dil+CK:P6%D2)9a:n:.$siOq#c!2Q@$)add<q[P=^5
X(>W3'(#9JX-b:LMRIjJTJ`si3Om&M&F6KmmFk[_[a4L$`1hr9<*L]Qa#.s_3R/#t_nIJS[*0u(ddLT4;W#+_O9*_DBJ8s_$G(\9jN-IjKn4rJ!%oCRdi*JG']YWkZ6+h!MGdjV7a&r(k:Fm%$hjjQ5058b`('9fT<\hP>X>Pm>fNT#!Z$$Vsi#I0#UM.[RcqZ3Gd*=)XN'dTc/igp*A<fG7(5T1/<J%KARiPq=FuI4ilog923(Eu/F<);i:*,",#!Rc]0na!Q$%[*A=-m&Bb`Bel:\mkjNV.&OlQ#SC`5LR1`5KU4jM]!T!"8A4m/R(c!!r\CTR["B&c8d3!!!(X!!$BAm/R(c!!r\CTR["B&c8d3!!!(X!!$BA)ZTg;'O[\)0N!F\.)A@["nVs8f$O#]#?DCh8;)=gTF$7YaNA.fKY^Po(b"oI,6.iMqIo,G"(VQQdBa$H!!NJsrr<$!fA5fn(i`@;r9*Uc&;9I!58aha+V^&*U(fqY'OQZThY4@^^$g;0)iDl6pQb2o6nWi`:Hu!-S/:k]k3#[lhrgg6l)KXhht'G:oX.,OC&+S'&-9In5Q8?!jB:4%*"*5kD_*<&\?Odkd_fp1$OrAU/2butYglik_bFMe%2FV\1Y*HPbueU0/Rb$NQ%i2Seeni(=CWm:;2DHqL(a&$0TQq$..OR08id32.8>c<Ql/AXEf2BOaAF$-76ji)Ri-fWGnR'u9oRq[]SP'[VT*OBG+iXt3j?`e8_+iZ.C"u/KJWctBL\I/@k]6g.DDTf9oQoWCGH`JN1bY$.n$i*NG5Yb1CIkS*mj:4P#!8B#jt3NZ'`_\kg"kG>rFJ?XbI;47s*D.)2.Vm%Hq,fChC[sqCeN<YDV,O_?L!h$m<fRQ/lpS)AjX((lsJMiJ&2<X+_-&UCo:=0Ol"T4tu"4@XH-&&n30hV@$f[A#:/R(1Lk[jpkm5A*+hA&RlLVQ3h%:A!S!A'k12IedZ+iiX?i:C^.q``b[!?2T9JJljh'P`p>=r4H8\K#9URgE/,!D'Q?PZ-QI*GE2O=f(3!n)(ERPYE0h/U'l[4l2]Zkh-M,5bmPa$A$B-Uq(k'<ZCs"YDo2s\-,K)\H4;5^sa#l/s/&](.]Gnl'a1OLQ-cCB;Hl'eM.'U2"lH.*Cp5o6NYl1@bpjUXEBglRW&H/Chh[eIk)Ioc"#TuA421\8LW7(XK@0#.U_Z=#N=FFP3pd<tedmN(j6f<@E_r4Ge,c9qY!"Q-qs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'!#m%p0J,fQLqlg$m!!8#i!+3MAs8N'![Sm/.s8O;g
rcs0 --- WA context = 0x00000000 ffffe000
:bj:jE+Fjp+?f?YP$5u'Z'OaI(!d:e<"*%C7"9NUF<q_Jg\M"]lecG2D#AM3iV9>CSk%16=3"9;0ItnHhnZr"/$='W)TklZqIU%5s&JNLE&J5Te`VC)Xrr<$&mJm@hs76X(
bcs0 --- HW Status = 0x00000000 fffc8000
:_<d\:!!!H/pRoA'#7kZI!!!!5!!$p+"98B$!!!Q1s8W*"
vcs0 --- HW Status = 0x00000000 fffc1000
:_<d\:!!!H/pRoA'#7kZI!!!!5!!$p+"98B$!!!Q1s8W*"
vecs0 --- HW Status = 0x00000000 fffba000
:_<d\:!!!H/pRoA'#7kZI!!!!5!!$p+"98B$!!!Q1!!!!"
Num Pipes: 3
Pipe [0]:
  Power: on
  SRC: 0d6f059f
  STAT: 00000000
Plane [0]:
  CNTR: c4802000
  STRIDE: 000000d8
  SURF: 02640000
  TILEOFF: 00000000
Cursor [0]:
  CNTR: 00000000
  POS: 00000000
  BASE: 00000000
Pipe [1]:
  Power: on
  SRC: 00000000
  STAT: 00000000
Plane [1]:
  CNTR: 00000000
  STRIDE: 00000000
  SURF: 00000000
  TILEOFF: 00000000
Cursor [1]:
  CNTR: 00000000
  POS: 00000000
  BASE: 00000000
Pipe [2]:
  Power: on
  SRC: 00000000
  STAT: 00000000
Plane [2]:
  CNTR: 00000000
  STRIDE: 00000000
  SURF: 00000000
  TILEOFF: 00000000
Cursor [2]:
  CNTR: 00000000
  POS: 00000000
  BASE: 00000000
CPU transcoder: A
  Power: on
  CONF: c0000000
  HTOTAL: 0e0f0d6f
  HBLANK: 0e0f0d6f
  HSYNC: 0dbf0d9f
  VTOTAL: 05c1059f
  VBLANK: 05c1059f
  VSYNC: 05ac05a2
CPU transcoder: B
  Power: on
  CONF: 00000000
  HTOTAL: 00000000
  HBLANK: 00000000
  HSYNC: 00000000
  VTOTAL: 00000000
  VBLANK: 00000000
  VSYNC: 00000000
CPU transcoder: C
  Power: on
  CONF: 00000000
  HTOTAL: 00000000
  HBLANK: 00000000
  HSYNC: 00000000
  VTOTAL: 00000000
  VBLANK: 00000000
  VSYNC: 00000000
CPU transcoder: EDP
  Power: on
  CONF: 00000000
  HTOTAL: 00000000
  HBLANK: 00000000
  HSYNC: 00000000
  VTOTAL: 00000000
  VBLANK: 00000000
  VSYNC: 00000000
is_mobile: no
is_lp: no
is_alpha_support: no
has_64bit_reloc: yes
has_aliasing_ppgtt: yes
has_csr: yes
has_ddi: yes
has_dp_mst: yes
has_reset_engine: yes
has_fbc: yes
has_fpga_dbg: yes
has_full_ppgtt: yes
has_full_48bit_ppgtt: yes
has_gmch_display: no
has_guc: yes
has_guc_ct: no
has_hotplug: yes
has_l3_dpf: no
has_llc: yes
has_logical_ring_contexts: yes
has_logical_ring_preemption: yes
has_overlay: no
has_pooled_eu: no
has_psr: yes
has_rc6: yes
has_rc6p: no
has_resource_streamer: yes
has_runtime_pm: yes
has_snoop: no
unfenced_needs_alignment: no
cursor_needs_physical: no
hws_needs_physical: no
overlay_needs_physical: no
supports_tv: no
has_ipc: yes
i915.vbt_firmware=(null)
i915.modeset=-1
i915.panel_ignore_lid=1
i915.semaphores=0
i915.lvds_channel_mode=0
i915.panel_use_ssc=-1
i915.vbt_sdvo_panel_type=-1
i915.enable_rc6=1
i915.enable_dc=-1
i915.enable_fbc=1
i915.enable_ppgtt=3
i915.enable_execlists=1
i915.enable_psr=0
i915.disable_power_well=1
i915.enable_ips=1
i915.invert_brightness=0
i915.enable_guc_loading=0
i915.enable_guc_submission=0
i915.guc_log_level=-1
i915.guc_firmware_path=(null)
i915.huc_firmware_path=(null)
i915.mmio_debug=0
i915.edp_vswing=0
i915.reset=2
i915.inject_load_failure=0
i915.alpha_support=no
i915.enable_cmd_parser=yes
i915.enable_hangcheck=yes
i915.fastboot=no
i915.prefault_disable=no
i915.load_detect_test=no
i915.force_reset_modeset_test=no
i915.error_capture=yes
i915.disable_display=no
i915.verbose_state_checks=yes
i915.nuclear_pageflip=no
i915.enable_dp_mst=yes
i915.enable_dpcd_backlight=no
i915.enable_gvt=no
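The per-engine sections of the error state above can be summarised with a short script. The following is a minimal Python sketch, not part of the original report: the function names `summarize_engines` and `stalled_engines` are hypothetical, and the parser assumes the layout seen in this dump (an unindented `<name> command stream:` header followed by indented `field: value` lines).

```python
import re

# Short excerpt copied from the dump above, used only to demonstrate the parser.
EXCERPT = """\
render command stream:
  seqno: 0x0000000d
  last_seqno: 0x0000000e
  hangcheck stall: yes
  hangcheck action: dead
  hangcheck action timestamp: 4389390496, 1791864 ms ago
blt command stream:
  seqno: 0x00000000
  last_seqno: 0x00000000
  hangcheck stall: no
  hangcheck action: idle
Active (rcs0) [9]:
"""

# Fields of interest; the trailing ':' keeps "hangcheck action timestamp" out.
FIELD_RE = re.compile(
    r"^\s+(seqno|last_seqno|hangcheck stall|hangcheck action):\s*(\S+)"
)
HEADER_RE = re.compile(r"^(\w+) command stream:\s*$")


def summarize_engines(error_text):
    """Collect seqno and hangcheck fields for each '<name> command stream:' section."""
    engines = {}
    current = None
    for line in error_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = engines.setdefault(header.group(1), {})
            continue
        if current is None:
            continue
        if not line.startswith("  "):
            current = None  # an unindented line ends the engine section
            continue
        field = FIELD_RE.match(line)
        if field:
            current[field.group(1)] = field.group(2)
    return engines


def stalled_engines(engines):
    """Return the names of engines that report 'hangcheck stall: yes'."""
    return [name for name, fields in engines.items()
            if fields.get("hangcheck stall") == "yes"]


if __name__ == "__main__":
    summary = summarize_engines(EXCERPT)
    for name, fields in summary.items():
        print(name, fields)
    print("stalled:", stalled_engines(summary))
```

Run against the full output of `/sys/class/drm/card0/error`, this would list the render, blt, bsd and vebox sections; in the dump above only the render section reports `hangcheck stall: yes`.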
JacekDanecki commented 5 years ago

Which version of LuxMark are you using? Have you tried disabling the GPU watchdog (hangcheck) to check whether this is a real GPU hang or whether the GPU workload is simply too big? To disable the GPU watchdog, run the command below as root:

echo N > /sys/module/i915/parameters/enable_hangcheck
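If the watchdog should only be off for the duration of one test run, the previous value can be saved and written back afterwards. Below is a minimal Python sketch of that idea, not something from this thread: the helper name `hangcheck_disabled` is hypothetical, the path is the one from the command above, and writing to it requires root. The demo at the bottom uses a temporary file as a stand-in so it can be run without an i915 device.

```python
import contextlib
import os
import tempfile

# Path taken from the command above; writing to it requires root.
HANGCHECK_PARAM = "/sys/module/i915/parameters/enable_hangcheck"


@contextlib.contextmanager
def hangcheck_disabled(param_path=HANGCHECK_PARAM):
    """Write N to the parameter file, then restore the previous value on exit."""
    with open(param_path) as f:
        previous = f.read().strip()
    with open(param_path, "w") as f:
        f.write("N")
    try:
        yield previous
    finally:
        # Restore even if the workload inside the 'with' block raises.
        with open(param_path, "w") as f:
            f.write(previous)


if __name__ == "__main__":
    # Demo against a temporary file standing in for the sysfs parameter.
    with tempfile.TemporaryDirectory() as tmp:
        fake = os.path.join(tmp, "enable_hangcheck")
        with open(fake, "w") as f:
            f.write("Y\n")
        with hangcheck_disabled(fake) as before:
            with open(fake) as f:
                print("previous:", before, "during:", f.read())
        with open(fake) as f:
            print("after:", f.read())
```

On a real system the benchmark would be launched inside the `with hangcheck_disabled():` block, run as root.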
isgursoy commented 5 years ago
JacekDanecki commented 5 years ago

What scene are you using in luxmark? What version of OpenCV are you using? Can you provide steps to reproduce gpu hang in OpenCV? Are you using Xorg/Wayland? What desktop environment/window manager?

isgursoy commented 5 years ago

Hi,

JacekDanecki commented 5 years ago

Can you compare the BIOS settings on these two machines, especially the power management settings and GPU configuration? Are the same kernel parameters set on the kernel command line (GRUB)? Do you have the same power management packages installed on Ubuntu 16.04 and 18.04? Can you also compare, on both machines during the LuxMark test (before the hang), the GPU frequency information in /sys/kernel/debug/dri/0/i915_frequency_info, read as root (assuming the Intel GPU is DRM device 0)? For example:

Lowest (RPN) frequency: 350MHz
Nominal (RP1) frequency: 350MHz
Max non-overclocked (RP0) frequency: 1100MHz
Max overclocked frequency: 1100MHz
Current freq: 1100 MHz
Actual freq: 1100 MHz
Idle freq: 350 MHz
Min freq: 350 MHz
Boost freq: 1100 MHz
Max freq: 1100 MHz
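To compare the two machines line by line, the frequency lines can be parsed into a dictionary and diffed. This is a minimal Python sketch under stated assumptions: the function names `parse_frequency_info` and `diff_frequency_info` are hypothetical, only lines of the form `label: <number> MHz` are read (anything else in the file is ignored), and the second set of values in the demo is made up purely to show what a difference looks like.

```python
import re

# The example lines quoted above.
SAMPLE = """\
Lowest (RPN) frequency: 350MHz
Nominal (RP1) frequency: 350MHz
Max non-overclocked (RP0) frequency: 1100MHz
Max overclocked frequency: 1100MHz
Current freq: 1100 MHz
Actual freq: 1100 MHz
Idle freq: 350 MHz
Min freq: 350 MHz
Boost freq: 1100 MHz
Max freq: 1100 MHz
"""

LINE_RE = re.compile(r"\s*(.+?):\s*(\d+)\s*MHz\s*$")


def parse_frequency_info(text):
    """Map each 'label: <number> MHz' line to an integer value in MHz."""
    freqs = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            freqs[match.group(1)] = int(match.group(2))
    return freqs


def diff_frequency_info(a, b):
    """Return {label: (value_a, value_b)} for labels whose values differ."""
    labels = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in labels if a.get(k) != b.get(k)}


if __name__ == "__main__":
    machine_a = parse_frequency_info(SAMPLE)
    # Hypothetical second machine: same values except a lower actual frequency.
    machine_b = dict(machine_a)
    machine_b["Actual freq"] = 350
    print(diff_frequency_info(machine_a, machine_b))
```

In practice each machine's text would come from reading `/sys/kernel/debug/dri/0/i915_frequency_info` as root while LuxMark is running.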

Have you tried to run clpeak?

As far as I can see, there is also an Nvidia card in this machine. Have you configured it as well? Do you have the nvidia or nouveau kernel modules installed or loaded? Have you enabled OpenCL for this card? What is the output of these commands:

sudo lspci -v
ls -l /dev/dri/by-path
clinfo
MichalMrozek commented 5 years ago

Closing due to inactivity; feel free to reopen.

isgursoy commented 5 years ago

I have not had time to check yet. I will be back here with the outputs soon. Thanks.