cosmoss-jigu / memtis

Tiered memory management

BUG: soft lockup - CPU#64 stuck for 187s! [rg:4628] #2

Open luckyq opened 9 months ago

luckyq commented 9 months ago

Hi,

I ran into a problem after installing memtis. After I booted the system and configured the persistent memory, the following bug appeared.

`Message from syslogd@optane03 at Nov 1 20:48:52 ... kernel:[ 1573.073360] watchdog: BUG: soft lockup - CPU#14 stuck for 250s! [rg:4625]

Message from syslogd@optane03 at Nov 1 20:48:52 ... kernel:[ 1573.489358] watchdog: BUG: soft lockup - CPU#56 stuck for 250s! [rg:4630]

Message from syslogd@optane03 at Nov 1 20:48:52 ... kernel:[ 1573.509359] watchdog: BUG: soft lockup - CPU#62 stuck for 250s! [rg:4617]

Message from syslogd@optane03 at Nov 1 20:48:52 ... kernel:[ 1573.517359] watchdog: BUG: soft lockup - CPU#64 stuck for 250s! [rg:4628]

Message from syslogd@optane03 at Nov 1 20:48:52 ... kernel:[ 1573.529358] watchdog: BUG: soft lockup - CPU#67 stuck for 250s! [rg:4626]

Message from syslogd@optane03 at Nov 1 20:48:52 ... kernel:[ 1573.537358] watchdog: BUG: soft lockup - CPU#70 stuck for 250s! [rg:4621]

Message from syslogd@optane03 at Nov 1 20:49:04 ... kernel:[ 1585.297310] watchdog: BUG: soft lockup - CPU#30 stuck for 261s! [migration/30:195]

Message from syslogd@optane03 at Nov 1 20:49:04 ... kernel:[ 1585.457309] watchdog: BUG: soft lockup - CPU#47 stuck for 261s! [migration/47:297]

Message from syslogd@optane03 at Nov 1 20:49:16 ... kernel:[ 1597.305260] watchdog: BUG: soft lockup - CPU#31 stuck for 257s! [migration/31:201]`

These messages keep printing to the terminal.
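For anyone triaging a similar log, here is a minimal sketch (not part of memtis) that tallies which CPUs and which tasks appear in the soft lockup messages. The regex assumes the exact watchdog message format shown in the log above, and `summarize_lockups` is a hypothetical helper name chosen for illustration.

```python
import re
from collections import Counter

# Matches the watchdog format seen in the log above, e.g.
# "watchdog: BUG: soft lockup - CPU#14 stuck for 250s! [rg:4625]"
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]:]+):(?P<pid>\d+)\]"
)

def summarize_lockups(text):
    """Return (sorted list of stuck CPU ids, Counter of task names)."""
    cpus = set()
    tasks = Counter()
    for m in LOCKUP_RE.finditer(text):
        cpus.add(int(m.group("cpu")))
        # Drop the per-CPU suffix so "migration/30" counts as "migration".
        tasks[m.group("task").split("/")[0]] += 1
    return sorted(cpus), tasks

if __name__ == "__main__":
    sample = (
        "kernel:[ 1573.073360] watchdog: BUG: soft lockup - CPU#14 stuck for 250s! [rg:4625]\n"
        "kernel:[ 1585.297310] watchdog: BUG: soft lockup - CPU#30 stuck for 261s! [migration/30:195]\n"
    )
    cpus, tasks = summarize_lockups(sample)
    print("stuck CPUs:", cpus)
    print("tasks:", dict(tasks))
```

Applied to the full log above, this groups the reports into `rg` processes on six CPUs and `migration` kernel threads on three, which can make it easier to compare against other reports of the same problem.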

luckyq commented 9 months ago

[Image: Screenshot 2023-11-01 at 20 51 02]

This is the dmesg report.

Tmichailidis commented 7 months ago

I ran into the exact same problem. Is there a fix for this? @luckyq @multics69 @skmonga @madhavakrishnan @taehyung-lee

m8 commented 6 months ago

I'm also getting the same error.

luckyq commented 4 months ago

> I run into the exact same problem. Is there a fix for this? @luckyq @multics69 @skmonga @madhavakrishnan @taehyung-lee

Hmm, don't use the Remote-SSH extension in VS Code or similar extensions. Make sure to kill all other processes.

DanielLee343 commented 2 weeks ago

I also hit this bug when setting the DRAM size within certain ranges. Worse, the kernel keeps spinning, which somehow prevents ssh (and other user-space processes) from being launched. Only rebooting fixes it, and only temporarily. Also see #9.