Closed by trquinn 9 years ago
Original date: 2015-08-12 02:49:45
Bilge, please judge whether it's related to the broadcast change, and if not, reassign to the ChaNGa group.
Original date: 2015-08-12 14:21:40
It could be related to my recent broadcast change. (b1db2f25a931534c555aadd14740f8f6831bf9ae)
Tom, can you tell me the parameters you're using for ChaNGa when you get this crash? I'll try to reproduce the issue.
Original date: 2015-08-12 19:02:13
I used "git bisect" to figure out which commit caused the problem. It tells me:
login3.stampede(17)$ git bisect bad
b1db2f25a931534c555aadd14740f8f6831bf9ae is the first bad commit
To reproduce on Stampede:
1) In charm, "./build ChaNGa verbs-linux-x86_64 smp -j4 -O2"
2) In ChaNGa, "./configure --enable-bigkeys; make"
3) Run ChaNGa with the attached job script, and param file.
The data file is too large to attach, but can be found at: ftp://ftp-hpcc.astro.washington.edu/pub/hpcc/bench/hrwh_sbc_gas.tbin
The files are also on Stampede in the directory /home1/00333/tg456090/work/hrwh_sbc_g
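For anyone repeating the bisection, it could be set up roughly as below. This is only a sketch: it assumes the endpoints are the two charm versions named in the issue description (the abbreviated hashes come from their `git describe` strings), and `bisect_charm` is a hypothetical helper name, not something in the charm tree.

```shell
# Abbreviated hashes from the version strings in the issue description
# (assumed endpoints; the thread does not say which ones were actually used).
GOOD=e8e17df   # v6.6.0-258-ge8e17df, ChaNGa run works
BAD=80fea48    # v6.6.0-317-g80fea48, ChaNGa run crashes

# Hypothetical helper: call it from inside a charm checkout.
bisect_charm() {
    git bisect start
    git bisect bad "$BAD"
    git bisect good "$GOOD"
    # git now checks out a commit between the two endpoints.
    # At each step: rebuild charm and ChaNGa, rerun the job, then mark it:
    #   git bisect good    (run completes)
    #   git bisect bad     (run hits the fatal error)
    # Once git prints "<hash> is the first bad commit", finish with:
    #   git bisect reset
}
```

Each step needs a full rebuild and rerun, so the number of builds grows with the logarithm of the number of commits between the two endpoints.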
Original date: 2015-08-17 19:59:45
Thanks Tom, so it's my recent change that is causing the problem. I have reproduced the crash and am working on a fix.
Original date: 2015-09-10 22:01:07
Any progress on this? Having charm broken on the ibverbs platform is not good.
Original date: 2015-09-11 13:35:12
I'm going to look into this after my paper deadline this weekend.
Original date: 2015-09-14 17:56:25
The fix is implemented here:
https://charm.cs.illinois.edu/gerrit/#/c/827/
https://github.com/UIUC-PPL/charm/commit/7f5f80087cdd1e4fb075e274bfe629357cbf9364
I've tested it and ChaNGa works fine now. Tom, can you test it as well please?
Original date: 2015-09-17 19:48:15
The fix is merged.
Original date: 2015-11-11 03:05:23
Set status back to Merged so that we can distinguish whether the code was changed in some way, or the fix was elsewhere. Both are non-open states from Redmine's perspective.
Original issue: https://charm.cs.illinois.edu/redmine/issues/803
The latest charm version (v6.6.0-317-g80fea48) crashes a ChaNGa run on Stampede with:
Fatal error on PE 502> packet in the middle does not have expected length
An earlier version (v6.6.0-258-ge8e17df) works.