Closed jsmall-zzz closed 7 years ago
If I comment out the FMA detection line it crashes out on
vpavgb
Which require AVX512VL + AVX512BW which also don't seem available on this chip. If I disable the AVX feature being able cult completes without crash (but of course without AVX instructions).
Both should be fixed, thanks for the report!
AsmDB commit: https://github.com/asmjit/asmdb/commit/57809f6cba5b03e4a8e057a08cae5ea8d250bc4d
AsmJit commit: https://github.com/asmjit/asmjit/commit/3864b255e97d04e4ee54a3e315440266ef77f837
Let me know if it fixes your issues.
And thanks for testing it out. You can view instruction database and some preliminary results here:
Just tested latest and can confirm the problem is fixed.
https://kobalicek.com/asmgrid/ looks very handy.
From a usability point of view I find myself constantly looking to the A/B listing on the left. Perhaps having the a shortened processor name as a column heading spanning two sub column headings for Lat/Rcp underneath that heading might help with that.
I've run cult now on a few machines - might be nice to have a way to upload, and/or have an option within cult itself to upload when profile is completed automatically.
I've run cult now on a few machines - might be nice to have a way to upload, and/or have an option within cult itself to upload when profile is completed automatically.
There should be a copyright waiver as @kobalicek is planning to make the profiles public domain (as explained in the TODO section of the cult README).
@jsmall-nvidia I would love to have group header in the table header, but unfortunately the component I'm using (SlickGrid) doesn't provide that. Until I find a good replacement I would just keep it this way.
If you have interesting outputs I can add it to the view, but I wanted to improve cult
a bit to handle more instructions before I start collecting the data.
Anyway thanks for letting me know about these, I'm definitely considering all of the requests except automatic uploads of data, this I will keep as is.
@jsmall-nvidia I found a plugin for SlickGrid that provides the grouping, the app now groups by CPU.
First - I love this project :) Having a way to find the perf characteristics for a system by running a tool, as opposed to looking up in a doc, and having to trust what's there is great.
I think there may be a problem within the CPU detection code. I'm testing on i7 3820. If I run cult it crashes on vfmadd132pd instruction.
If I look up that instruction on intel intrinsics website - it says it needs the FMA flag. If I lookup that CPU on cpuboss it claims that processor does not have FMA3.
The detection code detects FMA (presumably FMA3) is available.
If I look at this CPU id code https://github.com/klauspost/cpuid/blob/master/cpuid.go
The identification code is...
if c&(1<<26) != 0 && c&(1<<27) != 0 && c&(1<<28) != 0 { // Check for OS support eax, _ := xgetbv(0) if (eax & 0x6) == 0x6 { rval |= AVX if (c & 0x00001000) != 0 { rval |= FMA3 } } }
The code in cult is...
so perhaps the line
if (regs.ecx & 0x00004000U)
should be
if (regs.ecx & 0x00001000U)