JuliaPerf / LIKWID.jl

Julia wrapper for the performance monitoring and benchmarking suite LIKWID.
https://juliaperf.github.io/LIKWID.jl/dev/
MIT License
58 stars 2 forks source link

Hidden version requirements to liblikwid #56

Open jonas-schulze opened 10 months ago

jonas-schulze commented 10 months ago

I just saw your talk from JuliaCon 2023 that also mentioned LIKWID.jl and I was eager to try it out. :rocket:

Unfortunately, trying the very first tutorial, I got a segfault. After some digging, I found the culprit: the definitions of CpuTopology between LIKWID.jl and the underlying liblikwid differed. My likwid.h version 5.1 (installed via apt) defines the CpuTopology struct without the numDies field, while LIKWID.jl version 0.4.4 defines its CpuTopology struct having numDies in position 4. Consequently, the fields are shifted when transferring the data from C to Julia: Julia's numDies contains C's numCoresPerSocket, ..., numCacheLevels contains the value of threadPool, which is a pointer. Wrapping cacheLevels (C's topologyTree) into an array of length numCacheLevels (C's threadPool) leads to the malformed memory accesses. :boom:

likwid.h: https://github.com/RRZE-HPC/likwid/blob/v5.1/src/includes/likwid.h#L370-L380

typedef struct {
    uint32_t numHWThreads; /*!< \brief Amount of active HW threads in the system (e.g. in cpuset) */
    uint32_t activeHWThreads; /*!< \brief Amount of HW threads in the system and length of \a threadPool */
    uint32_t numSockets; /*!< \brief Amount of CPU sockets/packages in the system */
    uint32_t numCoresPerSocket; /*!< \brief Amount of physical cores in one CPU socket/package */
    uint32_t numThreadsPerCore; /*!< \brief Amount of HW threads in one physical CPU core */
    uint32_t numCacheLevels; /*!< \brief Amount of caches for each HW thread and length of \a cacheLevels */
    HWThread* threadPool; /*!< \brief List of all HW thread descriptions */
    CacheLevel*  cacheLevels; /*!< \brief List of all caches in the hierarchy */
    struct treeNode* topologyTree; /*!< \brief Anchor for a tree structure describing the system topology */
} CpuTopology;

Liblikwid.jl: https://github.com/JuliaPerf/LIKWID.jl/blob/v0.4.4/src/LibLikwid.jl#L640-L651

struct CpuTopology
    numHWThreads::UInt32
    activeHWThreads::UInt32
    numSockets::UInt32
    numDies::UInt32
    numCoresPerSocket::UInt32
    numThreadsPerCore::UInt32
    numCacheLevels::UInt32
    threadPool::Ptr{HWThread}
    cacheLevels::Ptr{CacheLevel}
    topologyTree::Ptr{treeNode}
end

The numDies field has been added in https://github.com/RRZE-HPC/likwid/commit/a0ac14d2619222665df6afa7491a5958ff34f03e, which according to GitHub shipped in likwid version 5.2 and onwards. Moving the numDies field to the end of the C struct is not an option, I guess, and removing numDies from LIKWID.jl as its not used anyways. Therefore, I would recommend:

jonas-schulze commented 10 months ago

Follow-up question: are the .init_fileTopology field and the mentioned static void initTopologyFile(FILE* file); function dead code?

carstenbauer commented 10 months ago

I'm sorry that you had such a bad first experience. I made the choice to only support LIKWID >= 5.2 but didn't document it properly. My bad.

  • Document the version requirements for liblikwid
  • Let LIKWID.jl fail gracefully, if it recognizes an unsupported version of liblikwid

Agreed 👍

  • Let LIKWID.jl bundle its own liblikwid

That'd be great but it's difficult because LIKWID itself isn't portable yet. See https://github.com/JuliaPackaging/Yggdrasil/pull/4913 and the links over there.

TomTheBear commented 10 months ago

Follow-up question: are the .init_fileTopology field and the mentioned static void initTopologyFile(FILE* file); function dead code?

Hi, this is not dead code. LIKWID provides likwid-getTopoCfg to create a topology file than is then read in with initTopologyFile(). Extremely helpful on systems where the topology lookup takes minutes like HPE/SGI UltraViolet (>1000 CPU sockets).

For the CUPTI stuff in LIKWID, I use versioned structs based on the CUDA/CUPTI version.

It can be expected that with every x.y.0 release there will be changes in the API, so with 5.2.0 or 5.3.0 new struct members and functions were added.