Maratyszcza / NNPACK

Acceleration package for neural networks on multi-core CPUs
BSD 2-Clause "Simplified" License

Cache detection on arm/arm64 #183

Closed Danliran closed 4 years ago

Danliran commented 4 years ago

Hi NNPACK team,

The init_hwinfo function detects the hardware cache parameters, but on ARM/ARM64 platforms those parameters are hard-coded in the function. I think we should detect this information from the system or from cpuinfo instead.

#if !(CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64) || defined(__ANDROID__)
static void init_static_hwinfo(void) {
    nnp_hwinfo.cache.l1 = (struct cache_info) {
        .size = 16 * 1024,
        .associativity = 4,
        .threads = 1,
        .inclusive = true,
    };
    nnp_hwinfo.cache.l2 = (struct cache_info) {
        .size = 128 * 1024,
        .associativity = 4,
        .threads = 1,
        .inclusive = true,
    };
    nnp_hwinfo.cache.l3 = (struct cache_info) {
        .size = 2 * 1024 * 1024,
        .associativity = 8,
        .threads = 1,
        .inclusive = true,
    };
}
#endif
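
For illustration, one way to "detect this information from the system" on Linux is to read the per-CPU cache attributes that the kernel exposes under sysfs. The sketch below is not part of NNPACK: the helper names `parse_cache_size`, `read_cache_attribute`, and `detect_cache_size` are hypothetical, and it assumes the kernel populates `/sys/devices/system/cpu/cpu0/cache/`, which is not guaranteed on every ARM board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a sysfs cache size string such as "32K" or "2M"
 * into bytes. Returns 0 if the text does not start with a number. */
static size_t parse_cache_size(const char *text) {
    assert(text != NULL);
    char *end = NULL;
    unsigned long value = strtoul(text, &end, 10);
    if (end == text) {
        return 0; /* no digits found */
    }
    switch (*end) {
        case 'K': return (size_t) value * 1024;
        case 'M': return (size_t) value * 1024 * 1024;
        default:  return (size_t) value; /* plain byte count */
    }
}

/* Hypothetical helper: read one attribute ("level", "type", "size", ...) of
 * cache entry `index` for CPU 0. Returns false if the file is missing. */
static bool read_cache_attribute(unsigned index, const char *name,
                                 char *buffer, size_t buffer_size) {
    char path[128];
    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu0/cache/index%u/%s", index, name);
    FILE *file = fopen(path, "r");
    if (file == NULL) {
        return false;
    }
    bool ok = fgets(buffer, (int) buffer_size, file) != NULL;
    fclose(file);
    if (ok) {
        buffer[strcspn(buffer, "\n")] = '\0'; /* strip trailing newline */
    }
    return ok;
}

/* Hypothetical helper: size in bytes of the data or unified cache at `level`
 * as seen by CPU 0, or 0 if the kernel does not report such a cache. */
static size_t detect_cache_size(unsigned level) {
    char text[32];
    for (unsigned index = 0; index < 8; index++) {
        if (!read_cache_attribute(index, "level", text, sizeof(text))) {
            break; /* no more cache entries */
        }
        if (strtoul(text, NULL, 10) != level) {
            continue;
        }
        if (!read_cache_attribute(index, "type", text, sizeof(text)) ||
            strcmp(text, "Instruction") == 0) {
            continue; /* skip instruction caches */
        }
        if (read_cache_attribute(index, "size", text, sizeof(text))) {
            return parse_cache_size(text);
        }
    }
    return 0;
}
```

A caller would treat a return value of 0 from `detect_cache_size(3)` as "no L3 reported" and keep the static values in that case. Associativity could be read the same way from the `ways_of_associativity` attribute.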

Danliran commented 4 years ago

@Maratyszcza Should we implement a function that detects hwinfo on ARM Linux? Some vendors design their own CPUs based on ARM cores, so the hard-coded values may not match them.

`static void init_hwinfo(void) {
#if (CPUINFO_ARCH_X86 || CPUINFO_ARCH_X86_64) && !defined(__ANDROID__)
    init_x86_hwinfo();
#elif !CPUINFO_ARCH_X86 && !CPUINFO_ARCH_X86_64 && defined(__APPLE__)
    init_static_ios_hwinfo();
#elif CPUINFO_ARCH_ARM || CPUINFO_ARCH_ARM64
    /* proposed: detect cache parameters on ARM Linux */
    init_arm_linux_hwinfo();
#else
    init_static_hwinfo();
#endif
    ............
}`

Maratyszcza commented 4 years ago

NNPACK assumes a three-level cache hierarchy, but many ARM CPUs have only two levels of cache. Thus, adapting NNPACK to use the actual cache parameters is not straightforward, and as I don't actively work on NNPACK anymore, there are no plans to introduce two-level cache blocking.
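
To make the constraint concrete: because the blocking code expects three cache levels, any detection routine would need a policy for CPUs that report only two. The sketch below shows one conservative policy, which is to use detected values only when all three levels are present and otherwise keep the static defaults quoted earlier in this thread. It is an assumption for illustration, not NNPACK behavior; `struct cache_hierarchy` and `choose_cache_hierarchy` are hypothetical names, and `struct cache_info` here is a local stand-in that mirrors only the fields shown in the issue.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in mirroring the fields used in the snippet from the issue. */
struct cache_info {
    size_t size;
    size_t associativity;
    size_t threads;
    bool inclusive;
};

/* Hypothetical container for the three levels the blocking code expects. */
struct cache_hierarchy {
    struct cache_info l1;
    struct cache_info l2;
    struct cache_info l3;
};

/* Static defaults copied from init_static_hwinfo as quoted in the issue. */
static const struct cache_hierarchy static_defaults = {
    .l1 = { .size = 16 * 1024, .associativity = 4, .threads = 1, .inclusive = true },
    .l2 = { .size = 128 * 1024, .associativity = 4, .threads = 1, .inclusive = true },
    .l3 = { .size = 2 * 1024 * 1024, .associativity = 8, .threads = 1, .inclusive = true },
};

/* Hypothetical policy: trust detected values only when all three levels
 * report a nonzero size; a two-level CPU keeps the static defaults. */
static struct cache_hierarchy choose_cache_hierarchy(
        const struct cache_hierarchy *detected) {
    if (detected != NULL &&
        detected->l1.size != 0 &&
        detected->l2.size != 0 &&
        detected->l3.size != 0) {
        return *detected;
    }
    return static_defaults;
}
```

This policy avoids feeding a zero-sized L3 into code that assumes three levels, but it does not give two-level CPUs any benefit from detection. Doing that would require the two-level cache blocking mentioned above.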