Closed edmBernard closed 7 years ago
Could you please build https://github.com/Maratyszcza/cpuinfo (`confu setup && python configure.py && ninja`), run `bin/isa-info` and `bin/cache-info`, and post their output here?
isa-info:
Scalar instructions:
LAHF/SAHF: yes
LZCNT: yes
POPCNT: yes
TBM: no
BMI: yes
BMI2: yes
ADCX/ADOX: no
Memory instructions:
MOVBE: yes
PREFETCH: no
PREFETCHW: no
PREFETCHWT1: no
CLZERO: no
SIMD extensions:
3dnow!: no
3dnow!+: no
SSE3: yes
SSSE3: yes
SSE4.1: yes
SSE4.2: yes
SSE4a: no
Misaligned SSE: no
AVX: yes
FMA3: yes
FMA4: no
XOP: no
F16C: yes
AVX2: yes
AVX512F: no
AVX512PF: no
AVX512ER: no
AVX512CD: no
AVX512DQ: no
AVX512BW: no
AVX512VL: no
AVX512IFMA: no
AVX512VBMI: no
AVX512VPOPCNTDQ: no
AVX512_4VNNIW: no
AVX512_4FMAPS: no
Multi-threading extensions:
MONITOR/MWAIT: no
MONITORX/MWAITX: no
CMPXCHG16B: yes
HLE: no
RTM: no
XTEST: no
RDPID: no
Cryptography extensions:
AES: yes
PCLMULQDQ: yes
RDRAND: yes
RDSEED: no
SHA: no
Padlock RNG: no
Padlock ACE: no
Padlock ACE 2: no
Padlock PHE: no
Padlock PMM: no
Profiling instructions:
RDTSCP: yes
LWP: no
MPX: no
System instructions:
SYSENTER/SYSEXIT: yes
RDMSR/WRMSR: yes
CLFLUSH: yes
CLFLUSHOPT: no
CLWB: no
FXSAVE/FXSTOR: yes
XSAVE/XSTOR: yes
FS/GS Base: yes
cache-info:
L1 instruction cache: 32 KB, 8-way set associative (64 sets), 64 byte lines, shared by 4 processors
L1 data cache: 32 KB, 8-way set associative (64 sets), 64 byte lines, shared by 4 processors
L2 unified cache: 4 MB (exclusive), 16-way set associative (4096 sets), 64 byte lines, shared by 4 processors
@edmBernard Your virtual server is supported, so it should work with NNPACK. However, the NNPACK code that detects these prerequisites is quite simple, and I don't immediately see the problem in it. The relevant code is in https://github.com/Maratyszcza/NNPACK/blob/86bfc32972e02781e7e0393faf556aff4fbc8cc3/src/init.c#L121, and if you figure out where the flaw in the logic is, I will fix it.
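For anyone who wants to check the instruction-set side of this on their own machine, here is a minimal sketch that scans pasted `isa-info` output for required features. It is not NNPACK's detection code: the helper names `parse_isa_info` and `missing_features` are hypothetical, and the required set (`AVX2` only) is an assumption taken from the report in this thread, not from `init.c`.

```python
def parse_isa_info(text):
    """Map each 'NAME: yes' / 'NAME: no' line of isa-info output to a bool.

    Section headers such as 'SIMD extensions:' have no yes/no value
    and are skipped.
    """
    features = {}
    for line in text.splitlines():
        # Split on the last colon so names like 'LAHF/SAHF' stay intact.
        name, sep, value = line.rpartition(":")
        value = value.strip().lower()
        if sep and value in ("yes", "no"):
            features[name.strip()] = (value == "yes")
    return features


def missing_features(text, required):
    """Return the required feature names that are absent or reported 'no'."""
    features = parse_isa_info(text)
    return sorted(name for name in required if not features.get(name, False))


# A few lines copied from the isa-info output posted in this thread.
sample = """SIMD extensions:
SSE4.2: yes
AVX: yes
AVX2: yes
AVX512F: no
"""

# Assumed requirement for illustration only: AVX2.
print(missing_features(sample, {"AVX2"}))
```

An empty list here means the assumed instruction-set requirement is met, so a failure would have to come from some other check.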
Thanks for your help. I'll test it as soon as possible and report back.
I investigated my problem a bit more. It seems to come from the lack of an L3 cache. The device configuration is 4 separate, independent processors with one core each.
Right, I didn't notice that at first. Processors without L3 cache are currently not supported in NNPACK.
You can follow #33 for this feature.
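The missing L3 level can be spotted directly from the `cache-info` output above. The following is a rough sketch, not part of NNPACK or cpuinfo: the helper names `reported_cache_levels` and `has_l3` are hypothetical, and the line format it matches is assumed from the three lines posted in this thread.

```python
import re

# Matches lines such as 'L2 unified cache: 4 MB ...' and captures the level.
# The format is assumed from the cache-info output posted in this thread.
CACHE_LINE = re.compile(r"^L(\d+) (?:instruction |data |unified )?cache:")


def reported_cache_levels(text):
    """Return the set of cache levels (1, 2, 3, ...) found in cache-info output."""
    levels = set()
    for line in text.splitlines():
        match = CACHE_LINE.match(line.strip())
        if match:
            levels.add(int(match.group(1)))
    return levels


def has_l3(text):
    """True if the output reports an L3 cache."""
    return 3 in reported_cache_levels(text)


# The cache-info output from this issue.
sample = """L1 instruction cache: 32 KB, 8-way set associative (64 sets), 64 byte lines, shared by 4 processors
L1 data cache: 32 KB, 8-way set associative (64 sets), 64 byte lines, shared by 4 processors
L2 unified cache: 4 MB (exclusive), 16-way set associative (4096 sets), 64 byte lines, shared by 4 processors
"""

print(sorted(reported_cache_levels(sample)))
print(has_l3(sample))
```

For the output in this issue only levels 1 and 2 are reported, which matches the diagnosis above.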
I am trying to install NNPACK on an OVH server. The convolution test fails:
The device characteristics should be sufficient, since AVX2 is available. Here are the processor stats:
Do you have any idea why it failed?