OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

MIPS32 24K / 24KE TARGET #2564

Closed bhack closed 4 years ago

bhack commented 4 years ago

Is there a target for MIPS32 24K / 24KE? http://en.techinfodepot.shoutwiki.com/wiki/MIPS32#MIPS32_24K_.2F_24KE

bhack commented 4 years ago

Same problem with Openwrt (19.07.2):

{standard input}:209: Error: opcode not supported on this processor: mips32r2 (mips32r2) `rdhwr $2,$30'
{standard input}:233: Error: opcode not supported on this processor: mips32r2 (mips32r2) `rdhwr $2,$30'
{standard input}:319: Error: opcode not supported on this processor: mips32r2 (mips32r2) `rdhwr $2,$30'

bhack commented 4 years ago

It seems to compile only with USE_THREAD=0 when we have the unmodified "rpcc" macro in common_mips.h

brada4 commented 4 years ago

Could you post any kind of boot log? We need to know the command(s) that led to that particular output.

bhack commented 4 years ago

I don't have the card available now. Everything I am testing now is just cross-compilation for that target, in a fresh Ubuntu 18.04 Docker container with a fresh Openwrt 19.07.2 toolkit. I tried USE_THREAD=0 because I saw that your native test was a single-thread build at https://github.com/xianyi/OpenBLAS/issues/2564#issuecomment-615961972

bhack commented 4 years ago

Sorry the link to your build was at https://github.com/xianyi/OpenBLAS/issues/2564#issuecomment-615957300

brada4 commented 4 years ago

Trying for multiple CPUs for openwrt. I think pthread lib is libc for musl, at least that has symbols.

\" -DCHAR_CNAME=\"\" -DNO_AFFINITY -I..  -shared -o ../libopenblas_p5600p-r0.3.9.dev.so \
-Wl,--whole-archive ../libopenblas_p5600p-r0.3.9.dev.a -Wl,--no-whole-archive \
-Wl,-soname,libopenblas.so.0 -lm -lpthread -lm -lpthread
/usr/bin/ld: cannot find -lpthread
/usr/bin/ld: cannot find -lpthread
collect2: error: ld returned 1 exit status

The ath79 cross-compilers had no problems building single-threaded, and gave the same link error for multi-threaded builds. It could be something particular to ramips, which has fewer ISA extensions than ath79, even though it claims MIPS V5.5 where mine says just V5.0.

martin-frbg commented 4 years ago

-lpthread is currently added unconditionally on MIPS (Makefile.system, exceptions there just for Android and Windows). Not sure how to find out we are going to link against musl eventually.

brada4 commented 4 years ago

My build command, which works after removing fpu=2008, is below. There is no invalid assembly with 2 CPUs, just that -lpthread is not found. @bhack - you should patch that out for the OpenWRT package. Also keep in mind that ARM arches have dynamic arch, and probably everyone tests on x86, where there are a lot more of them.

make TARGET=P5600 CC=mipsel-openwrt-linux-gcc FC=ERROR HOSTCC=cc NUM_CPUS=1

I set up the "SDK" as follows.

export PATH=~/openwrt/openwrt-sdk-ramips-mt76x8_gcc-8.4.0_musl.Linux-x86_64/staging_dir/toolchain-mipsel_24kc_gcc-8.4.0_musl/bin:$PATH
export STAGING_DIR=~/openwrt/openwrt-sdk-ramips-mt76x8_gcc-8.4.0_musl.Linux-x86_64/staging_dir

@martin-frbg musl does not define anything specific, it works with ln -s libc.so libpthread.so (idea taken from xcode/android), does not leave libpthread in imports (not much debug symbols in files, I just deleted the link), Alpine has no such symlink.
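
The symlink idea above can be tried on a scratch copy of the toolchain's library directory first. A minimal sketch, assuming the directory holds a libc.so and that the linker accepts a libpthread.so that is just another name for it; the function name and the directory argument are hypothetical and not part of the OpenWrt SDK:

```shell
#!/bin/sh
# Hypothetical helper: make -lpthread resolvable on a musl toolchain by
# pointing libpthread.so at libc.so (musl keeps the pthread symbols in libc).
#   $1 = library directory of the toolchain or sysroot (an assumption)
add_pthread_symlink() {
    libdir=$1
    # Nothing to do without a libc.so to point at.
    [ -e "$libdir/libc.so" ] || { echo "no libc.so in $libdir" >&2; return 1; }
    # Leave an existing libpthread alone.
    if [ -e "$libdir/libpthread.so" ] || [ -e "$libdir/libpthread.a" ]; then
        return 0
    fi
    # Relative link, same as running `ln -s libc.so libpthread.so` inside $libdir.
    ln -s libc.so "$libdir/libpthread.so"
}
```

The directory to pass would be wherever that particular toolchain keeps its libc.so. Deleting the link afterwards undoes the change.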

bhack commented 4 years ago

@brada4 NUM_CPUS=1 works without USE_THREAD=0

bhack commented 4 years ago

@brada4 Oops, sorry, I was on https://github.com/xianyi/OpenBLAS/pull/2565

brada4 commented 4 years ago

@bhack you can work around the thread issue with a symlink as above (ln -s /lib/libc.so /lib/libpthread.so). I think neither my SoC nor yours has a 2-processor variant in the extant series. But there are lots of OpenWRT ARM routers, and OpenWRT x86 test machines, where pthreads are of use.

Actually it is written in the FAQ that, most of the time, if no CPU count is specified, the core count is detected from the build system, not from the target.
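
Since the detected core count comes from the build host, a cross build can state the thread count explicitly instead. A minimal sketch, assuming the NUM_THREADS variable described in OpenBLAS's Makefile.rule; the helper name and its arguments are hypothetical, and it only prints the command line rather than running it:

```shell
#!/bin/sh
# Hypothetical helper: print a cross-build command line with the thread
# count fixed by hand, so the build host's core count does not leak in.
#   $1 = OpenBLAS TARGET, $2 = cross compiler, $3 = threads on the target
openblas_cross_cmd() {
    target=$1
    cc=$2
    threads=$3
    # HOSTCC builds the helper programs that run on the build machine.
    printf 'make TARGET=%s CC=%s HOSTCC=cc NUM_THREADS=%s\n' \
        "$target" "$cc" "$threads"
}

openblas_cross_cmd P5600 mipsel-openwrt-linux-gcc 2
```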

I tried with USE_THREAD=0, still no wrong instructions. It turned out cc was clang; after switching it to gcc, also no bad instructions. What is so different in the way we approach the compilers? I don't use any special tricks, it just works as I type.

bhack commented 4 years ago

I supposed that the macro removed the assembly, but that is not the case, because it is working now on https://github.com/xianyi/OpenBLAS/pull/2565

brada4 commented 4 years ago

@bhack thinking of file naming: best if you call the resulting libraries libblas.so and libcblas.so, so they are painlessly replaceable with reference netlib BLAS (requires Fortran) or ATLAS (needs native compilation that involves hours to days of target calibration at least once).

brada4 commented 4 years ago

I did not change any macros, just addressed the obvious problem by stripping out fpu=2008 because of ld complaining about ABI compatibility, and then added the libpthread symlink because of another complaint.

bhack commented 4 years ago

I was working on the #2565 branch, and there is a new #if !defined(MIPS24K) for the assembly. But it seems that it is not working, because I still have the assembly after postprocessing the macro, and now it compiles without special flags.

brada4 commented 4 years ago

What command do you use for the build? I never needed assembly fixups, even with the ramips openwrt sdk. Actually, the resulting library came out the same size with the ramips and ath79 toolchains and with ath79 native.

bhack commented 4 years ago

The Makefile stub of Openblas is at https://github.com/openwrt/packages/pull/11894. I cross compile in OpenWRT make package/libs/openblas/compile V=s -j

martin-frbg commented 4 years ago

@brada4 I think you are not actually building for TARGET=MIPS24K, which would explain the differences? @bhack the new #if !defined(MIPS24K) only removes the RPCC call with its unsupported rdhwr instruction, but certainly not all assembly. (The error you saw earlier about amax.S was just because the build was not picking up the correct files for MIPS24K, as the name of the KERNEL file did not match.)

bhack commented 4 years ago

@martin-frbg What I mean is that:

cat ./build_dir/target-mipsel_24kc_musl/OpenBLAS-0.3.9/common_mips.h
static inline unsigned int rpcc(void){
  unsigned long ret;

  __asm__ __volatile__(".set   push    \n"
          "rdhwr %0, $30  \n"
          ".set pop" : "=r"(ret) : : "memory");

  return ret;
}
#define RPCC_DEFINED

martin-frbg commented 4 years ago

Yes, that section should be removed by the preprocessor (and I expect that does happen during the build, if the compiler no longer complains about that instruction).
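
Whether the guard really drops that section can be checked without a MIPS toolchain by running only the preprocessor, since `cat` on common_mips.h shows the source before any #if is evaluated. A minimal sketch, assuming a C compiler reachable as cc (or via CC); the helper name and the demo file are hypothetical, and the demo file mirrors the guarded snippet rather than the full header:

```shell
#!/bin/sh
# Hypothetical helper: succeed if `rdhwr` is still present after preprocessing.
#   $1 = file to preprocess, remaining arguments = extra flags such as -DMIPS24K
rdhwr_survives() {
    file=$1
    shift
    # -E stops after preprocessing, so the MIPS asm is never assembled
    # and any host compiler will do; -x c treats the file as C source.
    "${CC:-cc}" -E -x c "$@" "$file" | grep 'rdhwr' >/dev/null
}

# Demo file mirroring the guarded rpcc() definition.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#if !defined(MIPS24K)
static inline unsigned int rpcc(void){
  unsigned long ret;
  __asm__ __volatile__(".set   push    \n"
          "rdhwr %0, $30  \n"
          ".set pop" : "=r"(ret) : : "memory");
  return ret;
}
#define RPCC_DEFINED
#endif
EOF
```

With this demo file, `rdhwr_survives "$demo"` succeeds and `rdhwr_survives "$demo" -DMIPS24K` fails, because the guard removes the function. Against the real tree the call would need whatever -D flags the build actually passes, which should be visible in the V=s output.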

bhack commented 4 years ago

I think this is the one already resolved, or not:

#ifndef ASSEMBLER

static inline unsigned int rpcc(void){
  unsigned long ret;

  __asm__ __volatile__(".set   push    \n"
          "rdhwr %0, $30  \n"
          ".set pop" : "=r"(ret) : : "memory");

  return ret;
}
#define RPCC_DEFINED

bhack commented 4 years ago

Because the macro was:

#ifndef ASSEMBLER

#if !defined(MIPS24K)
static inline unsigned int rpcc(void){
  unsigned long ret;

  __asm__ __volatile__(".set   push    \n"
          "rdhwr %0, $30  \n"
          ".set pop" : "=r"(ret) : : "memory");

  return ret;
}
#define RPCC_DEFINED
#endif

brada4 commented 4 years ago

I built with the same openwrt sdk that you said does not accept this instruction. What command do you use to build, and which gcc gets used? I see no reason for this particular change.

bhack commented 4 years ago

@brada4 I think we are not using that instruction. @martin-frbg explained the behavior, see https://github.com/xianyi/OpenBLAS/pull/2565#discussion_r410948822

brada4 commented 4 years ago

I am under the belief that the target machine evenly irradiates my build system and your build system, so the behaviour of the as and cc1 interactions should be identical. Absence of a particular co-processor should give SIGILL or similar at runtime, if at all.

bhack commented 4 years ago

So on an ubuntu host can you cross-compile with make package/libs/openblas/compile V=s -j on master hash?

brada4 commented 4 years ago

I know nothing about the packaging system. I am just saying that this particular SDK compiles OpenBLAS with minimal changes; no idea what is missing.

bhack commented 4 years ago

@brada4 How can we reproduce the same execution? Can you share a Dockerfile?

bhack commented 4 years ago

The official openwrt image where you can compile for that arch is docker run --rm -it openwrtorg/sdk:ramips-mt76x8-19.07.2

bhack commented 4 years ago

Basic commands are at https://hub.docker.com/r/openwrtorg/sdk

brada4 commented 4 years ago

in include/toplevel.mk

# make sure that a predefined CFLAGS variable does not dist
export CFLAGS=
export LDFLAGS=

zap both lines and replace with

unexport CFLAGS LDFLAGS

That's a bug in the openwrt SDK that makes it different from plain make with the SDK compiler. You can probably unexport those in the package file; if that succeeds, please review the validity and need of all changes here. PS I am against using multi-gigabyte docker 'microservices'
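
The distinction being drawn is between a variable that is exported but empty and one that is not in the environment at all; a child process sees these differently. A minimal sketch of that difference in plain shell, with a hypothetical wrapper name; it does not reproduce the OpenWrt build, it only shows what the child gets:

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with CFLAGS and LDFLAGS removed from
# its environment, roughly what `unexport CFLAGS LDFLAGS` aims at for recipes.
run_without_flags() {
    # The subshell keeps the caller's own variables untouched.
    ( unset CFLAGS LDFLAGS; "$@" )
}

# What `export CFLAGS=` amounts to: set, exported, empty.
CFLAGS=
export CFLAGS

# ${CFLAGS+set} expands to "set" whenever the variable exists, even if empty.
sh -c 'echo "plain:   [${CFLAGS+set}]"'
run_without_flags sh -c 'echo "wrapped: [${CFLAGS+set}]"'
```

The first child reports the variable as present and the second does not, which is the kind of difference a makefile can react to even though both values look empty.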

commodo commented 4 years ago

/cc @commodo if you are interested for the openwrt part of the thread.

[we had easter this weekend :) so a bit slow on the reply] openblas is a bit far from my current interest; i am bit interested in numpy atm; maybe later; after i get to take a look at numpy + openwrt;