OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

MIPS32 24K / 24KE TARGET #2564

Closed bhack closed 4 years ago

bhack commented 4 years ago

Is there a target for MIPS32 24K / 24KE? http://en.techinfodepot.shoutwiki.com/wiki/MIPS32#MIPS32_24K_.2F_24KE

martin-frbg commented 4 years ago

Not specifically - you could try TARGET=P5600

bhack commented 4 years ago

Mhh I got this with P5600 on compilation.

{standard input}:207: Error: opcode not supported on this processor: mips32r5 (mips32r5) `rdhwr $2,$30'
{standard input}:289: Error: opcode not supported on this processor: mips32r5 (mips32r5) `rdhwr $2,$30'
{standard input}:330: Error: opcode not supported on this processor: mips32r5 (mips32r5) `rdhwr $2,$30'

bhack commented 4 years ago

It is a Mediatek MT7688 https://docs.onion.io/omega2-docs/omega2.html#omega2

bhack commented 4 years ago

I think that chip is mips32r2, while TARGET=P5600 sets -mips32r5.

bhack commented 4 years ago

Setting TARGET=1004K, which uses -mips32r2, I got:

getarch.c:1164:2: error: #error "The TARGET specified on the command line or in Makefile.rule is not supported. Please choose a target from TargetList.txt"
 #error "The TARGET specified on the command line or in Makefile.rule is not supported. Please choose a target from TargetList.txt"

martin-frbg commented 4 years ago

Sorry, that seems to be an omission in getarch.c: the 1004K target is basically the same as P5600 but with the compiler options set to -mips32r2. Back when I added it, getarch.c did not yet have the code to catch invalid target names, and it seems I need to dig out that little Ubiquiti router again and confirm everything still builds.

martin-frbg commented 4 years ago

Can you try what happens when you hack Makefile.prebuild and Makefile.system to change mips32r5 to mips32r2 (and perhaps remove the -mtune=p5600 wherever you encounter it) and then build for TARGET=P5600?

martin-frbg commented 4 years ago

BTW, what does /proc/cpuinfo contain on your system (assuming that it is some flavor of Linux or a Unix-like OS that has /proc)? The current check in cpuid_mips.c just tries to read the plaintext name rather than the numeric cpuid.
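To make the plaintext fields concrete, here is a minimal Python sketch of pulling the "cpu model" and "isa" entries out of a /proc/cpuinfo dump. It is only an illustration and not how cpuid_mips.c works; the function names parse_cpuinfo and highest_mips32_release are hypothetical, and the sample text is a shortened dump in the usual MIPS Linux layout.

```python
def parse_cpuinfo(text):
    """Return the 'key : value' fields of a /proc/cpuinfo dump as a dict.

    Only the first occurrence of each key is kept, which corresponds to
    the first processor on a multi-core system.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'key : value' line
        fields.setdefault(key.strip(), value.strip())
    return fields


def highest_mips32_release(fields):
    """Return the newest 'mips32rN' entry of the isa field, or None."""
    prefix = "mips32r"
    releases = [name for name in fields.get("isa", "").split()
                if name.startswith(prefix) and name[len(prefix):].isdigit()]
    if not releases:
        return None
    return max(releases, key=lambda name: int(name[len(prefix):]))


# Shortened sample dump (assumed layout, as printed by a MIPS Linux kernel).
SAMPLE = """\
system type        : MediaTek MT7688 ver:1 eco:2
cpu model          : MIPS 24KEc V5.5
isa                : mips1 mips2 mips32r1 mips32r2
ASEs implemented   : mips16 dsp
"""

if __name__ == "__main__":
    info = parse_cpuinfo(SAMPLE)
    print(info["cpu model"], highest_mips32_release(info))
```

When cross-compiling, the dump has to come from the target device; on the device itself the text could be read with open("/proc/cpuinfo").read().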

bhack commented 4 years ago

Do you mean on the target machine? I ask because I am cross-compiling from an x86 host.

brada4 commented 4 years ago

EDIT: an example from a modern MIPS Linux; the isa/ASEs/options entries are text fields.

Linux OpenWrt 4.14.171 #0 Thu Feb 27 21:05:12 2020 mips GNU/Linux

system type             : Qualcomm Atheros QCA956X ver 1 rev 0
machine                 : TP-Link Archer C7 v5
processor               : 0
cpu model               : MIPS 74Kc V5.0
BogoMIPS                : 385.84
wait instruction        : yes
microsecond timers      : yes
tlb_entries             : 32
extra interrupt vector  : yes
hardware watchpoint     : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
isa                     : mips1 mips2 mips32r1 mips32r2
ASEs implemented        : mips16 dsp dsp2
Options implemented     : tlb 4kex 4k_cache prefetch mcheck ejtag llsc dc_aliases perf_cntr_intr_bit cdmm nan_legacy nan_2008 contextconfig perf
shadow register sets    : 1
kscratch registers      : 0
package                 : 0
core                    : 0
VCED exceptions         : not available
VCEI exceptions         : not available

bhack commented 4 years ago

@brada4 I see you are on OpenWrt. I am trying to prepare exactly that: a package feed for OpenBLAS. Do you already have a recipe?

system type             : MediaTek MT7688 ver:1 eco:2
machine                 : Onion Omega2+
processor               : 0
cpu model               : MIPS 24KEc V5.5
BogoMIPS                : 385.84
wait instruction        : yes
microsecond timers      : yes
tlb_entries             : 32
extra interrupt vector  : yes
hardware watchpoint     : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
isa                     : mips1 mips2 mips32r1 mips32r2
ASEs implemented        : mips16 dsp
shadow register sets    : 1
kscratch registers      : 0
package                 : 0
core                    : 0
VCED exceptions         : not available
VCEI exceptions         : not available

brada4 commented 4 years ago

@bhack Not yet, but I will try. You try too. Start with 'make' and see whether the built-in tests pass, etc. BTW, there is no Fortran, which implies no reference BLAS either.

bhack commented 4 years ago

@brada4 It is a WIP if you want to try https://gist.github.com/bhack/60f61083d3264f4c1e11fbd7ff1225fd

bhack commented 4 years ago

@martin-frbg With -mips32r2 -mtune=24kc I got:

blas_server.c: In function 'blas_thread_init':
blas_server.c:585:95: warning: format '%ld' expects argument of type 'long int', but argument 4 has type 'int' [-Wformat=]
         fprintf(STDERR, "OpenBLAS blas_thread_init: pthread_create failed for thread %ld of %ld: %s\n", i+1,blas_num_threads,msg);
                                                                                             ~~^

{standard input}: Assembler messages:
{standard input}:207: Error: opcode not supported on this processor: mips32r2 (mips32r2) `rdhwr $2,$30'
{standard input}:289: Error: opcode not supported on this processor: mips32r2 (mips32r2) `rdhwr $2,$30'
{standard input}:330: Error: opcode not supported on this processor: mips32r2 (mips32r2) `rdhwr $2,$30'
Makefile:125: recipe for target 'blas_server.o' failed

martin-frbg commented 4 years ago

That error looks strange, are you sure that the assembler gets picked up from the cross-compile toolchain ? (I believe the rdhwr $2,$30 is generated by the toolchain compiler - though the definition of the "rpcc" macro in common_mips.h also contains a similar rdhwr instruction with a value of 30)

bhack commented 4 years ago

@martin-frbg It is quite easy to test. From https://github.com/OnionIoT/source#using-the-docker-image

To prepare the toolchain and minimal build in Docker

docker run -it onion/omega2-source /bin/bash
./scripts/feeds update onion
sh scripts/onion-minimal-build.sh
make -j

When it is ready, put your branch/PR commit hash in your Makefile, like this (scripts/package/libs/openblas/Makefile):

include $(TOPDIR)/rules.mk

PKG_NAME:=OpenBLAS
PKG_VERSION:=0.3.9
PKG_RELEASE:=1
PKG_BUILD_DIR:=$(BUILD_DIR)/OpenBLAS-$(PKG_VERSION)
#PKG_SOURCE:=v$(PKG_VERSION).tar.gz
#PKG_SOURCE_URL:=https://github.com/xianyi/OpenBLAS/archive/
PKG_SOURCE_PROTO:=git
PKG_SOURCE_URL:=https://github.com/martin-frbg/OpenBLAS.git
#PKG_HASH:=28cc19a6acbf636f5aab5f10b9a0dfe1
PKG_SOURCE_VERSION:=00172d440bfc7dedc8523a4cdad58b685801bb76
include $(INCLUDE_DIR)/package.mk

define Package/openblas
  SECTION:=libs
  CATEGORY:=Libraries
  TITLE:=An optimized BLAS library
  URL:=https://www.openblas.net/
endef

define Package/openblas/description
  OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version
endef
# Package build instructions; invoke the GNU make tool to build our package
define Build/Compile
        $(MAKE) -C $(PKG_BUILD_DIR) \
                NOFORTRAN=1 \
                TARGET='1004K' \
                HOSTCC=gcc \
                CC="$(TARGET_CC)" \
                CFLAGS="$(TARGET_CFLAGS)" \
                LDFLAGS="$(TARGET_LDFLAGS)"
endef
define Build/InstallDev
        $(INSTALL_DIR) $(1)/usr/lib/pkgconfig
        $(CP) $(PKG_INSTALL_DIR)/usr/lib/* $(1)/usr/lib/
        $(INSTALL_DIR) $(1)/usr/include
        $(CP) $(PKG_INSTALL_DIR)/usr/include/* $(1)/usr/include/
        $(CP) $(PKG_INSTALL_DIR)/usr/lib/pkgconfig/* $(1)/usr/lib/pkgconfig/
endef

define Package/openblas/install
        $(INSTALL_DIR) $(1)/usr/lib
        $(CP) $(PKG_BUILD_DIR)/usr/lib/libopenblas.so* $(1)/usr/lib/
endef
$(eval $(call BuildPackage,openblas))

Then you can compile OpenBLAS with make package/libs/openblas/compile -j1 V=s

bhack commented 4 years ago

/cc @commodo in case you are interested in the OpenWrt part of the thread.

brada4 commented 4 years ago

The native build in a USB chroot has not complained about invalid opcodes yet. rdhwr and 32 registers are mips1, fpu2008 is mips4, so mips5 should cover both. I think something is missing in the cross-compiler compared to the native compiler, so that the assembler under the hood rejects an instruction that is valid for the architecture. Once the build is over (it does not look like that will be soon) I will report back with a log.

More or less what I did:

Though I am not sure what a practical consumer would be that is C-only on 128MB of RAM.

Since we are facing a compiler ?bug?, I think we cannot proceed much further. From the compiler output it looks like cc1 generated a valid MIPS instruction while as says it does not understand it. You get the same if you want AVX512 on Ubuntu 16 or RHEL 7.

bhack commented 4 years ago

@brada4 The cross-compile openwrt docker has gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

martin-frbg commented 4 years ago

From what I could find it is more likely a binutils bug in the toolchain - "read hardware register" should be a legal MIPS32R2 instruction, perhaps the assembler does not know the 24K cpu ?

brada4 commented 4 years ago

> @brada4 The cross-compile openwrt docker has gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0

Ubuntu ships no mips compiler.

@OpenWrt:/openblas# gcc --version
gcc (OpenWrt GCC 7.4.0) 7.4.0
@OpenWrt:/openblas# opkg list | grep ^gcc
gcc - 7.4.0-5 - Build a native toolchain for compiling on target device.

brada4 commented 4 years ago

> From what I could find it is more likely a binutils bug in the toolchain - "read hardware register" should be a legal MIPS32R2 instruction, perhaps the assembler does not know the 24K cpu ?

In addition fpu=2008 fails to link together with fpu=generic libc/libm.

martin-frbg commented 4 years ago

You mean -mnan=2008? It could be that it requires mips32r5; I do not think our Makefiles use this in conjunction with 32r2 hosts anyway.

bhack commented 4 years ago

The toolchain in the docker is built in "./staging_dir/toolchain-mipsel_24kc_gcc-7.3.0_musl/lib/gcc"

brada4 commented 4 years ago

So the problem is with that toolchain. Please try the official OpenWrt MIPS SDK toolchain, which should contain gcc 7.4 or 7.0 and a matching, working assembler (binutils) package.

martin-frbg commented 4 years ago

Just for testing, could you try deleting the line with the "rdhwr" instruction in common_mips.h and/or the "#define RPCC_DEFINED" that accompanies it? (Oh, and make that function "return 0" instead of "return ret" so that it does not read from uninitialized memory.)

I believe this is only meant to read the cycle counter for profiling purposes, so it should not matter in normal operation. Perhaps this can even be a privileged instruction not available to userspace in some circumstances? At the very least, this would tell us whether it is actually something the compiler generated itself (as I originally assumed due to the register number mismatch, but the %0 in the macro may be a gcc alias for the same thing) or code imported from this macro.

martin-frbg commented 4 years ago

Read Hardware Register
RDHWR - page 208
MIPS32® Architecture For Programmers Volume II: The MIPS32® Instruction Set, Revision 2.62
Copyright © 2001-2003,2005,2008-2009 MIPS Technologies Inc. All rights reserved.

Restrictions:
In implementations of Release 1 of the Architecture, this instruction resulted in a Reserved Instruction Exception.
Access to the specified hardware register is enabled if Coprocessor 0 is enabled, or if the corresponding bit is set in the HWREna register. If access is not allowed or the register is not implemented, a Reserved Instruction Exception is signaled.

So perhaps binutils assumes coprocessor not present/not available ? (the document does not spell it out whether this coprocessor is the fpu, but perhaps adding -mhard-float (if applicable) would help to make sure the fpu is online) Does it help when you add -march=24kc in addition to -mips32r2 ?

bhack commented 4 years ago

@martin-frbg I am on https://github.com/xianyi/OpenBLAS/pull/2565. Which TARGET do I need to set?

bhack commented 4 years ago

-mtune=24kc was already inherited, but when setting TARGET with your PR I now get: -mtune was already specified, is now p5600.

martin-frbg commented 4 years ago

Can you try TARGET=24K please (this does not set any -mtune at the moment, so if mtune=24kc is inherited this should be best)

bhack commented 4 years ago

@martin-frbg Yes see my comment in your PR

martin-frbg commented 4 years ago

OK, so back to using TARGET=P5600 for now. Can you hack Makefile.system to fix the -mtune in addition to the MIPS revision number in the ifeq ($(CORE), P5600) block?
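The suggested hack amounts to two textual substitutions in the P5600 block. A minimal Python sketch of that rewrite follows; the helper name retarget_flags is hypothetical, the replacement -mtune value of 24kc is taken from the toolchain flags mentioned in this thread, and the sample block is an illustrative fragment rather than a verbatim copy of Makefile.system.

```python
import re

# (pattern, replacement) pairs: lower the ISA revision and retune for the 24Kc.
REPLACEMENTS = [
    (r"-mips32r5\b", "-mips32r2"),
    (r"-mtune=p5600\b", "-mtune=24kc"),
]


def retarget_flags(makefile_text):
    """Return makefile_text with the P5600 compiler flags swapped for 24Kc ones."""
    for pattern, replacement in REPLACEMENTS:
        makefile_text = re.sub(pattern, replacement, makefile_text)
    return makefile_text


# Illustrative fragment only; the real block may carry additional flags.
P5600_BLOCK = """\
ifeq ($(CORE), P5600)
CCOMMON_OPT += -mips32r5 -mtune=p5600  $(MSA_FLAGS)
FCOMMON_OPT += -mips32r5 -mtune=p5600  $(MSA_FLAGS)
endif
"""

if __name__ == "__main__":
    print(retarget_flags(P5600_BLOCK))
```

Doing the substitution on a copy of the file first makes it easy to diff the result against the original before building.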

brada4 commented 4 years ago

After taking out -mnan=2008 it links correctly; no tests have run, though.

 OpenBLAS build complete. (BLAS CBLAS)

  OS               ... Linux             
  Architecture     ... mips               
  BINARY           ... 32bit                 
  C compiler       ... GCC  (cmd & version : gcc (OpenWrt GCC 7.4.0) 7.4.0)
  Library Name     ... libopenblas_p5600-r0.3.9.dev.a (Single-threading)  

I'll try R2 to answer the question regarding gcc vs. assembly. But that will take time.

bhack commented 4 years ago

@martin-frbg With that

mipsel-openwrt-linux-musl-gcc -Os -pipe -mno-branch-likely -mips32r2 -mtune=24kc -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -msoft-float -mips16 -minterlink-mips16 -iremap/root/source/build_dir/target-mipsel_24kc_musl/OpenBLAS-0.3.9:OpenBLAS-0.3.9 -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -O2 -DMAX_STACK_ALLOC=2048 -Wall -mabi=32 -mips32r2  -DF_INTERFACE_GFORT -fPIC -DNO_LAPACK -DNO_LAPACKE -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=4 -DMAX_PARALLEL_NUMBER=1 -DVERSION=\"0.3.9.dev\" -DASMNAME=parameter -DASMFNAME=parameter_ -DNAME=parameter_ -DCNAME=parameter -DCHAR_NAME=\"parameter_\" -DCHAR_CNAME=\"parameter\" -DNO_AFFINITY -I../..

But I get the same rdhwr $2,$30 issue.

martin-frbg commented 4 years ago

OK so the -mtune does not help in this regard. So what happens when you change the RPCC macro in common_mips.h, does the rdhwr issue go away then ?

martin-frbg commented 4 years ago

@brada4 I am losing track now - was this with the "official" mips toolchain and/or with any changes to the rpcc macro ?

brada4 commented 4 years ago

@martin-frbg https://downloads.openwrt.org/releases/19.07.2/targets/ath79/generic/ near the end of the page, the file with SDK in the name; it is meant to run on normal Linux.

I am building without any change in the code. I never saw the rdhwr issue with the native compiler on the OpenWrt router, and would speculate that the OpenWrt cross-compiler works better.

bhack commented 4 years ago

@martin-frbg With that change it works fine. But I had many link failures like:

/mipsel-openwrt-linux-musl/bin/ld: failed to merge target specific data of file ../libopenblas_p5600p-r0.3.9.dev.a(saxpy.o)

brada4 commented 4 years ago

Comparing processors: mine is Kc, his is KEc; mine has dsp2, his has 4x DSP1 but no DSP2.
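The same comparison can be done mechanically on the "ASEs implemented" lines of the two /proc/cpuinfo dumps posted above. This is only a small Python sketch; the function name ases and the two constant names are hypothetical.

```python
def ases(cpuinfo_text):
    """Return the set of names on the 'ASEs implemented' line, if present."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "ASEs implemented":
            return set(value.split())
    return set()


# The relevant lines from the two dumps posted in this thread.
ARCHER_C7 = "ASEs implemented        : mips16 dsp dsp2"   # MIPS 74Kc
OMEGA2 = "ASEs implemented    : mips16 dsp"               # MIPS 24KEc

only_on_archer = ases(ARCHER_C7) - ases(OMEGA2)
only_on_omega = ases(OMEGA2) - ases(ARCHER_C7)

if __name__ == "__main__":
    print("only on Archer C7:", sorted(only_on_archer))
    print("only on Omega2+:", sorted(only_on_omega))
```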

brada4 commented 4 years ago

> @martin-frbg With that change it works fine. But had many linking fails like
>
> /mipsel-openwrt-linux-musl/bin/ld: failed to merge target specific data of file ../libopenblas_p5600p-r0.3.9.dev.a(saxpy.o)

You have to remove -mnan=2008 to link with the OpenWrt musl libc, e.g. as in my experiment:

ifeq ($(CORE), P5600)
# CCOMMON_OPT += -mnan=2008 -mips32r5 -mtune=p5600  $(MSA_FLAGS)
# CCOMMON_OPT += -mips32r5 -mtune=p5600  $(MSA_FLAGS)
CCOMMON_OPT += -mips32r2 -mtune=p5600  $(MSA_FLAGS)
FCOMMON_OPT += -mips32r5 -mtune=p5600  $(MSA_FLAGS)
endif

bhack commented 4 years ago

@brada4 OK, sorry, I missed your previous comment. Now it is fine without the rdhwr:

 OpenBLAS build complete. (BLAS CBLAS)

  OS               ... Linux             
  Architecture     ... mips               
  BINARY           ... 32bit                 
  C compiler       ... GCC  (cmd & version : mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 7.3.0 r7456-ddd04310cb) 7.3.0)
  Library Name     ... libopenblas_p5600p-r0.3.9.dev.a (Multi-threading; Max num-threads is 4)

bhack commented 4 years ago

That one was -mips32r2 -mtune=24kc

martin-frbg commented 4 years ago

@brada4 thanks. I doubt the version level of the DSP extensions plays a role here (though DSP2 seems to come with a few vector instructions?). So the primary result of this thread is that we might want to add an entry to the faq, something like "if you encounter compiler errors about RDHWR when compiling for mips, throw away your compiler and get an officially supported toolchain for your architecture instead" ?

brada4 commented 4 years ago

The Onion Omega2+ is sort of real hardware; it is just that they did something bad to the upstream toolchain when stuffing it into Docker. That is a technically adequate statement, but something that is easier on an exploring first-time user would be better. Actually, about half of all issues are of that kind.

bhack commented 4 years ago

I've notified them at https://github.com/OnionIoT/source/issues/22

brada4 commented 4 years ago

Please try this SDK in the meantime: https://downloads.openwrt.org/snapshots/targets/ramips/mt76x8/ That is the target I see in the menuconfig in the Docker image, in case you intend to build a full image.

brada4 commented 4 years ago

@martin-frbg It built with mips32r2 very well. I do not know whether it is too much for a test to write a simple libm formula and try linking against it as an initial test?

martin-frbg commented 4 years ago

Sorry I do not understand, simple libm formula for doing what ?

brada4 commented 4 years ago

Just link an object that certainly uses libm and gets marked with the nan=2008 ABI; it would then fail to link in the end.

bhack commented 4 years ago

@brada4 I am trying to compile with an updated official OpenWrt (19.07.2) toolchain. In the meantime, what do you suggest using in the OpenWrt OpenBLAS Makefile WIP? For cross-compilation, how could we handle the OpenBLAS TARGET inside OpenWrt?