Open frno7 opened 5 years ago
There are some additional workarounds for GCC >=5.3.0 in ps2sdk: https://github.com/ps2dev/ps2sdk/blob/master/iop/Rules.make#L33
Additionally, you can use -Q --help=optimizers
for ex. -O2
to see which flags are enabled by -O2
. You can then do a binary search to see which optimization breaks it, and disable that optimization (e.g. -fno-whatever
)
Oh, that’s great, @uyjulian! I fear that 100+ (or so) reboots are needed to find the error so anything that can cut that number down is very much welcome. A lot of stuff could have happened between GCC 5.3 and the presently used 10.2 too, but with a bit of guidance and luck, one might find it quickly.
Offtopic, but at ps2dev / ps2sdk we've recently migrated the EE compiler to gcc 10.2. There's some interesting patches that might be usefull for you as well (one of them is even yours ;): https://github.com/ps2dev/gcc/tree/ee-v10.2.0 https://github.com/ps2dev/binutils-gdb/tree/ee-v2.35.1
Note that we found a bug in sqrt.s in binutils. It's claimed to be a standard MIPSII instruction, but it's not. This was also something that worked with -O0, becouse then the function sqrtf() is called. But any optimization level higher would cause gcc to replace it with a faulty sqrt.s instruction.
Nice, the short loop bug fix by frno7 has a bit of background in the post MIPS: Fix GCC noreorder
for undefined R5900 short loops to the GCC patches mailing list. Some more work is needed to merge it. I’ve made a special r5900check tool that can find all occurrences of the short loop bug in general ELF files, including hand-coded assembly (of which the Linux kernel has some).
I’ve also made a write-up of the R5900 short loop erratum on the wiki, including porting the comment from the depracted GCC branch to GAS mainline (where it belongs), for handy reference in case anybody needs it explained. Likewise with the peculiarities of the R5900 floating point unit (FPU), etc. When I started out I found it difficult get information on this kind of stuff.
GCC and Binutils have a couple of more patches that are somewhat more relevant for Linux, like the new -mfix-r5900
option.
Have you submitted the SQRT fix to Binutils?
Incidentally, I fixed another instruction bug in m68k/Bintuils a week ago, in commit 3b288c8e2ed35
m68k: Require m68020up rather than m68000up for CHK.L instruction. :)
By the way, how’s it going with the R5900 MMI auto-vectorisation?
I fear that 100+ (or so) reboots are needed to find the error so anything that can cut that number down is very much welcome.
If the issue occurs with -O1
, you can binary search using the enabled flags from -O1
. Otherwise, you can remove -O1
flags from -O2
.
Binary search is faster than going through each of them one by one. Best case scenario is that you only have to try 7 times (for a total of 94 flags to search through).
Yes, I agree that’s a good plan, @uyjulian. :)
Have you submitted the SQRT fix to Binutils?
Not yet, but I plan to.
By the way, how’s it going with the R5900 MMI auto-vectorisation?
Not much further than last time we spoke. I think I'll finish the MMI support by disabling MIPS-MSA. Rewriting both MIPS-MSA and R5900-MMI into a more generic form is too much for me to do alone, so I will not be submitting this to gcc.
Back on topic: I've tried a simple hello world irx module using gcc9 (from @uyjulian ) and the ps2dev environment (code here). It also breaks when using anything other than -O0. The issue is in the macro's used for the import tables. It's a mixed bag of C structs with inline assembly: https://github.com/ps2dev/ps2sdk/blob/4a2d422a1824cd2374a41726884d9b470015aa2c/iop/kernel/include/irx.h#L55-L72 gcc will reorder these, causing the import tables to become invalid.
But iopmod does not seem to be using this. I've taken a closer look at how you're creating the import tables, and wow; it's an amazing piece of art. I can learn a thing or 2 from there. But as for the import tables, you seem to be using assembly only: https://github.com/frno7/iopmod/blob/master/include/iopmod/module-symbol.S So I don't think gcc is reordering those.
I've tried a simple hello world irx module using gcc9 (from @uyjulian ) and the ps2dev environment (code here). It also breaks when using anything other than -O0.
Which specific -f
optimization breaks it? If it's -ftoplevel-reorder
, I think that was already disabled when compiled imports.o
and exports.o
, so it may be something else.
@rickgaiser, I’m glad you found it inspirational! I made an effort to put as much logic as possible into the tooling, so that writing the actual IOP modules becomes effortless. This means that a module function declaration such as
serves quadruple-duty: the id_
macro is used to produce both import and export directives, automatically, and the rest of the declaration is used by the C compiler as usual. The standard IOP modules are declared in the same way, for example sifcmd
. The fourth duty is to display exports and imports, symbolically, in the iopmod-info
tool, used to inspect IRX module files.
I wrote a small C language lexer in tool/lexc.c
, which is actually complete, and a simple ELF parser in tool/elf.c
for the tooling.
The GNU linker option --gc-sections
ensures that only the relevant pieces of code are linked into the final module, which saves space without any manual effort. There is also a special automatic _module_init
which sets up the exports automatically, so modules don’t do that themselves. It then calls _module_start
, which is declared with module_init
in the module, as in for example
https://github.com/frno7/iopmod/blob/a5d2360a4f8a09f6fcf4dc58f6c87bcd1f926bc8/module/irqrelay.c#L305
Modules that don’t initialise anything don’t need to declare a module_init
at all, so that simplification works too. The module/printk.c
module for example only exports a function that prints in the Linux kernel log, similar to the kernel’s own printk
.
The module/irqrelay.c
module on the other hand sets up an RPC service, for relaying interrupts, so it needs the initialisation, but since it doesn’t export any functions its header file
is correspondingly simple, having only to declare the identification and the version of the module.
@rickgaiser, one more thing: The Linux kernel itself also understands IRX file module export and import directives, which means that the Linux kernel automatically prelinks any IOP modules from ROM and from files, as necessary, without manual effort, when an IOP module is requested with iop_module_request
:
/**
* iop_module_request - link requested IOP module unless it is already linked
* @name: name of requested module
* @version: requested version in BCD, where major must match with a least
* the same minor
* @arg: module arguments or %NULL
*
* Module library dependencies are resolved and prelinked as necessary. Module
* files are handled as firmware by the IOP module linker.
*
* IOP module link requests are only permitted if the major versions match
* and the version is at least of the same minor as the requested version.
*
* Context: mutex
* Return: 0 on success, otherwise a negative error number
*/
int iop_module_request(const char *name, int version, const char *arg)
@rickgaiser, for the fun of it I ported your Hello world application in commit 149740e86b3e. It should work like your own, and with PS2DEV. Here’s what the iopmod-info
tool says about it:
% tool/iopmod-info module/hello.irx
iopmod id addr 0xb0
iopmod id name hello
iopmod id version 0x0100
iopmod entry addr 0x0
iopmod unknown addr 0xd0
iopmod text size 172
iopmod data size 0
iopmod bss size 0
iopmod version 0x0100
iopmod name hello
import library 0x0102 stdio
import 0 4 printf stdio_printf
Notice that printf
is actually an (undeclared but linkable) alias for stdio_printf
, which in IOPMOD terms is considered to be its proper name by the declaration:
The disassembly with mipsr5900el-unknown-linux-gnu-objdump -dr module/hello.irx
reveals _module_init
that would do the exports, had there been anything to export, and its invocation of _module_start
:
00000000 <_module_init>:
0: 27bdffe8 addiu sp,sp,-24
4: afbf0014 sw ra,20(sp)
8: afa40018 sw a0,24(sp)
c: afa5001c sw a1,28(sp)
10: 8fa5001c lw a1,28(sp)
14: 8fa40018 lw a0,24(sp)
18: 3c020000 lui v0,0x0
18: R_MIPS_HI16 _module_start
1c: 24420038 addiu v0,v0,56
1c: R_MIPS_LO16 _module_start
20: 0040f809 jalr v0
24: 00000000 nop
28: 8fbf0014 lw ra,20(sp)
2c: 27bd0018 addiu sp,sp,24
30: 03e00008 jr ra
34: 00000000 nop
00000038 <_module_start>:
38: 27bdffe8 addiu sp,sp,-24
3c: afbf0014 sw ra,20(sp)
40: afa40018 sw a0,24(sp)
44: afa5001c sw a1,28(sp)
48: 3c020000 lui v0,0x0
48: R_MIPS_HI16 .rodata
4c: 244400c0 addiu a0,v0,192
4c: R_MIPS_LO16 .rodata
50: 3c020000 lui v0,0x0
50: R_MIPS_HI16 printf
54: 2442009c addiu v0,v0,156
54: R_MIPS_LO16 printf
58: 0040f809 jalr v0
5c: 00000000 nop
60: 24020001 li v0,1
64: 8fbf0014 lw ra,20(sp)
68: 27bd0018 addiu sp,sp,24
6c: 03e00008 jr ra
70: 00000000 nop
74: afa40000 sw a0,0(sp)
78: afa50004 sw a1,4(sp)
7c: 00001025 move v0,zero
80: 03e00008 jr ra
84: 00000000 nop
88: 41e00000 0x41e00000
8c: 00000000 nop
90: 00000102 srl zero,zero,0x4
94: 69647473 0x69647473
98: 0000006f 0x6f
0000009c <printf>:
9c: 03e00008 jr ra
a0: 24000004 li zero,4
...
https://github.com/frno7/iopmod/blob/13732bffa01dd29f3289a5a340a2ff08401557d3/Makefile#L16
It is presently unclear why IOP modules don’t quite work with the
-O2
compiler optimisation option.