open-watcom / open-watcom-v2

Open Watcom V2.0 - Source code repository, Wiki, Latest Binary build, Archived builds including all installers for download.
Other
942 stars 157 forks source link

wpp: static functions are stripped without warning if only referenced by inline assembly #1193

Open crazii opened 6 months ago

crazii commented 6 months ago

Code: test.cpp

#include <stdio.h>

static void testfunc()
{
    printf("test\n");
}

#define USE_TEST2 1
#if USE_TEST2
void test2();
#pragma aux test2 =\
"call testfunc"
#endif

int main()
{
#if USE_TEST2
    test2();
#endif
    return 0;
}

Compile: wpp output/test.cpp -w2 -fr -fo=output/test.obj

wdis output:

Module: /home/crazii/test.cpp
GROUP: 'DGROUP' CONST,CONST2,_DATA,_BSS

Segment: _TEXT BYTE USE16 0000000C bytes
0000                            main_:
0000  B8 02 00                          mov             ax,0x0002
0003  E8 00 00                          call            __STK
0006  E8 00 00                          call            void __near testfunc()
0009  31 C0                             xor             ax,ax
000B  C3                                ret

Routine Size: 12 bytes,    Routine Base: _TEXT + 0000

No disassembly errors

Segment: CONST BYTE USE16 00000000 bytes

Segment: CONST2 BYTE USE16 00000000 bytes

Segment: _DATA BYTE USE16 00000000 bytes

Segment: _BSS BYTE USE16 00000000 bytes

BSS Size: 0 bytes

And wlink also successfully generated a executable (this might also be a problem of wlink) but the executable will crash/freeze.

if #define USE_TEST2 0 then testfunc is not referenced and there's a warning in a normal way.

wcc works fine. Compared output of wcc:

Module: /home/crazii/test.cpp
GROUP: 'DGROUP' CONST,CONST2,_DATA

Segment: _TEXT BYTE USE16 0000001D bytes
0000                            testfunc_:
0000  B8 04 00                          mov             ax,0x0004
0003  E8 00 00                          call            __STK
0006  B8 00 00                          mov             ax,offset DGROUP:L$1
0009  50                                push            ax
000A  E8 00 00                          call            printf_
000D  83 C4 02                          add             sp,0x0002
0010  C3                                ret

Routine Size: 17 bytes,    Routine Base: _TEXT + 0000

0011                            main_:
0011  B8 02 00                          mov             ax,0x0002
0014  E8 00 00                          call            __STK
0017  E8 00 00                          call            testfunc_
001A  31 C0                             xor             ax,ax
001C  C3                                ret

Routine Size: 12 bytes,    Routine Base: _TEXT + 0011

No disassembly errors

Segment: CONST WORD USE16 00000006 bytes
0000                            L$1:
0000  74 65 73 74 0A 00                               test..

Segment: CONST2 WORD USE16 00000000 bytes

Segment: _DATA WORD USE16 00000000 bytes
jmalak commented 6 months ago

it is not a bug. It is problem with C++ name mangling. the function testfunc is C++ not C therefore its name is mangled to W?testfunc$n()v. use -a option for wdis to see mangled names. assembly function use unmangled symbol 'testfunc'. you need to change testfunc to C or use mangled name in assembly. The same problem is with assembly code in C and C++ when you use various calling convention because each use different name mangling. You are responsible to use correct name in in-line assembly call.

crazii commented 6 months ago

If it was name mangling, why it works when static is removed? It is confusing, because I didn't use extern "C", and the assembly calls the right mangled symbol name. Should be the assembly calls the same symbol in all cases? The behavior is inconsistent.

wdis -a test.obj when only static is removed:

.387
                PUBLIC  `W?testfunc$N()V`
                PUBLIC  main_
                EXTRN   __STK:BYTE
                EXTRN   printf_:BYTE
                EXTRN   ___wcpp_4_data_init_fs_root_:BYTE
                EXTRN   _small_code_:BYTE
                EXTRN   _cstart_:BYTE
DGROUP          GROUP   CONST,CONST2,_DATA,_BSS
_TEXT           SEGMENT BYTE PUBLIC USE16 'CODE'
                ASSUME CS:_TEXT, DS:DGROUP, SS:DGROUP
`W?testfunc$N()V`:
        mov             ax,4
        call            near ptr __STK
        mov             ax,offset DGROUP:L$1
        push            ax
        call            near ptr printf_
        add             sp,2
        ret
main_:
        mov             ax,2
        call            near ptr __STK
        call            near ptr `W?testfunc$N()V`
        xor             ax,ax
        ret
_TEXT           ENDS
CONST           SEGMENT BYTE PUBLIC USE16 'DATA'
L$1:
    DB  74H, 65H, 73H, 74H, 0aH, 0

CONST           ENDS
CONST2          SEGMENT BYTE PUBLIC USE16 'DATA'
CONST2          ENDS
_DATA           SEGMENT BYTE PUBLIC USE16 'DATA'
_DATA           ENDS
_BSS            SEGMENT BYTE PUBLIC USE16 'BSS'
_BSS            ENDS

And of most important, it's not about name mangling, it's about warnings. There's NO WARNING but the real mangled function is UNREFERENCED and silently removed. The behavior again is inconsistent.
Because (as I said in the original issue post) if you #define USE_TEST2 0 There's a warning about unreferenced function with -w2 option: There's a Level 2 warning W202 in the document: Symbol ’%s’ has been defined, but not referenced. The compiler shouldn't make it look like it's referenced but actually it's not, when user ask for a warning.

So what about helping with an error, or a warning? It took too much time to locate such problems because there's a unresolved symbol and program still runs.

i.e. GCC/BC would get an link error of unsolved externals, I know it's probably a mane mangling problem by an educated guess, so I added extern "C" and it works. But when using wpp I need to run the program to know there's a unknown problem and I have to check the disassembly carefully to find out the reason. in a similar way of GCC I think we can make OpenWatcom better.

jmalak commented 6 months ago

good info that if it is not static then it works. It looks like issue with symbols resolution (including namespace) and mangling. You are right that it should work for both cases the same way, but because we are in C++ world it is more complex then in C. This symbol can be invisible for some reason or code generator get wrong info from FE about symbol. The mangling in C or C++ is done on low-level by cooperation between CG and FE by callbacks. Anyway it requires more analysis. I got sample code from you that it helps me significantly. Note: how it works in other toolchains is irrelevant because concept of OW compiler FE, CG and in-line assembly is very different. In-line assembly in OW is black-box with interface description (pragma aux) and FE encapsulate this black-box into intermediate code and CG do final inclusion of this black-box into generated code. Important: #pragma aux in-line code is a template you need to instantiate it by including in FE language code (call it).

crazii commented 6 months ago

Fair enough, Let's talk about OW and focus on the problem itself. Do you mean that when #define USE_TEST2 0, the #pragma aux is not included in FE so there's a unreferenced warning. I was thinking about the same.

But the truth is if you remove the code of calling test2(); in main(), there's still no warnings (W202):

#include <stdio.h>

static void testfunc()
{
    printf("test\n");
}

#define USE_TEST2 1
#if USE_TEST2
void test2();
#pragma aux test2 =\
"call testfunc"
#endif

int main()
{
    return 0;
}

Output of wpp -w2 test.cpp -fr -fo=test.obj

Open Watcom C++ x86 16-bit Optimizing Compiler
Version 2.0 beta Dec  1 2023 02:00:11 (64-bit)
Copyright (c) 2002-2023 The Open Watcom Contributors. All Rights Reserved.
Portions Copyright (c) 1989-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
See https://github.com/open-watcom/open-watcom-v2#readme for details.
test.cpp: 122 lines, included 1166, no warnings, no errors
Code size: 9

A working W202 would help me get rid of the problem in compile time, otherwise I need to manually check every presence of the referenced static function.

crazii commented 6 months ago

Let me clarify the problem a little bit: With or without static of testfunc, the inline assembly calls to right mangled symbol, But with static, the body of testfunc is stripped.

If using the mangled symbol is expected, then the problem is the removal of the referenced function, and it's OK not to generate a W202 warning since it's not the real problem.

And when the #pragma aux is not instantiated, then it's problematic not giving a W202 warning I presume.

jmalak commented 6 months ago

I think I understand the problem that I will check compiler code why this happen.

crazii commented 6 months ago

Cool, thanks! it was just me that cleared my head afterwards.

jmalak commented 6 months ago

The C++ compiler doesn't report any unused static function, simply optimized it out. It requires to add warning if static function is not used at all.

crazii commented 5 months ago

But if you #define USE_TEST2 0, there is a warning. I believe the problem is related to inline assembly, when the static function get called/referenced by it.

jmalak commented 5 months ago

Sorry, I misunderstand. It looks like it is referenced from template not from instance of in-line code, therefore it is silent even if is not used.