azyx3 / lz4

Automatically exported from code.google.com/p/lz4
0 stars 0 forks source link

forceinline on GCC and CLang #18

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
On gcc and clang, we can use:

#define forceinline __inline__ __attribute__((always_inline))
#define ensure_forceinline __attribute__((always_inline)) // inline or die

Original issue reported on code.google.com by baiyang@gmail.com on 12 Apr 2012 at 10:03

GoogleCodeExporter commented 9 years ago
Is that important ?
Specifically, does it improve performance, or modify the generated binary ?

Regards

Original comment by yann.col...@gmail.com on 3 May 2012 at 2:02

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
If it's not importent, why you add these lines to your source code:

#ifdef _MSC_VER
#define inline __forceinline ...

If you think forceinline in VC is important, why not in GCC?

Original comment by baiyang@gmail.com on 3 May 2012 at 2:08

GoogleCodeExporter commented 9 years ago
Well, ok, i understand, 
but that's not the main point here.

Without this define, the source code does not even compile in VC.

So it's true that i could have written
#define inline __inline
instead.

It just happen that i'm using this #define inline __forceinline generically 
across my source codes. I simply copy/pasted it. I have not really paid 
attention at the difference. I guess it does not really matter in 99% of time, 
and might result in a small difference in others.

The point is just that a #define is necessary for VC. So, while at it...

Original comment by yann.col...@gmail.com on 3 May 2012 at 10:37

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Ok, the GCC equivalent of VC's "__inline" is "__inline__". Any way, 'inline' is 
not a traditional "C Keyword" (C89), For maximum portability and performance, 
you may have the following defines:

#ifdef "VC or VC Like (e.g: ICL for windows, etc.)"
#    define yourinline __forceinline
#elif "GCC or GCC Like (e.g.: clang, etc.)"
#    define yourinline __inline__ __attribute__((always_inline))
#elif __cplusplus
#    define yourinline inline
... // other compilers ?
#else
#    define yourinline
#endif

Of couse, it may occupy more lines, but the readability of code doesn't only 
rely on the number of lines of the code. Contrarily, just write every thing 
within just one line of code cause it unreadable.

With well and consistent coding standard used, code could be very easy to read 
even if it has a bit of more lines, right? So keep it simple and clear is very 
important. But firstly, it not the whole world. And secondly, we could solve it 
by not only reduce the number of lines of the code, but also other methods 
(e.g: well coding standard and formatting, etc.).

Original comment by baiyang@gmail.com on 4 May 2012 at 4:19

GoogleCodeExporter commented 9 years ago
Isn't 'inline' C99 compliant ?

Original comment by yann.col...@gmail.com on 4 May 2012 at 7:01

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
1. As I have been sad, it's not only a standard compliant problem, but also a 
performance issue. For compute-intensive algorithm like lz4, even 1% 
performance boost are valuable.

2. There are lot of guys using a compiler which is NOT C99 compliant. 
Especially for some embedded enviroments. So I have been emphasized "C Keyword" 
as C89 :-)

Original comment by baiyang@gmail.com on 4 May 2012 at 7:10

GoogleCodeExporter commented 9 years ago
OK, these are 2 separate objectives :

Regarding performance : i'm not interested in making the generic code more 
complex for the perspective of maybe 1% speed improvement in some circumstances.
As has been said, if you feel this is a better fit for your needs, i invite you 
to fork the generic code. Such adaptation is in fact welcomed : the generic 
code cannot expect to be the end solution for all needs.

The second objective is more interesting, for Compilers of embedded 
environments.

A word on C89 vs C99 : my initial intention was to be C99 compliant only, but 
well, as you say, not all compilers have reached that level (even though we are 
in 2012!). The most important one which comes to mind being Visual. Surely 
other ones are on the list too.

I'm not going to be "totally sticking" to C89 either. 
As far as know, all "limited" compilers i'm aware of are at least C89 plus a 
good subset of C99, with only a few missing features which forbid it to be 
called "completely C99 compliant".
For example, i do like much the '//' mark for comments, which is C99, and i'm 
not aware of any compiler which does not support it. So I do not plan to get 
rid of it, since it is much too convenient for code reading and maintenance.

However, if it is possible to make LZ4 easier to use for "limited" (i.e. not 
C99) compilers, this is welcomed.

I don't think that introducing complex and GCC-only attributes is going into 
the right direction. Rather, a simpler and more generic route would be to 
remove 'inline' keyword entirely (as i did with "restrict" for example).

So it would give something like this :

#if __STDC_VERSION__ >= 199901L // C99
/* "restrict" is a known keyword */
#else
#  define restrict // Disable restrict
#  define inline   // Disable inline
#endif

The only problem i have with this modification is that i have receive no 
problem report regarding the use of 'inline' keyword up to now. This is like 
removing an imaginary problem that no one seems to experience. So is this 
change worth introducing ?

Original comment by yann.col...@gmail.com on 10 May 2012 at 12:11

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
En ... may be you are right, So you may delay this "fix" until some one have 
been encounter it in a real case.

Although I think it's realy common case for some embedded enviroments.

Original comment by baiyang@gmail.com on 10 May 2012 at 10:05

GoogleCodeExporter commented 9 years ago
Delayed or cancelled. Possible "inline" disabling if issue with compilers for 
embedded target really show up.

Original comment by yann.col...@gmail.com on 10 May 2012 at 11:38

GoogleCodeExporter commented 9 years ago
just to say I'm now getting this issue with inline using latest xcode 4.4 on 
both Apple LLVM complier 4.0 and LLVM GCC 4.2

Undefined symbols for architecture x86_64:
  "_LZ4_compress64kCtx", referenced from:
      _LZ4_compress_limitedOutput in lz4.o
  "_LZ4_compressCtx", referenced from:
      _LZ4_compress_limitedOutput in lz4.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Original comment by Chu.Joe....@gmail.com on 5 Aug 2012 at 1:29

GoogleCodeExporter commented 9 years ago
Could it be related to issue 29, currently opened ?

Original comment by yann.col...@gmail.com on 5 Aug 2012 at 9:18