Closed GoogleCodeExporter closed 8 years ago
[deleted comment]
To be clear, the breakage is on line 237 where you change
LZ4_STREAMDECODESIZE_U32 from 4 to 8 bytes, changing the size of
LZ4_streamDecode_t.
Original comment by d...@falconindy.com
on 28 Nov 2014 at 7:14
Indeed,
the internal content of state LZ4_streamDecode_t has changed,
in order to support some new scenario
(namely, short distance ring buffers).
As a consequence, it requires to track more values,
hence to increase the size of state.
Sorry about inconvenience.
Note that the library version has been upgraded from 1.3.1 to 1.4.0.
Consequently, the new soname is liblz4.so.1.4.0.
Systems can still host both versions concurrently liblz4.so.1.3.1 and
liblz4.so.1.4.0. But I suspect your program uses instead the symbolic link
"liblz4.so" or "liblz4.so.1" which probably tracks latest version.
Original comment by yann.col...@gmail.com
on 28 Nov 2014 at 10:11
No, the soname is not the filename.
$ readelf -d /usr/lib/liblz4.so | grep SONAME
0x000000000000000e (SONAME) Library soname: [liblz4.so.1]
If you change the ABI or API (not simply extend), you *must* bump the soname
version. r124 should have produced liblz4.so.2.
Original comment by d...@falconindy.com
on 28 Nov 2014 at 10:53
OK, I guess it comes from this line in the Makefile :
SONAME_FLAGS = -Wl,-soname=liblz4.$(SHARED_EXT).$(LIBVER_MAJOR)
So only LIBVER_MAJOR is part of the soname.
I don't know much the logic behind soname. So I'm not sure if it's normal that
only LIBVER_MAJOR is part of the name.
But it looks strange if every minor change requires a major numbering update.
What the point of having a 3-parts numbering system then ? Is that just to
track comment changes ?
Original comment by yann.col...@gmail.com
on 28 Nov 2014 at 11:14
I don't have a better answer than simply: this is the way things are. It would
be atypical that the soname contains more than a single number to version it.
The tag really is quite arbitrary, and so long as you never reuse a tag to
refer to 2 different ABIs, you can name it whatever you want.
https://autotools.io/libtool/version.html
This site is specific to libtool/autotools, but the versioning recommendations
(especially section 4.1) apply to any shared library released into the wild.
Original comment by d...@falconindy.com
on 28 Nov 2014 at 11:31
Thanks for the link. It's pretty clearly described.
Original comment by yann.col...@gmail.com
on 28 Nov 2014 at 11:41
Please note that the "raw" sonames and the libtool version-info are not a
match, the soname is built out of the version-info, so you should not apply the
rules from my guide to non-libtool usage.
The three-parts version system is a bit mental, to be honest — I would
suggest you keep it simpler as a two-components version, and bump the major
(soname) version if you change ABI, and the minor (non-soname) if you extend it
(add functions).
Note that you *can* change the ABI without changing the API (e.g. by changing
an int32_t parameter to int64_t), as well as you can change the API without
changing the ABI (more involved, requires the use of symbol versioning which is
mostly GNU-specific.)
https://blog.flameeyes.eu/tag/abi for reference.
Original comment by flameeyes@flameeyes.eu
on 3 Dec 2014 at 7:43
Yeah, I agree and definitely use the 3 parts version-info the way you describe.
To be fair, I wasn't expecting the small change introduced into r124 to result
into an ABI break. That's a lesson.
But to be more complete, there are more points at stake :
- The affected part of the API is within a section labelled "experimental",
which basically poses the question of how to introduce new features which are
not yet "completely settled". It's unreasonable to expect the same stability
level from new features compared to other ones supported since many years.
If tagging a section of the API "experimental" is not good enough, then how to
do it ?
- The issue comes from using types which are statically defined into the .h.
Hence they can become out of synch with the library.
But to fair, I wasn't expecting these type definitions to be used in tandem
with a shared library strategy. I was explained not so long ago that everything
within a shared library should be accessible without relying on a *.h file.
Hence, there are 2 functions, which create and free memory for the state
object. These functions are "by design" synchronized with the library, and
therefore safer to use.
The statically defined type was intended to be used with static library in mind.
For dynamic library, the constructor/destructor functions are supposed to be
used instead, because this method is more stable in relation with version
management.
How to achieve that ? Is that a reasonable objective ?
I suspect that adding comments within the header file will not be enough.
Original comment by yann.col...@gmail.com
on 3 Dec 2014 at 11:37
Ah, and I should add :
I'm heavily using Continuous Integration tests with each minor release of LZ4 :
https://travis-ci.org/Cyan4973/lz4#
It's an invaluable tool and works very well.
The number of tests run is now pretty large and cover a large palette of
situations.
However, there is no test related to Dynamic Library usage.
Hence there is no test able to spot an ABI breaking change.
It would be great too if some new tests could be added to the CI to keep an
automated eye on such issues.
Original comment by yann.col...@gmail.com
on 3 Dec 2014 at 11:45
What you can do is make sure that the type is opaque, by not defining it at all
in the header file; as long as your API calls are only accepting a pointer, you
don't have to define the structure publicly at all (just say "struct foobar;",
then you can use "struct foobar *" as you need).
If you define the structure in the header file, there is no hope to just say
"don't allocate it statically", somebody will.
Original comment by flameeyes@flameeyes.eu
on 3 Dec 2014 at 8:42
Well, to be clear, the intention is :
allocating the structure is a valid operation
(actually, some embedded systems do require such a behavior)
*when* the source code is compiled directly within the program.
In such case, there is no risk of mismatching versions.
However when the library is compiled as a shared DLL,
it is no longer recommended, due to risks of different versions.
Better use the constructor functions.
Basically, they look like 2 incompatible objectives.
The only solution that came to my mind is
having 2 separate *.h header files,
one for direct compilation/integration,
and one for DLL.
But it looks hackish.
Furthermore, I'm not aware of any other project having done that before.
Original comment by yann.col...@gmail.com
on 3 Dec 2014 at 8:50
Many projects use the idea of an opaque context object. It's a reliable method
of creating an API which can be evolved while maintaining backwards
compatibility. It doesn't matter what the true size of the context struct is
because it's self-contained within the lib. You're right to want to avoid
having multiple header files -- this is not a solution you will be happy with
in the long term.
Could you explain what you mean about allocating the structure being a valid
operation? Why would allocating the struct on the heap be a problem? Other code
inside lz4 allocates memory on the heap. What's the difference?
Original comment by d...@falconindy.com
on 3 Dec 2014 at 9:05
For me, and for PC environment in general, I feel there is no problem with that.
Such requests come from embedded environment, which can be very different.
I remember having met in the past several programmers which were suspicious of
standard C libs, let alone malloc(). So I wasn't surprised when I received
several requests to allow a way for "custom allocation methods" and even
"statically allocated state". Of course within such environments, the concept
of DLL doesn't even exist.
The first solution proposed was to provide the size of structure to allocate.
Hence function such as :
int LZ4_sizeofState(void);
It works, but it's still not completely fine, because it doesn't help static
allocation scenarios, and there is also a need to ensure right alignment. Sure,
malloc() is safe regarding alignment, but a "custom function" may be not.
"unalignment" costs little penalty on x86 environment, but for embedded
systems, it can be catastrophic (CPU exception, program crash).
Putting an advise in the comment is better than nothing, but still not really
automated enough.
Hence the second idea.
The size to allocate is expressed as a table of int32 or int64.
It will force the compiler to ensure alignment.
Note that current structures are already partially opaque : only the size is
accessible, for allocation purpose. The content remains hidden, and can change.
This trade-off was looking good so far.
Well, that is, before this new issue appeared : different versions, one used to
compile the program, and another one (DLL) present on the target system, with
difference table sizes as a consequence.
As I said, this "table" methodology was primarily intended for static
allocation within embedded environment. More powerful systems, those using DLL,
should really better use the constructor methods.
Except I can't force them to do so if a table structure is also present in the
header.
Plus, to be fair, static allocation has some merits, it's simply more
convenient to use in many circumstances (no malloc() also means no free(), less
complexity and less risks of memory leaks).
Another tentative idea to reduce occurrence of such situation is to "cheat" a
little on the real size of the tables, by declaring them larger than they
really need to be. This way, it would provide a bit of room for potential
future evolution, limiting the risk to introduce breaking changes.
This is all very prospective for me.
It looks difficult to please both worlds (embedded, and DLL) simultaneously.
Hence the suggestion of 2 different API.
But I agree, such solution doesn't look good from a maintenance perspective...
last minute idea : maybe a kind of "header extension for embedded environment" ?
Original comment by yann.col...@gmail.com
on 3 Dec 2014 at 9:45
The more I'm thinking about it, the more I'm suspecting that having 2 separate
headers, for Dynamic and Static linking, might be the better way to go.
Basically, the "normal" header would remain :
lz4.h
and be targeted at "DLL" API.
A more complete
lz4_static.h
would then be created, which includes "lz4.h", and also adds definitions only
valid for static linking, such as macros and structure sizes.
The last point which I'm worried about is that
advantages of strongly typed arguments might be lost in the process.
Structures with undefined size will have to be casted as void*.
Even a typedef won't help the compiler making a difference between 2 different
undefined structures, since they all must be :
typedef void* structPtrX;
So an error such as structPtr1 instead of structPtr2 won't be noticed.
Finally, something I'm wondering : differences between API & ABI.
Suppose that I remove macros from API.
It obviously impact API, but I suspect it doesn't change ABI.
Is that correct ?
I'm asking because soname seems tied to ABI changes.
Original comment by yann.col...@gmail.com
on 8 Dec 2014 at 1:06
You don't need to typedef everything to void. Your .c file can define a struct,
say 'struct foo_t', include the public header, and your public header can
define 'typedef struct foo_t foo_t'. Consumers of the API won't have the
private definition of the struct's members, so they'll only be able to define
pointers to foo_t, relying on library calls to perform allocation. The library
still knows the real size of foo_t.
You're correct that soname is tied to the ABI. Any time the ABI changes, so
must the soname. And as you've discovered, it's possible to change API without
affecting ABI. It's also possible to change ABI without affecting API, for
example, by reordering members within a struct.
I notice r125 was pushed and the build still produces liblz4.so.1 =(
Original comment by d...@falconindy.com
on 13 Dec 2014 at 11:15
The current issue is caused by a minor change in an experimental section of the
API which affected only one user so far. The said user is no longer affected,
both because it updated its reference lib (*.h) and moved to lz4frame API. So I
believe changing soname at this stage is more likely to generate confusion than
help.
I prefer keeping soname bump for a more drastic API change, that may happen
sometimes in 2016 (or not).
Original comment by yann.col...@gmail.com
on 24 Dec 2014 at 2:24
That's a truly disappointing and disheartening response. I can't take lz4
seriously as a project if this is going to be your approach to maintaining it.
This bug caused broken binaries in Arch Linux for thousands of users. To claim
that only "one" user was affected is gravely understating the effect.
Original comment by d...@falconindy.com
on 24 Dec 2014 at 3:19
Indeed, this is disappointing.
There are three reasons why WONTFIX resolution is wrong:
- there's no way to know how many programs were compiled with liblz4.so.1. Many
Linux distributions provide liblz4 as a shared library, and certainly there are
users out there linking their own programs with it. (I am, for one).
- tagging something as "experimental" in the documentation simply does not
work. For example the discussions about CONFIG_EXPERIMENTAL in Kernel config,
which is enabled by all distributions.
- there's no need to "conserve" soname numbers. They are not visible like
project version numbers. The right thing is to bump the number, even if the
incompatible change was detected afterwards. This will cause rebuilds to
happen, and the potential crashes will be avoided, even if people are not aware
of the issue yet, or haven't diagnosed it as caused by the liblz4 change. This
*is* the cleanest solution.
Original comment by zbys...@in.waw.pl
on 24 Dec 2014 at 3:48
Although we have not yet packaged r124, this will also cause problems on Gentoo
Linux, where all of our users build from source.
Please update the soname for ABI changes.
Original comment by floppymaster@gmail.com
on 24 Dec 2014 at 6:54
this bug with inconsistent .so as caused to systemd [1] to questioned the usage
of lz4 as a valid option as show in the subsequent message "o the question is
whether to replace lz4 with something more stable or to ignore the issue and
hope it doesn't happen again." also that question the stability of lz4.
[1]
http://lists.freedesktop.org/archives/systemd-devel/2015-February/028803.html
Original comment by prflr88@gmail.com
on 26 Feb 2015 at 5:25
It's probably a good idea to open this discussion again.
There's a need to keep a constructive level, seeking ways to improve situation
for the future. Blame game doesn't move us anywhere.
Zbigniew, contributed to this thread, and provided some interesting comments.
They can be a good reference for discussion.
The second comment (tagging something as "experimental" in the documentation
simply does not work) is indeed very relevant.
As a consequence, I decided to modify the way library is shipped.
There are now 2 different types of header files.
For an example, look at lz4frame.
There is an "lz4frame.h", which is the normal header, including prototypes
considered stable and supported. This header is installed within "include"
directory as part of the library installation process.
Another file, "lz4frame_static.h" is also proposed. It contains definitions
which can only be used **in the context of static linking**. As a consequence,
it's not installed within "include" directory. The file also clearly states "do
not use with a dynamic library".
Its objective is to publish definitions which can still evolve in a potentially
breaking way in future versions. They are therefore not stable enough to use
with multiple versions of the dynamic library. However, for a static linking
strategy, it's not an important consideration, and the advanced possibilities
it offers could be useful for some specific targets.
A difficult line must be drawn between "future proof API", which are best
supported when remaining "high level" and simple enough.
On the other hand, "very fine-grained API for advanced user" can be necessary
typically to support embedded environments. They have much stricter and
specific requirements.
These 2 set of requirements are almost opposite.
For example, embedded developper is typically directly in charge of memory
allocation, using static methodology. Also noteworthy, even standard C library
support might be very limited (such as malloc).
As a consequence, the number of "gears" (prototype) required for the developper
to take in charge all these parameters and produce an environment-compatible
program is much larger.
And obviously, more numerous and finer prototypes translates into increasing
chances of breaking changes.
It's not necessarily a bad thing, refusing changes is like refusing learning
and improvements.
But there should be a way to allow such kind of flexibility on new,
experimental and highly targeted prototypes,
without jeopardizing the whole numbering scheme, which objective is more in
line with the vast majority of users using stable prototypes, and which do not
care about such advanced functions.
With the new separation between "*.h" and "*-static.h", I hope to offer this
kind of flexibility (now located into *-static.h, targeting new usages or
embedded environments), while offering stability (now located into *.h) for
common environments using dynamic linking, and potentially facing multiple
library versions.
Note, by the way, that the part of API which used to be tagged "experimental"
and caused unfortunate troubles for systemd
is no longer tagged so. It is also included into lz4.h, meaning it is
considered stable from now on. A major number up will be granted in case of any
interface-breaking changes (none of which is planned for the time being).
Regards
Original comment by yann.col...@gmail.com
on 26 Feb 2015 at 2:42
Original issue reported on code.google.com by
d...@falconindy.com
on 28 Nov 2014 at 6:34