dropbox / lepton

Lepton is a tool and file format for losslessly compressing JPEGs by an average of 22%.
https://blogs.dropbox.com/tech/2016/07/lepton-image-compression-saving-22-losslessly-from-images-at-15mbs/
Apache License 2.0
5.01k stars 354 forks source link

Build failures on OS X #11

Closed zmwangx closed 8 years ago

zmwangx commented 8 years ago

I am trying to submit lepton to Homebrew (version 1.0 with the header patch in d122450d846cacdc299e053f8059b751c08a1aae). It compiles fine locally on my Mid-2015 15'' rMBP with OS X 10.11.5, but fails on Homebrew's Jenkins servers on all three versions of OS X tested: 10.11, 10.10 and 10.9.

Specifically,

10.11 (build log):

/usr/local/Library/ENV/4.3/clang++    -I/tmp/lepton-20160715-76959-d9zmzc/lepton-1.0 -I/tmp/lepton-20160715-76959-d9zmzc/lepton-1.0/src/vp8/util -I/tmp/lepton-20160715-76959-d9zmzc/lepton-1.0/src/vp8/model -I/tmp/lepton-20160715-76959-d9zmzc/lepton-1.0/src/vp8/encoder -I/tmp/lepton-20160715-76959-d9zmzc/lepton-1.0/src/vp8/decoder  -std=c++11 -fno-exceptions -fno-rtti -DNDEBUG -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -mmacosx-version-min=10.11   -march=core-avx2   -o CMakeFiles/lepton-avx.dir/src/lepton/idct.cc.o -c /tmp/lepton-20160715-76959-d9zmzc/lepton-1.0/src/lepton/idct.cc
/tmp/lepton-20160715-76959-d9zmzc/lepton-1.0/src/lepton/idct.cc:228:15: error: always_inline function '_mm_mullo_epi32' requires target feature 'sse4.1', but would be inlined into function 'idct_sse' that is compiled without support for 'sse4.1'
        xv8 = _mm_mullo_epi32(_mm_set1_epi32(w7), _mm_add_epi32(xv4, xv5));
              ^
        ...

10.10 (build log) and 10.9 (build log):

In file included from /tmp/lepton-20160715-88309-10duq3n/lepton-1.0/src/lepton/jpgcoder.cc:64:
In file included from /tmp/lepton-20160715-88309-10duq3n/lepton-1.0/src/lepton/vp8_decoder.hh:4:
In file included from /tmp/lepton-20160715-88309-10duq3n/lepton-1.0/src/lepton/lepton_codec.hh:4:
In file included from /tmp/lepton-20160715-88309-10duq3n/lepton-1.0/src/vp8/model/model.hh:10:
In file included from /tmp/lepton-20160715-88309-10duq3n/lepton-1.0/src/vp8/model/numeric.hh:11:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/include/smmintrin.h:28:2: error: "SSE4.1 instruction set not enabled"
#error "SSE4.1 instruction set not enabled"

Also

/tmp/lepton-20160715-88309-10duq3n/lepton-1.0/src/vp8/model/numeric.hh:295:32: error: use of undeclared identifier '_mm_mullo_epi32'; did you mean '_mm_mullo_epi16'?
    __m128i t = _mm_srli_epi32(_mm_mullo_epi32(m, abs_num), log_max_numerator);
                               ^~~~~~~~~~~~~~~
                               _mm_mullo_epi16
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.0.2/include/emmintrin.h:727:1: note: '_mm_mullo_epi16' declared here
_mm_mullo_epi16(__m128i __a, __m128i __b)
^

etc.

Any idea what's going on here? Thanks.

P.S. The linked build logs contain info (OS, CPU, compiler, etc.) about build environments at the end.


Note: I removed some of my comments on this thread because they belong to a kind of pointless argument that skews the thread.

danielrh commented 8 years ago

It's probably not sufficiently cross compatible to compile with SSE4.1, or even, right now, on 32 bit machines. Is there an easy way I can test on these machines to iterate... Curious: is the recent change https://github.com/dropbox/lepton/commit/d122450d846cacdc299e053f8059b751c08a1aae causing issues? that change may have been introduced on an older mac to fix the build there, inadvertently breaking other platforms

zmwangx commented 8 years ago

Is there an easy way I can test on these machines to iterate...

Not that I know of unfortunately, need to repeatedly throw patches at the Homebrew PR and wait for the build servers.

Curious: is the recent change d122450 causing issues?

Without that header fix it won't compile on my local machine (OS X 10.11.5 with clang-703.0.31). (Undefined __m256i or something like that.)

UnitedMarsupials-zz commented 8 years ago

@zmwangx , your machine does not have SSE4 features, apparently, but lepton's code tries to use them unconditionally. I just added a FreeBSD port of lepton -- try applying the patch-cpu from the port. It removes the SSE4 requirement (SSSE3 is still a must, unfortunately).

UnitedMarsupials-zz commented 8 years ago

@danielrh, I think, it is a mistake to always include <smmintrin.h>. Here on FreeBSD, at least, one includes the <immintrin.h>, which then includes other headers depending on the features available:

#ifdef __MMX__
#include <mmintrin.h>
#endif

#ifdef __SSE__
#include <xmmintrin.h>
#endif
....

The heavily-used _mm_mullo_epi32 still would not be found, but, at least, the displayed error will point at your code, rather than a header bundled with the compiler :)

zmwangx commented 8 years ago

@UnitedMarsupials

I just added a FreeBSD of lepton -- try applying the patch-cpu from the port. It removes the SSE4 requirement (SSSE3 is still a must, unfortunately).

Thanks for the tip. We don't accept unofficial patches though, so maybe you could submit a PR and try to get it merged?

UnitedMarsupials-zz commented 8 years ago

We don't accept unofficial patches though

Sorry, who is the "we" in this context and what do I owe them?

UnitedMarsupials-zz commented 8 years ago

You don't owe anyone, but getting good patches merged is good for everyone.

I developed the patches. I published them -- and made the lepton-developer(s) aware of them. The ball is now in their court..

I then noticed this issue you raised, analyzed your problem and offered you a solution. In response, you said, my patches are insufficiently official for some mysterious group identified only as "we"...

I still don't know, who "we" are in your earlier comment. Nor would you explain, what makes my patches "unofficial" -- as far as the "we" are concerned.

Anyway, the patches are there, try them if you wish, ignore them if you don't... Happy hacking.

I don't understand how you saw hostility in my comment.

I saw confusion, and responded with a modicum of annoyance...

danielrh commented 8 years ago

The patches look cool If you can help by following the directions in the README (fill out contributor agreement, make pull requests) they can get merged into the master branch quite rapidly :-)

UnitedMarsupials-zz commented 8 years ago

@zmwangx, you are Homebrew? Never heard of it until today. You are welcome to my patches, but I'm not spending any more effort in convincing you.

so of course your patches are unofficial, at least at the moment

Suit yourself.

fill out contributor agreement

They are now part of the FreeBSD project, which means, they are under BSD license (the most recent 2-clause one). You should not need any special agreement with me.

BTW, maybe, some day, Dropbox will release their client software in regular Python -- so that one can run it on any OS, where Python already runs. That will be the day I start using Dropbox again.

make pull requests

Maybe, some day...

UnitedMarsupials-zz commented 8 years ago

maybe you could be a little bit more respective

Can't promise "respective" -- only helpful. You can fix your immediate problem without patching by telling the compiler to enable SSE4 instruction-set. You probably do that already on your own system, which is why your private build worked.

You either need to tell it to target a CPU with SSE4-features (such as with -march=nehalem or -march=westmere), or explicitly enable just SSE4.1 with -msse4.1. Obviously, the resulting binary will not work on CPUs without SSE4.

Now, if you hold your nose and try my patch, you can produce binaries without SSE4 instructions (though SSSE3 will still be required -- too much work to rewrite all that too). That binary will work on earlier CPUs (such as core2 or bonnell), but it will not attempt SSE4-instructions even on the CPUs with them. That's because lepton does not (currently?) offer run time CPU-detection -- the instruction-set(s) to use are determined at compile time...

So, if you want to produce the fastest binaries, you'll need to produce 2 or even 3 different lepton-packages for your users:

I now have the same trilemma with my FreeBSD port -- a person building from source using the port will have no problem, but a pre-built binary package will have an executable optimized for the building machine, not the running machine.

Although the code shows some signs of run-time detection being worked-on, Lepton maintainers still have quite a way to go :-)

On an unrelated note -- pardon me for straying off-topic -- if you are building on a system (such as MacOS), where OpenSSL and ZLib are already present, you'll want to forgo the entire dependencies/ subdirectory bundled with lepton and in their stead use -lcrypto for the MD5-implementation and ZLib for, well, ZLib. FreeBSD port does just that with a separate patch.

And finally on the matter of "unofficial" patches... You simply aren't doing your job of porting/packaging third-party software (such as lepton), if you refuse to add patches to your build, that the "upstream" developers (such as @danielrh) have not yet accepted. You are right to be cautious about such patches, but you can not rule them out completely.

Some of them may never be accepted. For example, @danielrh, who works for Dropbox, may not care for machines without SSE4-instructions. Evidently, Dropbox' servers are all sufficiently new, it is not a problem for them, so Dropbox may not want to pay Daniel to verify and maintain the alternative code. And, even if that's not true in general, obviously, his priority is going to be different from yours. It is not a show-stopper for him, but it is for you...

The project's developer(s) may also not like the solution a patch is offering -- and reject it without solving the original problem at all. They are fine with the problem remaining unsolved, but you are not -- so you either have to add your own patch, or adapt somebody else's "unofficial" one.

FreeBSD ports-collection is not a bad source of patches (and they are all BSD-licensed too!), but you may also examine pkgsrc, Debian, Gentoo, and other repositories...

Good luck.

danielrh commented 8 years ago

relax everyone... lets keep it nice positive here :-)

I'm very interested in the patches that make it work on older systems--unfortunately right now life is getting in the way and I won't have time to do the merges myself very soon.

If you get a pull request together and do the contributor agreement thing I can merge them in straight away on the GUI... otherwise I will have to wait until I get more than a few minutes on the side here and there... I'm preparing for a nice vacation and want to get all my i's dotted and t's crossed before I go.

But in any case we will merge them or equivalent patches--it's only a matter of when and who does the heavy lifting.

danielrh commented 8 years ago

I merged UnitedMarsupials BSD-licensed patch. Thanks 👍

I will close for now, but if the OSX port fails, feel free to reopen

zmwangx commented 8 years ago

Thanks, I'll run the CI tests later today.

zmwangx commented 8 years ago

@danielrh Sorry, I just tested latest master (acfb6ae95edec49a14b76ab35d0b6a976d9ae7c4) on our CI, and the errors are almost exactly the same AFAICT. See http://bot.brew.sh/job/Homebrew%20Core%20Pull%20Requests/4999/.

Actually I don't see the patches from https://svnweb.freebsd.org/ports/head/graphics/lepton/files/patch-cpu applied on master. Did I miss something?

By the way, I can't reopen the issue because it wasn't closed by me.