Merge asm and wasm triples?

kripken commented 8 years ago

Right now fastcomp uses the asmjs triple, and the new wasm backend in LLVM uses the wasm triple. For most purposes they are the same - both are simple 32-bit targets - but they also differ in some ways, such as the wasm triple having extended long double support (like x86), while the asmjs triple defines long doubles as doubles (like ARM).

It would be nice to use the same triple for everything. Then bitcode would be the same in both cases, and it could be compiled to asm.js or wasm at the last stage, without generating all new bitcode.

Things stopping this are:

Extended long double support in the wasm triple means that libc will include more code, every math function has a long double version, etc. This is currently an issue even for programs that never use long doubles, as printf can call the long double code internally (but @sunfishcode suggests that might be optimized away). So switching the asmjs triple to wasm would increase code size and build times. We should measure this, and consider options given that.
Other differences may exist. For example, the asm.js target in fastcomp defines a cost for various operations, sets up SIMD to work properly, etc. wasm might not set those costs, so it might use defaults that differ, and it has no SIMD so I assume that isn't set up either. We'd need to make sure all this stuff does not regress when considering moving the asmjs triple to wasm.

sunfishcode commented 8 years ago

The TargetTransformInfo costs are now merged, as of r270508. We'll obviously still want to test that the overall switch doesn't have significant regressions.

sunfishcode commented 8 years ago

As one data point, BananaBread with 128-bit long double is currently 2207933 bytes, without is 2191127 bytes. That's a difference of 16806 bytes, or about 0.8%.

kripken commented 8 years ago

Interesting.

What's the difference on hello world code size and build time?

juj commented 8 years ago

Having the same triple would be very good, even though it might give some size regression. Having the ability to compile the same bitcode to both asm.js and wasm would indeed be beneficial.

Btw, how does wasm treat long doubles when executing? I suppose they'll just be doubles, or does wasm actually execute via the 80-bit x87 FPU stack? (That would be interesting from an ARM perspective.)

dschuff commented 8 years ago

What about the rest of the ABI; does asm.js have things like this-returning constructors, ARM-style guard variables and member functions, etc? I can see how for the near term it would be valuable for people wanting to compile both asm.js and wasm versions of their programs, but I'd hate to see us be stuck with a particular ABI forever just because it happened to match asm.js.

I guess for now you're really talking about using the actual asm.js triple and datalayout and all anyway? Does that have i64 as a "native" type in the datalayout?

sunfishcode commented 8 years ago

@juj Currently, the wasm target uses 128-bit long doubles, with software implementations of all the operators (in the compiler-rt library). This provides more useful functionality than just making long double the same as double; however, it is slow and takes more code size. long double is rare in practice, so it often doesn't matter, but it does come up sometimes. Should we change wasm to use f64 for long double, or is asm.js ok with 128-bit long double?

I have a preference for keeping long double at 128 bits, and am optimistic that things like printf can eventually be written in a way that minimizes their use of the long double support code, however that's not what we have today, and there is a cost.

@dschuff I believe the discussion here includes changing the asm.js ABI to match the wasm ABI where it makes sense to do so. This-returning constructors would be good for asm.js in the same way that they are for wasm, so we should keep that.

You raise an interesting issue with the "native" type set in the datalayout string. Asm.js doesn't have native i64, and wasm does. For the most part, the places where this impacts the ABI are handled below the LLVM IR level, so it's not a problem. However, if LLVM's optimizer thinks i64 is a "native" type, it will try to optimize to use i64 more aggressively, which might be bad for asm.js. We should study this more.
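
To make the "native" type point concrete, here is a small Python sketch that reads the `n` component of an LLVM datalayout string, which lists the integer widths the target treats as native. The two strings are illustrative assumptions written in the general style of an asm.js-like and a wasm32-like layout; they are not taken from this thread and have not been checked against any particular LLVM revision. `native_int_widths` is a hypothetical helper, not an LLVM API.

```python
def native_int_widths(datalayout: str) -> list[int]:
    """Return the native integer widths declared by the 'n' spec, if any."""
    for spec in datalayout.split("-"):
        # 'n32:64' means 32-bit and 64-bit integers are both native.
        # The digit check skips other specs that start with 'n' (e.g. 'ni:...').
        if spec.startswith("n") and spec[1:2].isdigit():
            return [int(width) for width in spec[1:].split(":")]
    return []


# Assumed example strings, for illustration only.
ASMJS_LIKE = "e-p:32:32-i64:64-v128:32:128-n32-S128"
WASM_LIKE = "e-m:e-p:32:32-i64:64-n32:64-S128"

print(native_int_widths(ASMJS_LIKE))  # [32]
print(native_int_widths(WASM_LIKE))   # [32, 64]
```

If the two targets shared one datalayout, the optimizer would see a single answer here, which is why the choice between `n32` and `n32:64` matters for asm.js performance.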

kripken commented 8 years ago

I'd prefer to use f64 for long double. I just don't find 128-bit doubles compelling for the standard platform; let the extremely rare use cases use a software library, as this will be emulated in software anyhow. (I.e., if we had 128-bit hardware-accelerated doubles in wasm, my opinion would be reversed.)

ABI-wise, in general emscripten has never made guarantees, which was kind of easy since almost everyone built entire projects to bitcode then to asm.js as a whole. (And for rare cases of people using our dynamic linking, they need to rebuild with the same emscripten version, no promises things will continue to work.) So mostly I don't think ABI is a concern here, aside from the perf issues @sunfishcode mentions.

However, one issue is i64 ffis: consider a function receiving an i64 that is called both internally and exported, then we would need to use two i32s due to JS limitations (so even internal calls would be a little sad). This might need to be an option in the backend, as not everyone cares about the JS limitation. IOW perhaps the ABI depends on the outside embedding?
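
As a rough illustration of the "two i32s" point, the Python sketch below splits a 64-bit integer into two 32-bit halves and joins them again, which is the kind of work a JS-facing boundary has to do for an i64 parameter. The function names and the low-half-first ordering are assumptions made for this example; the thread does not specify the actual calling convention.

```python
MASK32 = 0xFFFFFFFF


def split_i64(value: int) -> tuple[int, int]:
    """Split a 64-bit integer into (low, high) unsigned 32-bit halves."""
    bits = value & 0xFFFFFFFFFFFFFFFF  # two's-complement view of the value
    return bits & MASK32, (bits >> 32) & MASK32


def join_i64(low: int, high: int) -> int:
    """Rebuild a signed 64-bit integer from its two 32-bit halves."""
    bits = ((high & MASK32) << 32) | (low & MASK32)
    # Reinterpret as signed if the top bit is set.
    return bits - (1 << 64) if bits & (1 << 63) else bits


low, high = split_i64(-2)
print(hex(low), hex(high))  # 0xfffffffe 0xffffffff
print(join_i64(low, high))  # -2
```

An internal wasm-to-wasm call could pass the i64 directly and skip both steps, which is the cost being weighed when one function serves both kinds of caller.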

juj commented 8 years ago

Ooh, software-implemented 128-bit arithmetic in compiler-rt? That sounds cool. I'd be in favor of having that in then, as long as we have a good DCE mechanism to ensure that we don't end up pulling all of that into builds that don't need it. (E.g., does printf pull in much long double stuff from compiler-rt?) In my experience, games rarely call the *l variants of the C runtime by accident.

Although, what happens with long double on Clang/LLVM when compiling native executables? Does it always turn to software-emulated 128-bit arithmetic? If it does, then I like us following suit. If it doesn't, then we might want to reconsider as well.

Are there any LLVM/Clang compiler flags that affect the behavior of long double? (or do those always affect the used triple as well?)

sunfishcode commented 8 years ago

@kripken The hello world -O0 code size difference is currently 31712 bytes with vs 24593 bytes without.
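
For comparison, the short Python sketch below puts the two measurements quoted in this thread (BananaBread and hello world -O0) side by side as absolute and relative overhead. The `overhead` helper is a name made up for this example; the byte counts are the ones reported above.

```python
def overhead(with_ld128: int, without_ld128: int) -> tuple[int, float]:
    """Return (extra bytes, extra size as a percentage of the smaller build)."""
    extra = with_ld128 - without_ld128
    return extra, 100.0 * extra / without_ld128


for name, with_size, without_size in [
    ("BananaBread", 2207933, 2191127),
    ("hello world -O0", 31712, 24593),
]:
    extra, percent = overhead(with_size, without_size)
    print(f"{name}: +{extra} bytes ({percent:.1f}%)")
```

The extra code is a few kilobytes to a few tens of kilobytes in both cases, so it is well under 1% of the large program but close to 29% of the small one.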

@juj We have DCE, however printf does end up requiring several kilobytes of support code and it's not easy to avoid that.

On native targets, long double varies. On x86/x64, long double is the 80-bit extended-precision type. On some other platforms, long double is a 128-bit software-emulated type. On still others, it's 64-bit, the same as double. Often, it depends on the OS as well as the hardware.

Some targets support the flags -mlong-double-64, -mlong-double-80, and -mlong-double-128 to select what long double is to be. However, these are ABI-breaking flags, so the whole program must be compiled the same way. That might be acceptable for asm.js/wasm for now, though it will be less practical with dynamic libraries.
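
To give a sense of what the three widths buy, the Python sketch below lists the significand precision of the usual 64-bit, 80-bit, and 128-bit formats and derives the number of decimal digits each preserves, using the same formula C uses for DBL_DIG and LDBL_DIG. The precision values are the standard ones for IEEE binary64, x87 extended, and IEEE binary128; they are background facts rather than anything stated in this thread, and the names in the code are made up for the example.

```python
import math

# Significand precision in bits, counting the leading bit.
FORMATS = {
    "64-bit (IEEE binary64)": 53,
    "80-bit (x87 extended)": 64,
    "128-bit (IEEE binary128)": 113,
}


def decimal_digits(precision_bits: int) -> int:
    """Decimal digits that survive a round trip, as in C's DBL_DIG/LDBL_DIG."""
    return math.floor((precision_bits - 1) * math.log10(2))


for name, bits in FORMATS.items():
    print(f"{name}: {bits}-bit significand, {decimal_digits(bits)} decimal digits")
```

This works out to 15, 18, and 33 decimal digits respectively, so the 128-bit format roughly doubles the precision of double, at the cost of doing every operation in software.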

kripken commented 8 years ago

@sunfishcode: thanks for the numbers on hello world. That worries me. While hello world isn't that important by itself, it means a potentially big increase in code size for small programs. It also means increased build times for those programs, which could be significant on the bots where we run tons of small programs.

The bottom line is that including a few K of code that will almost never be used sounds like a bad idea. Imagine if every CSS file needed a few K of boilerplate that did nothing. The smaller wasm programs are, the faster wasm adoption will be.

dschuff commented 8 years ago

Because of the build breakage we found in https://github.com/kripken/emscripten/pull/4501#discussion-diff-76284751 I was looking into musl's support for 128-bit long doubles. From what I can judge based on the commit history, ld128 support is sort of best-effort and not tested yet. Aside from the binary size issues mentioned above, we also have some known brokenness in functions like printf and scanf that seems suspiciously like it might be related to musl's use of long double as well. There have been some updates for long double since we last updated our musl, and bringing them in might help. But I don't particularly want to be musl's guinea pig, and I'm starting to think that it's not worth it to have fp128 right now when we have so many more important things. I may shortly try flipping long double back to double and seeing if some of that breakage goes away.

dschuff commented 5 years ago

We kept long doubles, merged some of the ABI types and mangled names in libraries such as libc, but decided not to do a full merging of the bitcode ABI or triple.

emscripten-core / emscripten

Merge asm and wasm triples? #4340