MLton / mlton

The MLton repository
http://mlton.org

MLton for PowerPC: any interest in having pre-built versions? #503

Open barracuda156 opened 1 year ago

barracuda156 commented 1 year ago

@MatthewFluet This is not really “an issue”, but a suggestion. If there is interest in having pre-built MLton for macOS PowerPC (at least 32-bit), I can build the latest release of 2021 and submit binaries to you. The build process is transparent (everything is in the mlton portfile). A MacPorts environment is assumed, though I could probably modify it to produce a static build if that is necessary (not in MacPorts itself, where we do not want that, since even the 2007 version works fine with our gmp, but for this specific endeavor). I cannot guarantee it builds on Tiger, but Leopard and Snow Leopard will work.

MatthewFluet commented 1 year ago

Having ppc binaries would be fine, especially if adding them to the GitHub release artifacts makes them a stable target for various bootstrap files.

barracuda156 commented 1 year ago

@MatthewFluet Noted. Let me sort out the remaining Intel builds so that I can finalize the port for now, and then I can build PPC binaries.

P. S. On a side note, is there a non-ugly way to fix the platform for a Rosetta build? The build itself works fine, and MLton works (unlike, say, GHC or SBCL, which segfault on start in Rosetta). However, the current MLton build system does not guess the arch correctly, and that breaks compilation. I would expect it to detect Intel; instead it chose PowerPC, and specifically ppc64, which cannot possibly work in Rosetta, since there are no ppc64 slices in the OS anymore. What I did to make it work was to patch platform to define arch = powerpc. This will work as a local fix in MacPorts (for the combination of 10.6 + ppc), but it is not great, and obviously completely non-portable.

This is not a matter of substantial importance at all, I just remembered I wanted to ask. It is a common problem with all ports that rely on uname: it has no notion of the emulated arch. When host/target triples are supported, things work nicely.

MatthewFluet commented 1 year ago

If you know the target platform, then you can compile with make TARGET_ARCH=powerpc, which will bypass whatever ./bin/platform would determine.
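
The override described above follows a common pattern: an explicitly supplied target wins, and uname-style detection is only a fallback. A minimal Python sketch of that pattern follows. It is not how ./bin/platform is implemented; the function name resolve_arch and the alias table are hypothetical and exist only for illustration.

```python
import os
import platform

# Hypothetical alias table from `uname -m` style strings to arch names.
# These entries are illustrative assumptions, not MLton's actual mapping.
ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "i386": "x86",
    "Power Macintosh": "powerpc",
    "ppc": "powerpc",
    "ppc64": "powerpc64",
}


def resolve_arch(env=None, machine=None):
    """Return the target arch, preferring an explicit TARGET_ARCH override.

    env     -- mapping of environment variables (defaults to os.environ)
    machine -- detected machine string (defaults to platform.machine())
    """
    env = os.environ if env is None else env
    override = env.get("TARGET_ARCH")
    if override:
        # An explicit setting bypasses detection entirely, which is what
        # makes it usable when the detected value is wrong under emulation.
        return override
    machine = platform.machine() if machine is None else machine
    # Fall back to the detected value, normalized through the alias table.
    return ARCH_ALIASES.get(machine, machine)


if __name__ == "__main__":
    # Detection alone reports whatever the host claims to be.
    print(resolve_arch(env={}, machine="ppc64"))
    # With an override, the detected value is ignored.
    print(resolve_arch(env={"TARGET_ARCH": "powerpc"}, machine="ppc64"))
```

The point of the design is that detection can only report what the host says about itself, so an emulated environment needs a way to state the target explicitly.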

The trouble with macOS is that there are lots of complicated defaulting rules; for example, if the host process is running as amd64, then cc defaults to generating amd64 code. We've tried to work around them at various times (and I thought that it was mostly working; look through old issues).

One complication in MLton with supporting target triples is that we need to execute (not just build) a target program to extract things like #define constants for inlining into SML code (see the gen/constants target of ./runtime/Makefile).

barracuda156 commented 6 months ago

@MatthewFluet Sorry for the ridiculous delays. My build set-up on 10.5 is broken and I have no time to sort it out…

However, here are compiled binaries for 10.6 ppc: http://macos-powerpc.org/packages/mlton (The same portfile from MacPorts master is used, but MacPorts does not have ppc buildbots at the moment.)

barracuda156 commented 2 months ago

@MatthewFluet I need your advice, since I have run into a strange problem. I finally got to building MLton on 10.4 to make a pre-built binary which will be usable across PowerPC systems. There was an unrelated initial error parsing the Makefile, because Tiger's make is archaic. Switching to gmake fixed it. Then the build failed with: Out of memory. Unable to allocate 586,359,576 bytes.

I reproduced an identical failure on 10.5 and 10.6 (native ppc). Now, what is weird is that I have literally the same version compiled and installed earlier on the same machine in 10.6. I cannot say anything for sure re 10.4–10.5, since this is the first attempt there, but at least on 10.6 the build worked earlier, and fails now.

Relevant log:

   Compile SML starting
      pre codegen starting
     parseAndElaborate starting
     parseAndElaborate finished in 42.83 + 25.18 (37% GC)
     deadCode starting
     deadCode finished in 0.14 + 0.00 (0% GC)
     defunctorize starting
     defunctorize finished in 2.39 + 1.73 (42% GC)
     xmlSimplify starting
        typeCheck starting
        typeCheck finished in 1.38 + 0.53 (28% GC)
        xmlShrink starting
        xmlShrink finished in 1.73 + 2.05 (54% GC)
        xmlSimplifyTypes starting
        xmlSimplifyTypes finished in 0.52 + 0.45 (46% GC)
        typeCheck starting
        typeCheck finished in 1.04 + 1.05 (50% GC)
     xmlSimplify finished in 4.67 + 4.09 (47% GC)
     monomorphise starting
     monomorphise finished in 5.65 + 2.59 (31% GC)
     sxmlSimplify starting
        typeCheck starting
        typeCheck finished in 2.54 + 0.00 (0% GC)
        sxmlShrink1 starting
        sxmlShrink1 finished in 3.88 + 2.31 (37% GC)
        implementSuffix starting
        implementSuffix finished in 0.12 + 0.00 (0% GC)
        sxmlShrink2 starting
        sxmlShrink2 finished in 3.05 + 0.00 (0% GC)
        implementExceptions starting
        implementExceptions finished in 0.24 + 0.00 (0% GC)
        sxmlShrink3 starting
        sxmlShrink3 finished in 3.20 + 0.00 (0% GC)
        polyvariance starting
        polyvariance finished in 10.25 + 4.75 (32% GC)
        typeCheck starting
        typeCheck finished in 1.65 + 0.00 (0% GC)
     sxmlSimplify finished in 24.93 + 7.06 (22% GC)
     closureConvert starting
        flow analysis starting
        flow analysis finished in 1.45 + 0.00 (0% GC)
        free variables starting
        free variables finished in 0.76 + 0.00 (0% GC)
        globalize starting
        globalize finished in 0.46 + 0.00 (0% GC)
        convert starting
        convert finished in 9.72 + 14.65 (60% GC)
     closureConvert finished in 13.12 + 14.65 (53% GC)
     ssaSimplify starting
        typeCheck starting
        typeCheck finished in 6.69 + 0.00 (0% GC)
        removeUnused1 starting
        removeUnused1 finished in 6.15 + 1.49 (19% GC)
        introduceLoops1 starting
        introduceLoops1 finished in 0.09 + 0.00 (0% GC)
        loopInvariant1 starting
        loopInvariant1 finished in 3.11 + 1.85 (37% GC)
        inlineLeaf1 starting
        inlineLeaf1 finished in 4.24 + 1.70 (29% GC)
        inlineLeaf2 starting
        inlineLeaf2 finished in 3.21 + 1.61 (33% GC)
        contify1 starting
        contify1 finished in 2.68 + 1.52 (36% GC)
        localFlatten1 starting
        localFlatten1 finished in 2.57 + 0.00 (0% GC)
        constantPropagation starting
        constantPropagation finished in 6.55 + 5.04 (43% GC)
        useless starting
        useless finished in 12.08 + 6.10 (34% GC)
        removeUnused2 starting
        removeUnused2 finished in 3.29 + 1.88 (36% GC)
        simplifyTypes starting
        simplifyTypes finished in 4.66 + 5.84 (56% GC)
        polyEqual starting
        polyEqual finished in 1.36 + 0.00 (0% GC)
        contify2 starting
        contify2 finished in 1.66 + 0.00 (0% GC)
        inlineNonRecursive starting
        inlineNonRecursive finished in 4.12 + 1.11 (21% GC)
        localFlatten2 starting
        localFlatten2 finished in 3.03 + 0.00 (0% GC)
        removeUnused3 starting
        removeUnused3 finished in 4.28 + 1.09 (20% GC)
        contify3 starting
        contify3 finished in 2.53 + 0.00 (0% GC)
        introduceLoops2 starting
        introduceLoops2 finished in 0.04 + 0.00 (0% GC)
        loopInvariant2 starting
        loopInvariant2 finished in 2.98 + 0.69 (19% GC)
        localRef starting
        localRef finished in 5.01 + 0.69 (12% GC)
        flatten starting
        flatten finished in 4.22 + 0.00 (0% GC)
        localFlatten3 starting
        localFlatten3 finished in 2.95 + 1.27 (30% GC)
        commonArg starting
        commonArg finished in 4.18 + 1.15 (22% GC)
        commonSubexp starting
        commonSubexp finished in 4.76 + 0.75 (14% GC)
        commonBlock starting
        commonBlock finished in 1.97 + 0.00 (0% GC)
        redundantTests starting
        redundantTests finished in 3.82 + 1.36 (26% GC)
        redundant starting
        redundant finished in 2.12 + 2.43 (53% GC)
        knownCase starting
        knownCase finished in 12.51 + 3.54 (22% GC)
        removeUnused4 starting
        removeUnused4 finished in 4.66 + 1.97 (30% GC)
        orderFunctions1 starting
        orderFunctions1 finished in 0.96 + 0.00 (0% GC)
        typeCheck starting
        typeCheck finished in 4.15 + 0.42 (9% GC)
     ssaSimplify finished in 126.63 + 43.50 (26% GC)
     toSsa2 starting
     toSsa2 finished in 3.23 + 1.40 (30% GC)
     ssa2Simplify starting
        typeCheck starting
        typeCheck finished in 4.57 + 1.01 (18% GC)
        deepFlatten starting
        deepFlatten finished in 12.37 + 16.50 (57% GC)
        refFlatten starting
        refFlatten finished in 7.07 + 2.17 (23% GC)
        removeUnused5 starting
        removeUnused5 finished in 5.97 + 2.72 (31% GC)
        orderFunctions2 starting
        orderFunctions2 finished in 0.76 + 0.63 (45% GC)
        typeCheck starting
        typeCheck finished in 4.41 + 0.19 (4% GC)
     ssa2Simplify finished in 35.15 + 23.22 (40% GC)
     backend starting
        toRssa starting
        toRssa finished in 5.71 + 1.29 (18% GC)
        rssaSimplify starting
           rssaShrink1 starting
           rssaShrink1 finished in 3.09 + 2.75 (47% GC)
           insertLimitChecks starting
           insertLimitChecks finished in 2.66 + 0.89 (25% GC)
           insertSignalChecks starting
           insertSignalChecks finished in 0.00 + 0.00 (nan% GC)
           implementHandlers starting
           implementHandlers finished in 0.31 + 0.14 (31% GC)
           rssaShrink2 starting
           rssaShrink2 finished in 3.30 + 9.79 (75% GC)
           implementProfiling starting
           implementProfiling finished in 0.00 + 0.00 (nan% GC)
           rssaOrderFunctions starting
           rssaOrderFunctions finished in 1.52 + 0.00 (0% GC)
        rssaSimplify finished in 11.38 + 13.57 (54% GC)
        toMachine starting
        toMachine finished in 50.46 + 8.09 (14% GC)
     backend finished in 67.55 + 22.95 (25% GC)
      pre codegen finished in 327.81 + 146.37 (31% GC)
      C code gen starting
Out of memory.  Unable to allocate 586,359,576 bytes.

make[2]: *** [mlton-compile] Error 1
make[2]: Leaving directory `/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_lang_mlton/mlton/work/mlton-475cf2b14993869711f1a93a15a9fa854b5126ed/mlton'
make[1]: *** [compiler] Error 2
make[1]: Leaving directory `/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_lang_mlton/mlton/work/mlton-475cf2b14993869711f1a93a15a9fa854b5126ed'
make: *** [all] Error 2
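
Each "finished in" line in the log above carries two times and a GC percentage. The percentages are consistent with reading the first number as non-GC time and the second as GC time, e.g. 25.18 / (42.83 + 25.18) ≈ 37% for parseAndElaborate. Under that assumption, a small Python sketch can pull the passes out of a saved log and rank them by GC time. The names parse_passes and gc_percent are hypothetical helpers, not part of MLton.

```python
import re

# Matches lines such as:
#   "parseAndElaborate finished in 42.83 + 25.18 (37% GC)"
# Pass names may contain spaces (e.g. "pre codegen", "flow analysis").
FINISHED_RE = re.compile(
    r"^\s*(?P<name>.+?) finished in (?P<run>\d+\.\d+) \+ (?P<gc>\d+\.\d+) \("
)


def parse_passes(log_text):
    """Return a list of (name, run_seconds, gc_seconds) for each finished pass."""
    passes = []
    for line in log_text.splitlines():
        match = FINISHED_RE.match(line)
        if match:
            passes.append(
                (
                    match.group("name"),
                    float(match.group("run")),
                    float(match.group("gc")),
                )
            )
    return passes


def gc_percent(run, gc):
    """GC share of total pass time as a whole percent, or None if no time elapsed."""
    total = run + gc
    if total == 0:
        # The log prints "nan% GC" for this case (0.00 + 0.00).
        return None
    return round(100 * gc / total)


if __name__ == "__main__":
    sample = (
        "     parseAndElaborate finished in 42.83 + 25.18 (37% GC)\n"
        "           rssaShrink2 finished in 3.30 + 9.79 (75% GC)\n"
        "        toMachine finished in 50.46 + 8.09 (14% GC)\n"
    )
    # Rank passes by GC seconds, largest first.
    for name, run, gc in sorted(parse_passes(sample), key=lambda p: p[2], reverse=True):
        print(name, run, gc, gc_percent(run, gc))
```

Note that nested passes are counted inside their parents (ssaSimplify includes its sub-passes), so the times should be compared within one nesting level rather than summed across the whole log.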

I could blame changes in the toolchain, but it is hard to imagine how this could simultaneously affect three systems with three different toolchains (gcc7 on Tiger, gcc10 on Leopard, gcc13 on Snow Leopard). The bootstrap compiler, while very old, is the same one that was used before.

Both disk space and RAM should be sufficient. The machine works normally.

Where to look?

P. S. For the record, these are the versions I have installed on 10.6:

The following ports are currently installed:
  mlton @2023.07.21_0 requested_variants='' platform='darwin 10' archs='ppc' date='2023-07-21T21:05:56+0800'
  mlton @20230721_0 requested_variants='' platform='darwin 10' archs='ppc' date='2023-07-22T20:51:09+0800'
  mlton @20230901_0 requested_variants='' platform='darwin 10' archs='ppc' date='2023-10-16T16:47:14+0800'
  mlton @20231123_0 requested_variants='' platform='darwin 10' archs='ppc' date='2023-12-19T08:37:37+0800'
  mlton @20240119_0 requested_variants='' platform='darwin 10' archs='ppc' date='2024-01-24T06:29:09+0800'
  mlton @20240519_1 (active) requested_variants='' platform='darwin 10' archs='ppc' date='2024-05-24T22:42:36+0800'

The build I attempted now is 20240519, specifically commit 475cf2b14993869711f1a93a15a9fa854b5126ed (and it was building earlier).

MatthewFluet commented 2 months ago

So, all the different PPC macOS versions are exhibiting the Out of memory error? Is this happening on the initial compile (with an "old" MLton) or with the recompile (with the just built "new" MLton)? What's the command line that appears above the quoted compilation log? Does it have a ram-slop 0.9 or similar?

I would suggest compiling with OLD_MLTON_COMPILE_ARGS=gc-messages or MLTON_COMPILE_ARGS=gc-messages, though that will generate a lot of output; beware that if these are running on build nodes, that might exceed log limits.

It's curious that the Out of memory comes at the C code gen, which is mostly just writing strings to a file. It's keeping the final IL version of the program around (unlike a transformation pass that translates from one IL to another and (briefly) has both versions in memory), so it shouldn't need much additional memory.
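
To put the failed request in perspective, the following Python sketch does the arithmetic on the figure from the error message. Only the 586,359,576-byte request and the 0.9 ram-slop value come from the thread. The 2 GiB of physical RAM is a hypothetical example, and the sketch assumes ram-slop acts as a multiplier on physical RAM, which should be checked against the MLton runtime options documentation. It does not diagnose the failure.

```python
# Figure taken from the error message in the log.
REQUESTED_BYTES = 586_359_576

# A 32-bit process can address at most 2**32 bytes of virtual memory;
# the usable portion is smaller in practice.
ADDRESS_SPACE_32BIT = 2 ** 32


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / 2 ** 20


def share_of_address_space(n_bytes, space=ADDRESS_SPACE_32BIT):
    """Fraction of the given address space that one allocation would occupy."""
    return n_bytes / space


def ram_budget(physical_ram_bytes, ram_slop):
    """RAM the runtime would consider usable, ASSUMING ram-slop is a multiplier."""
    return int(physical_ram_bytes * ram_slop)


if __name__ == "__main__":
    print(f"request: {to_mib(REQUESTED_BYTES):.1f} MiB")
    print(f"share of 32-bit address space: {share_of_address_space(REQUESTED_BYTES):.1%}")
    # Hypothetical machine with 2 GiB of RAM and ram-slop 0.9.
    print(f"budget at ram-slop 0.9: {to_mib(ram_budget(2 * 2**30, 0.9)):.0f} MiB")
```

A single request of roughly 559 MiB is a sizable fraction of what a 32-bit process can address, so it is worth comparing against the heap sizes that gc-messages reports just before the failure.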