nix-community / emacs-overlay

Bleeding edge emacs overlay [maintainer=@adisbladis]

gccemacs: External packages' native-compiled code won't get stored #74

Closed leungbk closed 3 years ago

leungbk commented 4 years ago
with (import <nixpkgs> {
  overlays = [ (import (builtins.fetchTarball {
    url = "https://github.com/nix-community/emacs-overlay/archive/master.tar.gz";
  })) ]; });

(emacsPackagesGen emacsGcc).emacsWithPackages (epkgs: (with epkgs; [
  evil
]))

When running nix-build with something like the above on recent revs of this overlay, I get numerous errors of this form:

In toplevel form:
evil-command-window.el:36:1: Error: Can't find a writable directory in `comp-eln-load-path'
Compiling /nix/store/w1jd99yn8dgdqxgim2mmxbcz3060vlii-emacs-evil-20201008.1515/share/emacs/site-lisp/elpa/evil-20201008.1515/evil-commands.el...
ad-handle-definition: `evil-mode' got redefined

tadfisher commented 3 years ago

For byte-compiling, we add the package's lisp directory using addToEmacsLoadPath. For native compilation, we'll also need to add $out/share/emacs/native-lisp to comp-eln-load-path, because by default Emacs tries to write to ~/.emacs.d/eln-cache/, which won't work in the builder environment.

We'll also need to add "${profileDir}/share/emacs/native-lisp/" to comp-eln-load-path in site-start.el, similar to how we handle load-path.

@adisbladis You added native-comp support to the generic package builder here: https://github.com/NixOS/nixpkgs/commit/e89082346762164f5a9f7f3ebc3afa13fb964718. Does this sound correct?
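
A minimal sketch of the two additions described above, not the actual nixpkgs implementation. The helper name `my/add-eln-dir` is hypothetical, `$out` is assumed to be exported by the builder, and `<profileDir>` stands for whatever path the wrapper would substitute. The variable is called comp-eln-load-path on the branch discussed here; as far as I know, released Emacs 28 names it native-comp-eln-load-path, so the sketch checks for both.

```elisp
;; Hypothetical helper, not part of nixpkgs or emacs-overlay.
(defun my/add-eln-dir (root)
  "Add ROOT/share/emacs/native-lisp/ to the eln load path, if there is one."
  (let ((var (cond ((boundp 'native-comp-eln-load-path) 'native-comp-eln-load-path)
                   ((boundp 'comp-eln-load-path) 'comp-eln-load-path)))
        ;; `file-name-as-directory' keeps the entry slash-terminated.
        (dir (file-name-as-directory
              (expand-file-name "share/emacs/native-lisp" root))))
    ;; Without native-comp neither variable is bound, so do nothing.
    (when var
      (add-to-list var dir))))

;; Build time: prefer the package's own output over ~/.emacs.d/eln-cache/.
;; Assumption: the builder exports the output path as $out.
(let ((out (getenv "out")))
  (when out
    (my/add-eln-dir out)))

;; site-start.el: the wrapper would substitute the profile path here.
;; (my/add-eln-dir "<profileDir>")
```

The build-time call mirrors the `--eval=(add-to-list 'comp-eln-load-path ...)` argument visible in the batch-native-compile command line later in this thread.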

collares commented 3 years ago

I updated to today's nixos-unstable and I am getting a bunch of compilation failures (for example, org-plus-contrib says Done (Total of 0 files compiled, 200 failed)). I see a bunch of

Debugger entered--Lisp error: (native-compiler-error-empty-byte "/nix/store/szl4z641xrf5zswk5fjwyi4hys1fkpr0-emacs-..." "/nix/store/szl4z641xrf5zswk5fjwyi4hys1fkpr0-emacs-...")
  signal(native-compiler-error-empty-byte ("/nix/store/szl4z641xrf5zswk5fjwyi4hys1fkpr0-emacs-..." "/nix/store/szl4z641xrf5zswk5fjwyi4hys1fkpr0-emacs-..."))
  comp--native-compile("/nix/store/szl4z641xrf5zswk5fjwyi4hys1fkpr0-emacs-...")
  batch-native-compile()
  command-line-1(("--eval=(add-to-list 'comp-eln-load-path \"/nix/stor..." "-f" "batch-native-compile" "/nix/store/szl4z641xrf5zswk5fjwyi4hys1fkpr0-emacs-..."))
  command-line()
  normal-top-level()

and many requires fail with a file-missing error. I use this overlay (tested with commit 48efc8f) as follows

pkgs.emacsWithPackagesFromUsePackage {
  config = builtins.readFile /home/collares/.emacs.d/init.el;
  package = pkgs.emacsGcc;
  alwaysEnsure = true;
  extraEmacsPackages = epkgs: [ epkgs.desktop-environment epkgs.org-plus-contrib ];
}

Thanks for the work on this, by the way!

collares commented 3 years ago

From a quick inspection of emacs' source code, I think directories in comp-eln-load-path must be slash-terminated. I do have those folders in my nix store, which seems to confirm that this is the case:

/nix/store/99y82a59pmck4hzszn79864rb66hk1dd-system-path/share/emacs/native-lisp28.0.50-x86_64-pc-linux-gnu-c50f2f5ede36309a94931f324ed27b53
/nix/store/akx3lh6h8vyj63xgi396mylbhn3lmc4p-emacs-gcc-20201217.0/share/emacs/native-lisp28.0.50-x86_64-pc-linux-gnu-c50f2f5ede36309a94931f324ed27b53

I will test adding a slash and edit this comment with the results.
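
To illustrate why the trailing slash would matter, here is a small sketch. It assumes that the version directory is appended to each entry by plain string concatenation, which is what the directory names above suggest; the path and version string are shortened, hypothetical stand-ins.

```elisp
;; Assumption: the version directory is appended by plain concatenation.
(let ((dir "/tmp/example/share/emacs/native-lisp") ; hypothetical path
      (version "28.0.50-example"))                 ; hypothetical, shortened
  (list
   ;; Entry without a trailing slash: the two names run together.
   (concat dir version)
   ;; Entry normalized with `file-name-as-directory': proper subdirectory.
   (concat (file-name-as-directory dir) version)))
;; => ("/tmp/example/share/emacs/native-lisp28.0.50-example"
;;     "/tmp/example/share/emacs/native-lisp/28.0.50-example")
```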

collares commented 3 years ago

In decreasing order of importance, here's what's causing the above messages. First, the issues:

  1. The file-missing errors actually matter. When a file requires another file from the same directory, compilation fails because the current package's directory is not in load-path. In other words, setupHook is only called for dependencies of a package, not for the package itself, and that breaks compilation.
  2. Because generic.nix does include the trailing slash, only site-start.eln gets written to the wrong place. However, Emacs probably cannot find the .eln files afterwards. I will submit a small PR to fix this just in case.
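
Issue 1 could be worked around along these lines. This is only a sketch of the idea, not the fix that went into nixpkgs, and it assumes the compilation runs with the package's directory as `default-directory`.

```elisp
;; Sketch of issue 1: put the package's own directory on `load-path'
;; before compiling, so that a (require ...) of a sibling file resolves.
;; Assumption: `default-directory' is the package directory being compiled.
(add-to-list 'load-path
             (directory-file-name (expand-file-name default-directory)))
```

The same effect is available from the command line through Emacs's `-L DIR` option.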

The non-issues:

  1. The Done (Total of 0 files compiled, 200 failed) message comes from elpa2nix, which calls package-unpack. This shares the same root cause as issue 1 above, but does not cause native compilation failures per se. Incidentally, package.el has a package-native-compile variable, which I guess is useless for Nix.
  2. The native compiler is just noisy when the file contains nothing compilable (that is, if the byte-compiler would generate an empty .elc), I think. This causes the native-compiler-error-empty-byte error.

leungbk commented 3 years ago

I think the upstream patch should take care of this?

collares commented 3 years ago

Item 1 from https://github.com/nix-community/emacs-overlay/issues/74#issuecomment-751131321 is still pending, but that only affects files that require other files from the same directory.

tadfisher commented 3 years ago

Something about native-comp is breaking git-commit somehow. Compiling with emacsGcc or emacsPgtkGcc results in the following:

Log output for (emacsPackagesFor emacsPgtkGcc).git-commit
@nix { "action": "setPhase", "phase": "unpackPhase" }
unpacking sources
unpacking source archive /nix/store/6yrf5p0li2k7ajwyjncjzpisxwria7cr-source
source root is source
@nix { "action": "setPhase", "phase": "patchPhase" }
patching sources
@nix { "action": "setPhase", "phase": "configurePhase" }
configuring
no configure script, doing nothing
@nix { "action": "setPhase", "phase": "buildPhase" }
building
@nix { "action": "setPhase", "phase": "installPhase" }
installing
  INFO     Scraping files for git-commit-autoloads.el... 
  INFO     Scraping files for git-commit-autoloads.el...done
Checking /nix/store/5a8w97mi76d2kq109iy5g9pzcif3ddrk-emacs-git-commit-20210102.1242/share/emacs/site-lisp/elpa/git-commit-20210102.1242...
Compiling /nix/store/5a8w97mi76d2kq109iy5g9pzcif3ddrk-emacs-git-commit-20210102.1242/share/emacs/site-lisp/elpa/git-commit-20210102.1242/git-commit-autoloads.el...
Compiling /nix/store/5a8w97mi76d2kq109iy5g9pzcif3ddrk-emacs-git-commit-20210102.1242/share/emacs/site-lisp/elpa/git-commit-20210102.1242/git-commit-pkg.el...
Compiling /nix/store/5a8w97mi76d2kq109iy5g9pzcif3ddrk-emacs-git-commit-20210102.1242/share/emacs/site-lisp/elpa/git-commit-20210102.1242/git-commit.el...

In toplevel form:
git-commit.el:123:1: Error: Cannot find suitable directory for output in `comp-eln-load-path'
Done (Total of 0 files compiled, 1 failed, 2 skipped)
Debugger entered--Lisp error: (error "transient--init-suffix-key is already defined as s...")
  error("%s is already defined as something else than a gen..." transient--init-suffix-key)
  cl-generic-ensure-function(transient--init-suffix-key)
  cl-generic-define-method(transient--init-suffix-key nil ((obj transient-suffix)) nil #f(compiled-function (obj) #))
  byte-code("\300\301\302\303\302\304%\210\300\301\302\305\306\307%\207" [cl-generic-define-method transient--init-suffix-key nil ((obj transient-suffix)) #f(compiled-function (obj) #) ((obj transient-argument)) t #f(compiled-function (cl--cnm obj) #)] 6)
  require(transient)
  load-with-code-conversion("/nix/store/5a8w97mi76d2kq109iy5g9pzcif3ddrk-emacs-..." "/nix/store/5a8w97mi76d2kq109iy5g9pzcif3ddrk-emacs-..." nil t)
  git-commit-setup-check-buffer()
  run-hooks(find-file-hook)
  after-find-file(nil t)
  find-file-noselect-1(# "/build/packages/git-commit-20210102.1242.el" nil nil "/build/packages/git-commit-20210102.1242.el" (114162937 31))
  find-file-noselect("/build/packages/git-commit-20210102.1242.el")
  command-line-1(("-l" "/nix/store/4jj63z4v1xp13rh2md053dccq920hd45-elpa2n..." "-f" "elpa2nix-install-package" "/build/packages/git-commit-20210102.1242.el" "/nix/store/5a8w97mi76d2kq109iy5g9pzcif3ddrk-emacs-..."))
  command-line()
  normal-top-level()

Building (emacsPackagesFor emacsGit).git-commit doesn't result in the same issue. This happens with or without the change from https://github.com/NixOS/nixpkgs/pull/107777.

collares commented 3 years ago

I see this too, but it seems like a very recent regression. Emacs-overlay at 8bb502cca3b1dc3ed35d1ebeacdc92364a80997e (that is, commit 33b8ce865fcfd58538ae2d7c3fff04998fcd3330 from branch feature/native-comp), from five days ago, is working fine.

tadfisher commented 3 years ago

Unfortunately, pinning emacs-overlay to that revision will break on auctex as they changed distribution formats from .tar.lz to plain .tar.

collares commented 3 years ago

@tadfisher I've verified that compilation does not fail if I revert commit https://github.com/emacs-mirror/emacs/commit/7d7bfbf0346114b116e14a4338ea235d12674f13 then commit https://github.com/emacs-mirror/emacs/commit/9973019764250ac1f4d77a6b426cdd9c241151c5. If you want these changes already applied, https://github.com/collares/emacs/tree/revert-package-el-changes has them and https://github.com/collares/emacs-overlay/tree/revert-package-el-changes is pointed at my Emacs fork. I will file an Emacs bug as soon as I have a minimized example.

collares commented 3 years ago

This might not be a bug in upstream emacs, since pkgs/build-support/emacs/elpa2nix.el calls (package-initialize) explicitly and https://github.com/emacs-mirror/emacs/commit/9973019764250ac1f4d77a6b426cdd9c241151c5 replaces a bunch of (package-initialize) calls by (package-activate-all). Merely native-compiling the file, without the elpa2nix package installation, does not suffice to trigger the bug.

Indeed, if you look at the command line in the stack trace above, the error happens not during batch-native-compile but while running elpa2nix-install-package, which makes a difference in behavior between emacsGit and emacsGcc somewhat surprising. It's not entirely surprising, though: the warning about comp-eln-load-path suggests that transient is being native-compiled when it is required (or, less likely given the error message, that a native-compiled version of transient is being used). That native compilation fails because it seems to happen during package installation (where we don't set up the native load path) rather than when we batch-native-compile explicitly, but it's unclear to me why native compilation would make a difference.

collares commented 3 years ago

I've debugged the git-commit.el problem. There are two parts to the problem.

  1. The git-commit.el:123:1: Error: Cannot find suitable directory for output in 'comp-eln-load-path' error when running elpa2nix-install-package. This one happened before, and it occurs because git-commit.el:123 ((require 'transient)) starts trampoline compilation (apparently because transient.el calls advice-add), which then errors out because the native load path is not yet set up when running elpa2nix-install-package. We should probably disable trampoline compilation here since we do it later anyway.
  2. https://github.com/emacs-mirror/emacs/commit/7d7bfbf0346114b116e14a4338ea235d12674f13 recently changed autoload behavior in some cases and this seems to have exposed a bug in native-comp both in emacsGit and emacsGcc. I have filed https://debbugs.gnu.org/cgi/bugreport.cgi?bug=45854 for this.

Point 1 should be easy to fix on our side, but comp-enable-subr-trampolines is not fully respected due to an upstream bug. I've reported this above too and it should be fixed soon.
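
For reference, disabling trampolines during package installation would look roughly like this. It is a sketch under the assumption that the variable name matches the one used in this thread; as far as I know, released Emacs 28 renames it to native-comp-enable-subr-trampolines. Given the upstream bug mentioned above, setting it may not be sufficient on its own.

```elisp
;; Sketch: turn off trampoline compilation while installing a package,
;; since the native load path is not set up at that point.
;; Assumption: the variable is named as in this thread.
(setq comp-enable-subr-trampolines nil)
```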

collares commented 3 years ago

As of yesterday, the workaround in https://github.com/NixOS/nixpkgs/pull/109370 fixes issue 1 and makes git-commit compile. Issue 2 is being investigated upstream and was found not to be specific to emacsGcc.

jhenahan commented 3 years ago

I have a branch off of 20.09 at https://github.com/jhenahan/nixpkgs/tree/emacs-fixes that includes your patches (and some tweaks to get them applying to 20.09) for anyone looking for something to use without having to maintain a nixpkgs checkout or rebuild the world.

collares commented 3 years ago

@jhenahan That's very useful, thanks!

By the way, at this very moment no packages are being native compiled in emacs-overlay (at least on my machine). This is a regression introduced in #105 and is being discussed there.

collares commented 3 years ago

The main work on the native-comp infrastructure (done by @tadfisher) is in nixpkgs master, and so are a couple of fixes I made. So I believe this issue will be resolved once the channels update.

If you open emacsGcc and see compilation happening, it probably means that a package has a special build process which hasn't been adapted to work with native compilation yet. On my machine, this happens only with mu4e.

@jhenahan Note that I went for an alternative solution for the trampoline problem, available in a separate PR linked above.