skeeto / elfeed

An Emacs web feeds client
The Unlicense

After update to Emacs 26: elfeed-db-load: Wrong type argument: avl-tree- #340

Open Quintus opened 4 years ago

Quintus commented 4 years ago

Hi,

I upgraded my Debian 9 to Debian 10 this weekend, and with it came the upgrade from Emacs 25 to Emacs 26. Before I started elfeed, I of course updated it via MELPA-stable as well.

When I started elfeed, it told me it upgraded the DB to the Emacs 26 format, which is nice. Now, however, when I run M-x elfeed, startup takes quite some time and then dumps a behemoth of an error message into the *Messages* buffer, from which I reproduce the first and last parts here (the error message is actually one very long line):

elfeed-db-load: Wrong type argument: avl-tree-, [cl-struct-avl-tree- [[[[[[[[[[[[[[[[nil nil ... 0] [[nil nil ... 0] nil ... -1] ... 1] [[nil [nil nil ... 0] ... 1] [nil [nil nil ... 0] ... 1] ... 0] ... 0] [[[nil [nil nil ... 0] ... 1]

[ ... snip ...]

("community.beck.de" . "65596 at https://community.beck.de") 1] ("www.uni-muenster.de" . "http://www.uni-muenster.de/Jura.itm/hoeren/?p=14385") -1] nil nil 0] elfeed-db-compare]

What is this? Do I have to worry about data loss? I haven't tried fetching new articles since the update because I fear my database could get corrupted.

skeeto commented 4 years ago

Unfortunately, updating your package.el packages just before upgrading Emacs is essentially the worst case situation. The problem is that package.el byte-compilation is subtly but fundamentally flawed in at least two different ways. I'll go into some detail about that in a followup, but I'll start with instructions for how to get out of your predicament.

When upgrading to a new Emacs release, the safest thing to do is to completely exit Emacs, toss out your elpa/ directory, upgrade Emacs, then reinstall all your packages from scratch. That's annoying, and often you can get away without doing it, but this is not one of those cases.

Your database is very likely corrupt, but fortunately this is easy to fix. The problem is that it was accessed via Elfeed compiled with Emacs 25 but run in Emacs 26. The Emacs devs made a major breaking change in the defstruct macro between these two releases. Elfeed thinks it's getting the new Emacs 26 behavior, but it's actually getting the Emacs 25 behavior because it's using the old Emacs 25 macro (expanded during compilation).

If you look in your .elfeed/ directory, you'll find an index.backup file (details are in 6f3bb59). This is a backup copy of your pre-upgraded database. Exit Emacs completely, delete all Elfeed packages from under elpa/, in .elfeed/ copy index.backup over index (backing out of the corrupted upgrade), and then reinstall + rebuild Elfeed under Emacs 26. When you next use Elfeed, it will do the upgrade again but this time with a correctly-built Elfeed.
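The index-restoring step above can be sketched in Emacs Lisp. This is only an illustration: the function name `demo-elfeed-restore-index` is hypothetical, the default directory assumes the database lives in ~/.elfeed/, and keeping the corrupted file as index.corrupt is an extra precaution that is not part of the instructions. It does not remove anything under elpa/; that still has to be done separately.

```elisp
(defun demo-elfeed-restore-index (&optional db-dir)
  "Copy index.backup over index in DB-DIR (assumed default: ~/.elfeed/).
The previous index is kept as index.corrupt.  Return the index path."
  (let* ((dir (file-name-as-directory
               (expand-file-name (or db-dir "~/.elfeed/"))))
         (index (expand-file-name "index" dir))
         (backup (expand-file-name "index.backup" dir)))
    (unless (file-exists-p backup)
      (error "No index.backup found in %s" dir))
    ;; Keep the corrupted index around instead of destroying it.
    (when (file-exists-p index)
      (rename-file index (expand-file-name "index.corrupt" dir) t))
    ;; Back out of the corrupted upgrade.
    (copy-file backup index t)
    index))
```

To stay clear of an already-loaded Elfeed, this could be run from a fresh session, for example by loading the file with `emacs -Q --batch -l` and calling the function with `-f demo-elfeed-restore-index`.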

skeeto commented 4 years ago

So what's wrong with package.el? There are two problems that are closely related, but have different solutions. I should just write an article about this, but that will require some more research. I don't know yet if or how other Emacs package managers (el-get, straight.el, borg) solve this problem. My personal, custom-built Emacs package manager, gpkg, does solve the build problems I discuss below, but, for unrelated reasons, it's not ready for anyone else to use.

The first problem is that byte-compiled files aren't really backwards compatible. The Emacs devs have been careful to ensure the byte-code itself is backwards compatible. (A little ironic, since the main reason you'd want this is so that you can continue using byte-compiled files when you lack the source to recompile — which is at odds with the community's commitment to Free Software.) However, macros are expanded at compile time and that expansion is baked into the byte-compiled file. If the macro is defined in the same source file, then this is a non-issue because it's always a matching definition.

The problem arises when the macro is external to the source file, such as cl-defstruct. If the file is compiled against one version of the external macro, but then "linked" against a different version, then it's subtly broken. In this case, Elfeed was compiled against Emacs 25 but late binding linked against Emacs 26, resulting in a mismatch.
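A minimal sketch of the effect, using a hypothetical macro `demo-version` as a stand-in for an external macro like cl-defstruct (all `demo-` names are made up for illustration). The function is compiled against one definition of the macro, the macro then changes, and the compiled function keeps the old expansion:

```elisp
;; Hypothetical stand-in for an external macro such as cl-defstruct.
(defmacro demo-version () "v1")

;; Byte-compile a function that uses the macro.  The macro is expanded
;; now, and the expansion ("v1") is stored in the compiled function.
(defalias 'demo-report
  (byte-compile '(lambda () (demo-version))))

;; Simulate the upgrade: the macro definition changes after compilation.
(defmacro demo-version () "v2")

(demo-report)            ; still uses the old expansion: "v1"
(eval '(demo-version))   ; freshly expanded code sees "v2"
```

Only recompiling `demo-report` against the new macro brings the two back in sync, which is what reinstalling the package under the new Emacs achieves.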

The flaw is that package.el should be segregating byte-compiled files by Emacs release. Emacs itself does something similar by installing files under $PREFIX/share/emacs/$RELEASE/lisp. So package.el should have a directory structure like elpa/$RELEASE/ for byte-compiled packages. For ease of use, it should be able to automatically byte-compile a package from source for whatever release you're currently using. In Elfeed's case, that meant after upgrading to Emacs 26, Elfeed would have been recompiled by package.el before it was loaded.
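Until package.el does something like this itself, a user-side approximation is to point `package-user-dir` at a per-release directory. This is a sketch of a workaround, not how package.el behaves: it duplicates whole packages (sources included) per release rather than only the byte-compiled files, and the `elpa/MAJOR.MINOR` layout is an assumption chosen to mirror the elpa/$RELEASE/ idea above.

```elisp
;; Must run before the package system is initialized (on Emacs 27 and
;; later that means early-init.el rather than init.el).
;; Assumed layout: ~/.emacs.d/elpa/26.1/, ~/.emacs.d/elpa/25.3/, ...
(setq package-user-dir
      (locate-user-emacs-file
       (format "elpa/%d.%d" emacs-major-version emacs-minor-version)))
```

With this in place, each Emacs release installs and compiles its own copy of every package, so an Emacs 26 session never loads files compiled by Emacs 25.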

There's also a more sinister problem that you didn't run into here. Some care is taken to make Emacs' own macros backwards compatible as well. The old defstruct macro expansions still work in Emacs 26, albeit with subtly different behavior compared to the Emacs 26 expansion. However, no care is taken for forwards compatibility. If you install packages using Emacs 26, then start up Emacs 25, many packages will be horribly broken. If package.el segregated byte-compiled files by release, you could switch between major Emacs releases, backwards and forwards, with little risk. (I do this all the time with my personal package manager.)

The second issue is also related to macros, but with macros defined in other non-Emacs packages. Suppose you install package bar, which depends on a macro defined in package foo. Both get installed, and everything is fine. Later on, foo gets an update and the macro expansion changes. The interface is the same (no API break) but the expansion itself might be incompatible — e.g. it expands to a call to a private function that changed. Since bar's dependency was updated, bar should also be recompiled. But package.el will not do this, resulting in a broken bar.

It gets worse, though. Suppose both bar and foo have updates. This might seem like it accidentally fixes the above issue, but it only makes the breakage more subtle. When bar gets recompiled, it still probably uses the old macro expansion. Why? Because it loads foo via require, and Emacs says "oh, I already have that loaded" and doesn't load the new definition. This problem occurs even (and especially) within a package when a macro is defined in one source file and used in another. Depending on the (arbitrary) build order, package.el may compile the new package source file against the old package source file.
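The "already loaded" behavior of require can be seen directly. In this sketch `demo-foo` is a hypothetical feature name with no file behind it; `provide` is used to pretend that an old version was loaded earlier in the session:

```elisp
;; Pretend an old demo-foo was loaded earlier in this session.
(provide 'demo-foo)

(featurep 'demo-foo)        ; non-nil: Emacs considers it loaded
(require 'demo-foo)         ; returns at once, no file is read

;; Only after the feature is forgotten does `require' go looking for a
;; file again.  Here there is none, so with NOERROR it returns nil.
(setq features (delq 'demo-foo features))
(require 'demo-foo nil t)
```

A compile step that calls require in the running session therefore sees whatever definitions were loaded earlier, not the ones in the newly installed files.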

The fundamental problem is that packages are compiled using the current Emacs instance. The active Emacs state leaks into the byte-compiled files, which sometimes leads to these breakages. Instead, package.el should be building packages in isolation using a clean Emacs instance.
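As a rough sketch of what building in isolation could look like, the helper below byte-compiles one file in a fresh child process started with -Q --batch, so nothing loaded in the current session can leak into the result. The name `demo-clean-byte-compile` is hypothetical; this is not how package.el or gpkg is implemented, and a real package manager would also have to work out the dependency directories and the build order.

```elisp
(require 'cl-lib)

(defun demo-clean-byte-compile (file &optional load-dirs)
  "Byte-compile FILE in a fresh `emacs -Q --batch' child process.
Each directory in LOAD-DIRS is passed with -L so that dependencies
are loaded from disk by the child.  Return the child's exit status."
  (let ((emacs (expand-file-name invocation-name invocation-directory))
        (dir-args (cl-mapcan (lambda (dir)
                               (list "-L" (expand-file-name dir)))
                             load-dirs)))
    (apply #'call-process emacs nil nil nil
           (append (list "-Q" "--batch")
                   dir-args
                   (list "-f" "batch-byte-compile"
                         (expand-file-name file))))))
```

Because the child starts empty, every require it performs reads the files currently on disk, and macros are expanded against those definitions only.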

If package.el segregated byte-compiled files by release, built them in a clean Emacs instance, and was more aggressive about recompiling after updates, it would have something akin to reproducible builds. As it stands now, there's leakage from the current state of the world, including the user's configuration and ongoing Emacs state, into byte-compiled files. Usually this works out fine by chance in a worse-is-better sort of way, but not always.

Quintus commented 4 years ago

> Unfortunately, updating your package.el packages just before upgrading Emacs is essentially the worst case situation.

Clarification: I updated the package.el packages after the Emacs update. But elfeed did not receive any updates; it was already current.

Thank you for this very elaborate answer. Luckily, I'm careful enough to not conduct OS upgrades without a backup, so I had the old pre-upgrade ~/.elfeed directory available anyway. After quitting emacs, I restored that one, recursively deleted all of ~/.emacs.d/elpa as recommended, re-installed all packages (including elfeed) and then executed M-x elfeed again. It upgraded the directory again, and this time there was no problem.

So everything works again now. Thanks for the instructions!

I'm not closing this bug, as you apparently want to use it for dealing with the package system more thoroughly. Better to close it yourself when you think it's appropriate.