RoliSoft / WSL-Distribution-Switcher

Scripts to replace the distribution behind Windows Subsystem for Linux with any other Linux distribution published on Docker Hub.
MIT License

dnf on fedora 25 results in a broken rpm database #22

Open pheitman opened 7 years ago

pheitman commented 7 years ago

I don't know what the issue is, but installing Fedora 25 and then running dnf update leaves the rpm database broken, and my attempts to repair it failed.

I doubt that the issue has anything to do with wsl-distribution-switcher, but I wanted to note it in case anyone else runs into this or has a workaround.

RoliSoft commented 7 years ago

@pheitman @gbraad @thnuder @ikky888 This issue is also present on Rawhide. After some investigation, I've found that you can fix the database:

cd /var/lib/rpm
shopt -s extglob; rm -f !(Packages)
mv Packages{,.orig}
/usr/lib/rpm/rpmdb_dump Packages.orig | /usr/lib/rpm/rpmdb_load Packages
rpm -v --rebuilddb
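
Because the second command deletes every file except Packages, it may be worth copying the directory before starting. A minimal sketch, assuming a POSIX shell; backup_rpmdb and the .bak suffix are hypothetical names chosen for illustration, not part of rpm or dnf:

```shell
# Hypothetical helper: copy the rpm database directory before touching it.
# Usage: backup_rpmdb [db_dir] [backup_dir]
backup_rpmdb() (
    db_dir=${1:-/var/lib/rpm}
    backup_dir=${2:-$db_dir.bak}

    # Refuse to run if there is no Packages file to protect.
    if [ ! -f "$db_dir/Packages" ]; then
        echo "backup_rpmdb: no Packages file in $db_dir" >&2
        exit 1
    fi

    # Never overwrite an earlier backup.
    if [ -e "$backup_dir" ]; then
        echo "backup_rpmdb: $backup_dir already exists" >&2
        exit 1
    fi

    # -R copies the directory tree, -p preserves mode, ownership and timestamps.
    cp -Rp "$db_dir" "$backup_dir"
)
```

Calling backup_rpmdb with no arguments would copy /var/lib/rpm to /var/lib/rpm.bak; the copy is taken as-is, so it preserves the broken state as well as the Packages file.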

However, the fix is of limited use: as soon as the first package is installed, the installation hangs and the database breaks again. I was unable to run gdb or strace under WSL, so I had to use procmon from Windows to see what it was doing. In all cases, it reads /var/lib/rpm/Packages, then creates the files up to __db.003, after which it just hangs.
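
When reproducing a hang like this, it can help to put a time limit on the command so the shell is not blocked indefinitely. A rough sketch, assuming timeout(1) from GNU coreutils is installed; run_with_limit is a hypothetical name, and the thread does not establish whether the default SIGTERM is enough to stop the hung process under WSL:

```shell
# Hypothetical wrapper: run a command, but give up after a number of seconds.
# Relies on timeout(1) from GNU coreutils, which exits with status 124
# when the time limit is reached.
# Usage: run_with_limit <seconds> <command> [args...]
run_with_limit() (
    limit=$1
    shift

    status=0
    timeout "$limit" "$@" || status=$?

    if [ "$status" -eq 124 ]; then
        echo "run_with_limit: '$*' still running after ${limit}s, stopped" >&2
    fi
    exit "$status"
)
```

For example, run_with_limit 600 dnf install gdb would stop the attempt after ten minutes. Interrupting a transaction can itself leave the database in a bad state, so this is only suited to a root that is already expendable.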

Since DNF is written in Python, I ran python3 -m trace --trace /usr/bin/dnf-3 install gdb and these are the relevant lines, right before it hangs:

  Installing  : libgcc-7.0.1-0.10.fc26.x86_64        [...]        1/205
 util.py(285):             out.flush()
 --- modulename: i18n, funcname: __getattr__
i18n.py(56):         return getattr(self.stream, name)
output.py(2013):                 self.lastmsg = msg
output.py(2014):             if ti_done == ti_total:
output.py(2015):                 print(" ")
 --- modulename: i18n, funcname: write
i18n.py(41):         if not isinstance(s, str):
i18n.py(44):         try:
i18n.py(45):             self.stream.write(s)
  --- modulename: i18n, funcname: write
i18n.py(41):         if not isinstance(s, str):
i18n.py(44):         try:
i18n.py(45):             self.stream.write(s)

rpmtrans.py(489):         for display in self.displays:
 --- modulename: rpmtrans, funcname: callback
rpmtrans.py(403):         if isinstance(key, str):
rpmtrans.py(405):         if what == rpm.RPMCALLBACK_TRANS_START:
rpmtrans.py(407):         elif what == rpm.RPMCALLBACK_TRANS_STOP:
rpmtrans.py(409):         elif what == rpm.RPMCALLBACK_ELEM_PROGRESS:
rpmtrans.py(414):         elif what == rpm.RPMCALLBACK_INST_OPEN_FILE:
rpmtrans.py(416):         elif what == rpm.RPMCALLBACK_INST_CLOSE_FILE:
rpmtrans.py(418):         elif what == rpm.RPMCALLBACK_INST_PROGRESS:
rpmtrans.py(420):         elif what == rpm.RPMCALLBACK_UNINST_STOP:
rpmtrans.py(422):         elif what == rpm.RPMCALLBACK_CPIO_ERROR:
rpmtrans.py(424):         elif what == rpm.RPMCALLBACK_UNPACK_ERROR:
rpmtrans.py(426):         elif what == rpm.RPMCALLBACK_SCRIPT_ERROR:
rpmtrans.py(428):         elif what == rpm.RPMCALLBACK_SCRIPT_STOP:

I screwed around in the rpmtrans.py file, adding some more logging, but I could ultimately not pinpoint the issue. Seems like I'm all out of ideas for now.

gbraad commented 7 years ago

Is it possible to run rpm itself with a basic package? If not, how about rpm2cpio? If rpm2cpio succeeds, it is unlikely that this is related to the package extraction process.

It might be related to the rpmdb (Berkeley DB library, IIRC). At least it seems it can also occur on Fedora 24. I had a transaction that got killed (Google Chrome), and since then my rpmdb has been borked and useless. Every install results in corruption, or now even refuses to start. Will investigate a little over the weekend.
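
The rpm2cpio check could look roughly like the following sketch, which lists a package's payload without installing it or touching the rpm database. It assumes rpm2cpio and cpio are both installed; rpm_payload_list is a hypothetical name:

```shell
# Hypothetical helper: list the payload of an .rpm file without installing it.
# rpm2cpio writes the package payload to stdout as a cpio archive;
# "cpio -t" reads an archive from stdin and prints its table of contents.
# Usage: rpm_payload_list <package.rpm>
rpm_payload_list() (
    pkg=$1

    if [ ! -r "$pkg" ]; then
        echo "rpm_payload_list: cannot read $pkg" >&2
        exit 1
    fi

    # The exit status of the pipeline is that of cpio.
    rpm2cpio "$pkg" | cpio -t
)
```

If this prints the file list for a downloaded package, reading and unpacking the package works, which would point towards the database side instead.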

@RoliSoft, are you on an Insider release?

kstange commented 7 years ago

I was able to install sudo manually using the RPM command (rpm -ihv ), which led me to try bypassing DNF for updating any out-of-date system packages. It seems to be DNF specifically that can't operate on the RPM database without corrupting it, but after updating all available packages, it seems like it might have fixed itself.

I did a manual update of available packages like this:

# dnf update --downloadonly -y
# rpm -ihv http://mirror.steadfast.net/fedora/releases/25/Everything/x86_64/os/Packages/f/findutils-4.6.0-8.fc25.x86_64.rpm
# find /var/cache/dnf/ -name \*.rpm | xargs rpm -Uhv
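
Before feeding the cache to rpm, it may be useful to see what was actually downloaded and to do a dry run. A small sketch, assuming the default cache location used above; list_cached_rpms and check_cached_rpms are hypothetical names:

```shell
# Hypothetical helper: print every downloaded .rpm under a cache directory,
# one path per line, in sorted order.
# Usage: list_cached_rpms [cache_dir]
list_cached_rpms() {
    find "${1:-/var/cache/dnf}" -type f -name '*.rpm' | sort
}

# Hypothetical helper: ask rpm to check the upgrade without performing it.
# "rpm --test" reports potential conflicts and does not install anything.
# "-exec ... {} +" passes the file names to rpm directly, so paths
# containing spaces are handled correctly.
# Usage: check_cached_rpms [cache_dir]
check_cached_rpms() {
    find "${1:-/var/cache/dnf}" -type f -name '*.rpm' -exec rpm -Uvh --test {} +
}
```

If check_cached_rpms reports no conflicts, the same find command without --test performs the upgrade, equivalent to the xargs form above.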

These are the updates installed:

================================================================================
 Package                 Arch        Version                 Repository    Size
================================================================================
Upgrading:
 audit-libs              x86_64      2.7.5-1.fc25            updates      107 k
 ca-certificates         noarch      2017.2.11-1.1.fc25      updates      482 k
 curl                    x86_64      7.51.0-6.fc25           updates      307 k
 dbus                    x86_64      1:1.11.12-1.fc25        updates      251 k
 dbus-libs               x86_64      1:1.11.12-1.fc25        updates      174 k
 file-libs               x86_64      5.29-4.fc25             updates      496 k
 gdbm                    x86_64      1.13-1.fc25             updates      154 k
 hawkey                  x86_64      0.6.4-3.fc25            updates       64 k
 libcurl                 x86_64      7.51.0-6.fc25           updates      267 k
 libidn2                 x86_64      2.0.0-1.fc25            updates       93 k
 nss                     x86_64      3.29.3-1.1.fc25         updates      860 k
 nss-pem                 x86_64      1.0.3-3.fc25            updates       76 k
 nss-softokn             x86_64      3.29.3-1.0.fc25         updates      385 k
 nss-softokn-freebl      x86_64      3.29.3-1.0.fc25         updates      225 k
 nss-sysinit             x86_64      3.29.3-1.1.fc25         updates       61 k
 nss-tools               x86_64      3.29.3-1.1.fc25         updates      503 k
 nss-util                x86_64      3.29.3-1.1.fc25         updates       83 k
 openldap                x86_64      2.4.44-10.fc25          updates      352 k
 p11-kit                 x86_64      0.23.2-3.fc25           updates      149 k
 p11-kit-trust           x86_64      0.23.2-3.fc25           updates      129 k
 python3-hawkey          x86_64      0.6.4-3.fc25            updates       46 k
 tzdata                  noarch      2017b-1.fc25            updates      419 k
 vim-minimal             x86_64      2:8.0.562-1.fc25        updates      519 k

Transaction Summary
================================================================================
Upgrade  23 Packages

Now I'm able to run dnf to install packages.

gbraad commented 7 years ago

@kstange Running on the "Anniversary Update" or "Creators Update"?

kstange commented 7 years ago

This is on the Creators Update.

gbraad commented 7 years ago

The Creators Update has a lot of updates to WSL/LXSS, so it is likely this got fixed by a recent change. I haven't been able to verify it myself, but I was also able to run software which previously failed, such as go. I'll try this soon as well, but it was most likely fixed by the updates mentioned at:

https://blogs.msdn.microsoft.com/commandline/2017/04/11/windows-10-creators-update-whats-new-in-bashwsl-windows-console/

https://msdn.microsoft.com/en-us/commandline/wsl/release_notes

kstange commented 7 years ago

Well, no. The problem (corrupted RPM database) occurred on a new root on the Creators Update whenever I attempted to use DNF to install or update any software. It only started working normally after installing the updates by invoking RPM directly. Thus, I'm assuming one of those packages fixed the issue. If that's not the case, I'm not sure how it could have been fixed.

xunlinkx commented 7 years ago

Confirmed: still hitting this problem on the latest release of WSL in the Creators Update with Fedora 25. Thanks for the workaround steps, @kstange.

shls commented 7 years ago

Hi, I've run into a similar issue. After installing Fedora, it has no Internet connection, although the original Ubuntu is fine. Do you have any idea what causes this? Thanks.

xunlinkx commented 7 years ago

Yes, make sure /etc/resolv.conf points to an appropriate nameserver IP.
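
A quick way to check this is to print the nameserver entries the file currently contains. A minimal sketch, assuming awk is available; list_nameservers is a hypothetical name:

```shell
# Hypothetical helper: print the nameserver addresses listed in a resolv.conf.
# Succeeds only if at least one "nameserver <address>" line is present;
# commented-out lines are ignored because their first field is not "nameserver".
# Usage: list_nameservers [file]
list_nameservers() {
    awk '$1 == "nameserver" && NF >= 2 { print $2; found = 1 }
         END { exit !found }' "${1:-/etc/resolv.conf}"
}
```

If list_nameservers prints nothing and returns a non-zero status, the file has no usable entry. Adding a line such as nameserver 8.8.8.8 (Google's public resolver, used here only as an example) is one way to test whether name resolution is the problem.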

brad-x commented 7 years ago

@kstange: I found that applying the updates allowed dnf to function normally, except that the issue recurred after adding the rpmfusion repo. I think it's triggered by updating something to do with repo information.

kstange commented 7 years ago

I agree. After some further experimentation, it seems to be related to importing RPM GPG keys.

gbraad commented 7 years ago

This has been resolved in later releases of Windows (build 16196, currently available on the Fast ring of the Insider program). Since we will be releasing an official Fedora on WSL, there is no need to look into this further.