NixOS / nix

Nix, the purely functional package manager
https://nixos.org/
GNU Lesser General Public License v2.1

Nix remote build coredumps when mixing nix versions #4664

Open SuperSandro2000 opened 3 years ago

SuperSandro2000 commented 3 years ago

Describe the bug

My nix-daemon and my user nix binaries were at different nix versions, which leads to core dumps when copying the derivations.

Steps To Reproduce

  1. Install an older nix version globally
  2. Install a newer one in the home environment
  3. Build remotely with substituters
  4. Wait for the build to fail due to signal 6 (Aborted)

Expected behavior

It should not core dump, or it should at least fail with a clearer message than a core dump.

nix-env --version output

global env

nix-env --version
nix-env (Nix) 2.4pre20210317_8a5203d

user env

nix-env --version
nix-env (Nix) 2.3.10

remote builders

nix-env --version
nix-env (Nix) 2.4pre20210308_1c0e3e4

Additional context

           PID: 1488454 (build-remote)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Tue 2021-03-23 16:28:33 CET (45s ago)
  Command Line: build-remote 2
    Executable: /nix/store/iwfs2bfcy7lqwhri94p2i6jc87ih55zk-nix-2.3.10/bin/nix
 Control Group: /system.slice/nix-daemon.service
          Unit: nix-daemon.service
         Slice: system.slice
       Boot ID: 49f19cee9e8442938e12c49a56968522
    Machine ID: a5a794bb5471437ea4cd08bdde375077
      Hostname: hydrogen
       Storage: /var/lib/systemd/coredump/core.build-remote.0.49f19cee9e8442938e12c49a56968522.1488454.1616513313000000.zst
       Message: Process 1488454 (build-remote) of user 0 dumped core.

                Stack trace of thread 1488454:
                #0  0x00007f278dbf015a raise (libc.so.6 + 0x3815a)
                #1  0x00007f278dbda548 abort (libc.so.6 + 0x22548)
                #2  0x00007f278dbda42f __assert_fail_base.cold.0 (libc.so.6 + 0x2242f)
                #3  0x00007f278dbe8ad2 __assert_fail (libc.so.6 + 0x30ad2)
                #4  0x00007f278e2f6a06 _ZN3nix14LegacySSHStore21queryPathInfoUncachedERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_8CallbackISt10shared_ptrINS_13ValidPathInfoEEEE (libnixstore.so + 0x16ba06)
                #5  0x00007f278e3685a9 _ZN3nix5Store13queryPathInfoERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_8CallbackINS_3refINS_13ValidPathInfoEEEEE (libnixstore.so + 0x1dd5a9)
                #6  0x00007f278e368cae _ZN3nix5Store13queryPathInfoERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE (libnixstore.so + 0x1ddcae)
                #7  0x00007f278e369e74 _ZNSt17_Function_handlerIFSt3setINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4lessIS6_ESaIS6_EERKS6_EZN3nix9copyPathsENSE_3refINSE_5StoreEEESH_RKSA_NSE_10RepairFlagENSE_13CheckSigsFlagENSE_14SubstituteFlagEEUlSC_E0_E9_M_invokeERKSt9_Any_dataSC_ (libnixstore.so + 0x1dee74)
                #8  0x00007f278e370e4b _ZZN3nix12processGraphINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRNS_10ThreadPoolERKSt3setIT_St4lessISA_ESaISA_EESt8functionIFSE_RKSA_EESH_IFvSJ_EEENKUlRKS6_E_clESP_ (libnixstore.so + 0x1e5e4b)
                #9  0x00007f278e15541a _ZN3nix10ThreadPool6doWorkEb (libnixutil.so + 0x8141a)
                #10 0x00007f278e156c75 _ZN3nix10ThreadPool7processEv (libnixutil.so + 0x82c75)
                #11 0x00007f278e36ff8f _ZN3nix12processGraphINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRNS_10ThreadPoolERKSt3setIT_St4lessISA_ESaISA_EESt8functionIFSE_RKSA_EESH_IFvSJ_EE (libnixstore.so + 0x1e4f8f)
                #12 0x00007f278e36d1fd _ZN3nix9copyPathsENS_3refINS_5StoreEEES2_RKSt3setINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4lessIS9_ESaIS9_EENS_10RepairFlagENS_13CheckSigsFlagENS_14SubstituteFlagE (libnixstore.so + 0x1e21fd)
                #13 0x000000000044a7af _ZL5_mainiPPc (nix + 0x4a7af)
                #14 0x00000000004eeeea _ZN3nix11mainWrappedEiPPc (nix + 0xeeeea)
                #15 0x00007f278e3fd0f2 _ZN3nix16handleExceptionsERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt8functionIFvvEE (libnixmain.so + 0x210f2)
                #16 0x0000000000447514 main (nix + 0x47514)
                #17 0x00007f278dbdbd8b __libc_start_main (libc.so.6 + 0x23d8b)
                #18 0x000000000044804a _start (nix + 0x4804a)

                Stack trace of thread 1488457:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278e427ec7 GC_wait_marker (libgc.so.1 + 0x17ec7)
                #2  0x00007f278e42834a GC_help_marker (libgc.so.1 + 0x1834a)
                #3  0x00007f278e42841f GC_mark_thread (libgc.so.1 + 0x1841f)
                #4  0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #5  0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1488458:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278e427ec7 GC_wait_marker (libgc.so.1 + 0x17ec7)
                #2  0x00007f278e42834a GC_help_marker (libgc.so.1 + 0x1834a)
                #3  0x00007f278e42841f GC_mark_thread (libgc.so.1 + 0x1841f)
                #4  0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #5  0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1488455:
                #0  0x00007f278dbf0cbc __sigtimedwait (libc.so.6 + 0x38cbc)
                #1  0x00007f278dd89b74 sigwait (libpthread.so.0 + 0x12b74)
                #2  0x00007f278e15b965 _ZN3nixL19signalHandlerThreadE10__sigset_t (libnixutil.so + 0x87965)
                #3  0x00007f278e16170d _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJPFv10__sigset_tES3_EEEEE6_M_runEv (libnixutil.so + 0x8d70d)
                #4  0x00007f278dfcab40 n/a (libstdc++.so.6 + 0xd6b40)
                #5  0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #6  0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1488456:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278e427ec7 GC_wait_marker (libgc.so.1 + 0x17ec7)
                #2  0x00007f278e42834a GC_help_marker (libgc.so.1 + 0x1834a)
                #3  0x00007f278e42841f GC_mark_thread (libgc.so.1 + 0x1841f)
                #4  0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #5  0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1488462:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278e427ec7 GC_wait_marker (libgc.so.1 + 0x17ec7)
                #2  0x00007f278e42834a GC_help_marker (libgc.so.1 + 0x1834a)
                #3  0x00007f278e42841f GC_mark_thread (libgc.so.1 + 0x1841f)
                #4  0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #5  0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1488461:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278e427ec7 GC_wait_marker (libgc.so.1 + 0x17ec7)
                #2  0x00007f278e42834a GC_help_marker (libgc.so.1 + 0x1834a)
                #3  0x00007f278e42841f GC_mark_thread (libgc.so.1 + 0x1841f)
                #4  0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #5  0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1488460:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278e427ec7 GC_wait_marker (libgc.so.1 + 0x17ec7)
                #2  0x00007f278e42834a GC_help_marker (libgc.so.1 + 0x1834a)
                #3  0x00007f278e42841f GC_mark_thread (libgc.so.1 + 0x1841f)
                #4  0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #5  0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1489787:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278dfc586c _ZNSt18condition_variable4waitERSt11unique_lockISt5mutexE (libstdc++.so.6 + 0xd186c)
                #2  0x00007f278e2f5ad7 _ZN3nix4PoolINS_14LegacySSHStore10ConnectionEE3getEv (libnixstore.so + 0x16aad7)
                #3  0x00007f278e2f679c _ZN3nix14LegacySSHStore21queryPathInfoUncachedERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_8CallbackISt10shared_ptrINS_13ValidPathInfoEEEE (libnixstore.so + 0x16b79c)
                #4  0x00007f278e3685a9 _ZN3nix5Store13queryPathInfoERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_8CallbackINS_3refINS_13ValidPathInfoEEEEE (libnixstore.so + 0x1dd5a9)
                #5  0x00007f278e368cae _ZN3nix5Store13queryPathInfoERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE (libnixstore.so + 0x1ddcae)
                #6  0x00007f278e369e74 _ZNSt17_Function_handlerIFSt3setINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4lessIS6_ESaIS6_EERKS6_EZN3nix9copyPathsENSE_3refINSE_5StoreEEESH_RKSA_NSE_10RepairFlagENSE_13CheckSigsFlagENSE_14SubstituteFlagEEUlSC_E0_E9_M_invokeERKSt9_Any_dataSC_ (libnixstore.so + 0x1dee74)
                #7  0x00007f278e370e4b _ZZN3nix12processGraphINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRNS_10ThreadPoolERKSt3setIT_St4lessISA_ESaISA_EESt8functionIFSE_RKSA_EESH_IFvSJ_EEENKUlRKS6_E_clESP_ (libnixstore.so + 0x1e5e4b)
                #8  0x00007f278e15541a _ZN3nix10ThreadPool6doWorkEb (libnixutil.so + 0x8141a)
                #9  0x00007f278dfcab40 n/a (libstdc++.so.6 + 0xd6b40)
                #10 0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #11 0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1488459:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278e427ec7 GC_wait_marker (libgc.so.1 + 0x17ec7)
                #2  0x00007f278e42834a GC_help_marker (libgc.so.1 + 0x1834a)
                #3  0x00007f278e42841f GC_mark_thread (libgc.so.1 + 0x1841f)
                #4  0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #5  0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)

                Stack trace of thread 1489788:
                #0  0x00007f278dd8570d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0 + 0xe70d)
                #1  0x00007f278dfc586c _ZNSt18condition_variable4waitERSt11unique_lockISt5mutexE (libstdc++.so.6 + 0xd186c)
                #2  0x00007f278e2f5ad7 _ZN3nix4PoolINS_14LegacySSHStore10ConnectionEE3getEv (libnixstore.so + 0x16aad7)
                #3  0x00007f278e2f679c _ZN3nix14LegacySSHStore21queryPathInfoUncachedERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_8CallbackISt10shared_ptrINS_13ValidPathInfoEEEE (libnixstore.so + 0x16b79c)
                #4  0x00007f278e3685a9 _ZN3nix5Store13queryPathInfoERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_8CallbackINS_3refINS_13ValidPathInfoEEEEE (libnixstore.so + 0x1dd5a9)
                #5  0x00007f278e368cae _ZN3nix5Store13queryPathInfoERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE (libnixstore.so + 0x1ddcae)
                #6  0x00007f278e369e74 _ZNSt17_Function_handlerIFSt3setINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt4lessIS6_ESaIS6_EERKS6_EZN3nix9copyPathsENSE_3refINSE_5StoreEEESH_RKSA_NSE_10RepairFlagENSE_13CheckSigsFlagENSE_14SubstituteFlagEEUlSC_E0_E9_M_invokeERKSt9_Any_dataSC_ (libnixstore.so + 0x1dee74)
                #7  0x00007f278e370e4b _ZZN3nix12processGraphINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRNS_10ThreadPoolERKSt3setIT_St4lessISA_ESaISA_EESt8functionIFSE_RKSA_EESH_IFvSJ_EEENKUlRKS6_E_clESP_ (libnixstore.so + 0x1e5e4b)
                #8  0x00007f278e15541a _ZN3nix10ThreadPool6doWorkEb (libnixutil.so + 0x8141a)
                #9  0x00007f278dfcab40 n/a (libstdc++.so.6 + 0xd6b40)
                #10 0x00007f278dd7eedd start_thread (libpthread.so.0 + 0x7edd)
                #11 0x00007f278dcafaaf __clone (libc.so.6 + 0xf7aaf)
SuperSandro2000 commented 3 years ago

I could narrow this down further: nix-copy-closure --from ssh://server /nix/store/..... core dumps, but nix copy --from ssh://server /nix/store/..... works just fine.

lopsided98 commented 3 years ago

The relevant version mismatch seems to be between the local daemon and the remote builder. In my case the following version combinations both result in the error:

local CLI: 2.3.10, local daemon: 2.3.10, remote daemon: 2.4pre20210326_dd77f71
local CLI: 2.4pre20210326_dd77f71, local daemon: 2.3.10, remote daemon: 2.4pre20210326_dd77f71

while this combination works:

local CLI: 2.4pre20210326_dd77f71, local daemon: 2.4pre20210326_dd77f71, remote daemon: 2.4pre20210326_dd77f71

For completeness, I also tried this combination, which resulted in nix-build hanging forever without building anything:

local CLI: 2.3.10, local daemon: 2.4pre20210326_dd77f71, remote daemon: 2.4pre20210326_dd77f71
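
The four reports in this comment can be written down as data to check which pair of versions tracks the crash. The sketch below only restates those observations. The type and function names (`Observation`, `crashTracksDaemonMismatch`) are made up for illustration and are not part of Nix.

```cpp
#include <string>
#include <vector>

enum class Outcome { Crash, Works, Hang };

// One reported combination of versions and what happened with it.
struct Observation {
    std::string localCli, localDaemon, remoteDaemon;
    Outcome outcome;
};

// The four combinations from this comment, transcribed as data.
inline std::vector<Observation> reportedObservations()
{
    const std::string oldV = "2.3.10";
    const std::string newV = "2.4pre20210326_dd77f71";
    return {
        {oldV, oldV, newV, Outcome::Crash},
        {newV, oldV, newV, Outcome::Crash},
        {newV, newV, newV, Outcome::Works},
        {oldV, newV, newV, Outcome::Hang},
    };
}

// True if a run crashed exactly when the local daemon and the remote daemon
// had different versions, regardless of the local CLI version.
inline bool crashTracksDaemonMismatch(const std::vector<Observation> & obs)
{
    for (const auto & o : obs) {
        const bool mismatch = o.localDaemon != o.remoteDaemon;
        if (mismatch != (o.outcome == Outcome::Crash)) return false;
    }
    return true;
}
```

Calling `crashTracksDaemonMismatch(reportedObservations())` holds for these four data points, which matches the reading that the local CLI version is not the deciding factor for the abort. The hang in the last combination is a separate symptom and is not explained by this check.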

andir commented 3 years ago

I am seeing the same error when building from 2.3.10 (x86_64-linux) on a 2.4pre20210503_6d2553a (aarch64-linux) box. The stack trace is pretty similar to what has been posted above:

#0  0x00007f5e7bbad33a in raise () from /nix/store/v8q6nxyppy1myi3rxni2080bv8s9jxiy-glibc-2.32-40/lib/libc.so.6
#1  0x00007f5e7bb97523 in abort () from /nix/store/v8q6nxyppy1myi3rxni2080bv8s9jxiy-glibc-2.32-40/lib/libc.so.6
#2  0x00007f5e7bb9741f in __assert_fail_base.cold.0 () from /nix/store/v8q6nxyppy1myi3rxni2080bv8s9jxiy-glibc-2.32-40/lib/libc.so.6
#3  0x00007f5e7bba5d92 in __assert_fail () from /nix/store/v8q6nxyppy1myi3rxni2080bv8s9jxiy-glibc-2.32-40/lib/libc.so.6
#4  0x00007f5e7c2bf81c in nix::LegacySSHStore::queryPathInfoUncached(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, nix::Callback<std::shared_ptr<nix::ValidPathInfo> >) () from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixstore.so
#5  0x00007f5e7c34459b in nix::Store::queryPathInfo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, nix::Callback<nix::ref<nix::ValidPathInfo> >) ()
   from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixstore.so
#6  0x00007f5e7c344a0a in nix::Store::queryPathInfo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
   from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixstore.so
#7  0x00007f5e7c345d04 in std::_Function_handler<std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&), nix::copyPaths(nix::ref<nix::Store>, nix::ref<nix::Store>, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, nix::RepairFlag, nix::CheckSigsFlag, nix::SubstituteFlag)::{lambda(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#2}>::_M_invoke(std::_Any_data const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixstore.so
#8  0x00007f5e7c34b241 in nix::processGraph<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(nix::ThreadPool&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::function<std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>, std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>)::{lambda(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)#1}::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const ()
   from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixstore.so
#9  0x00007f5e7c10c2ca in nix::ThreadPool::doWork(bool) () from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixutil.so
#10 0x00007f5e7c10d085 in nix::ThreadPool::process() () from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixutil.so
#11 0x00007f5e7c348a4f in void nix::processGraph<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(nix::ThreadPool&, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::function<std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>, std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>) ()
   from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixstore.so
#12 0x00007f5e7c3409fd in nix::copyPaths(nix::ref<nix::Store>, nix::ref<nix::Store>, std::set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, nix::RepairFlag, nix::CheckSigsFlag, nix::SubstituteFlag) () from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixstore.so
#13 0x000000000044e3cc in _main(int, char**) ()
#14 0x000000000050a0b0 in nix::mainWrapped(int, char**) ()
#15 0x00007f5e7c3dd2e9 in nix::handleExceptions(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()>) ()
   from /nix/store/0mfa9hrs5pascjqvq4q7bz1qp4ql72w2-nix-2.3.10/lib/libnixmain.so
#16 0x0000000000449c14 in main ()

This can be tested by running a stable nix version locally and remote building on the community aarch64-linux box.

andir commented 3 years ago

The assertion that is being triggered is this one:

https://github.com/NixOS/nix/blob/2.3.11/src/libstore/legacy-ssh-store.cc#L103-L106

I made that assertion print an error before aborting like so (on the 2.3 client):

diff --git a/src/libstore/legacy-ssh-store.cc b/src/libstore/legacy-ssh-store.cc
index d5fbdd25a..a2e552e6c 100644
--- a/src/libstore/legacy-ssh-store.cc
+++ b/src/libstore/legacy-ssh-store.cc
@@ -101,7 +101,10 @@ struct LegacySSHStore : public Store
             auto info = std::make_shared<ValidPathInfo>();
             conn->from >> info->path;
             if (info->path.empty()) return callback(nullptr);
-            assert(path == info->path);
+            if (path != info->path) {
+                printError("path %s != info->path %s", path, info->path);
+                assert(path == info->path);
+            }

In my case this ended up resulting in this log line:

path /nix/store/mdkb735zm8dfhcyg9frd7ych2y5h38ly-mpp-git != info->path K

Briefly looking at the code, this could be related to the storePath refactoring that @Ericson2314 did a while ago. Mind having a look?
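
As a follow-up to the debugging patch above, one way to get the "nicer message" asked for in the issue would be to raise an error instead of asserting. This is a minimal standalone sketch, not the actual Nix fix: `requireSamePath` is a hypothetical helper, and it uses `std::runtime_error` rather than Nix's own error types so that it compiles on its own.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Nix: reports a mismatched reply as an
// exception with a readable message instead of aborting via assert().
inline void requireSamePath(const std::string & requested, const std::string & received)
{
    if (requested != received)
        throw std::runtime_error(
            "remote store answered for path '" + received +
            "' but '" + requested + "' was requested; "
            "the two sides may be running incompatible versions");
}
```

With the values from the log line above, `requireSamePath("/nix/store/mdkb735zm8dfhcyg9frd7ych2y5h38ly-mpp-git", "K")` would throw with both paths in the message, and the caller could report it like any other failed remote query instead of dumping core.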

thufschmitt commented 3 years ago

In my case this ended up resulting in this log line:

path /nix/store/mdkb735zm8dfhcyg9frd7ych2y5h38ly-mpp-git != info->path K

FWIW, the K remotely looks like the beginning of the progress bar (\r\e[K). It might have nothing to do with it, but I’ve seen it pop up in a few places because of that, so maybe that’s what’s happening here.
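
To tell a literal "K" apart from the tail of an escape sequence, the received string could be logged with its non-printable bytes made visible. The helper below is a hypothetical debugging aid, not something that exists in Nix. It assumes the suspicion above, namely that the bytes might be `\r`, ESC, `[`, `K`, which has not been confirmed in this thread.

```cpp
#include <cstdio>
#include <string>

// Hypothetical debugging helper: renders non-printable bytes as \xNN so a
// terminal or log viewer cannot interpret them as control sequences.
inline std::string escapeControlBytes(const std::string & raw)
{
    std::string out;
    for (unsigned char c : raw) {
        if (c == '\\') {
            out += "\\\\";                       // keep the output unambiguous
        } else if (c >= 0x20 && c < 0x7f) {
            out += static_cast<char>(c);         // printable ASCII passes through
        } else {
            char buf[5];                         // "\xNN" plus terminating NUL
            std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

Passing `info->path` through such a function before printing it would show `K` for a plain one-character string and `\x0d\x1b[K` if the progress bar sequence had leaked into the stream.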

andir commented 2 years ago

@Ericson2314 this is still an issue with any system that is still running 2.3 and is connecting to a builder with Nix 2.4. Any chance you could have a look? I am still sure that this is related to that refactoring you did back then.

nixos-discourse commented 2 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-2-4-and-what-s-next/16257/1

Ericson2314 commented 2 years ago

Unfortunately I didn't see this one yet with the mixed version testing I was working on. Though perhaps too many remote builder tests were skipped, and fixing that would cause this to be reproduced after all.

thufschmitt commented 2 years ago

@Ericson2314 if you’re referring to https://github.com/NixOS/nix/pull/5602, I don’t think it’s testing the exact setup (because of the daemon thing I mention in https://github.com/NixOS/nix/pull/5602#issuecomment-983361970). You’ll probably need a killDaemon at the start of the test or something like that to make sure that it’s really nixA<->nixB(remote-builder) and not nixA(client)<->nixB(daemon)<->nixB(remote-builder)
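
The difference between the two setups can be sketched as a chain of hops, looking at which adjacent hops actually run different versions. This is a toy model under the labels used in this comment (nixA, nixB); `Hop` and `mixedVersionLinks` are invented names for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One process in the connection chain and the Nix build it runs.
struct Hop { std::string role, version; };

// Returns the adjacent pairs of roles in the chain whose versions differ.
inline std::vector<std::pair<std::string, std::string>>
mixedVersionLinks(const std::vector<Hop> & chain)
{
    std::vector<std::pair<std::string, std::string>> links;
    for (std::size_t i = 0; i + 1 < chain.size(); ++i)
        if (chain[i].version != chain[i + 1].version)
            links.emplace_back(chain[i].role, chain[i + 1].role);
    return links;
}
```

For the chain client(nixA), remote-builder(nixB), the only mixed link is the one into the remote builder. With a nixB daemon in between, the mixed link moves to client and daemon, and the hop into the remote builder is nixB on both sides, so the mixed-version builder connection is not exercised.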

Ericson2314 commented 2 years ago

Oh right yes thanks.

nixos-discourse commented 2 years ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/remote-building-experienence-massively-degraded-since-2-3-months/16950/1

stale[bot] commented 2 years ago

I marked this as stale due to inactivity.