mesonbuild / meson

The Meson Build System
http://mesonbuild.com
Apache License 2.0

Ctrl-C during build can leave builddir in unrecoverable state #3511

Closed benjamin-otte closed 6 years ago

benjamin-otte commented 6 years ago

Steps to reproduce:

  1. Make sure meson --regenerate needs to be run
  2. Run ninja
  3. While ninja runs meson --regenerate, press Ctrl-C at exactly the right time
  4. Run ninja again
  5. See this: "Something went terribly wrong. Please file a bug."

After trying to track down what that error even means (the error message could really be more informative), it turns out my coredata.dat was missing.

It would be nice if Ctrl-C during a build would never leave the builddir in a non-recoverable state.

Salamandar commented 6 years ago

Happens to me all the time. Using a temporary build.ninja file and atomic replace would easily solve the problem.
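
For readers unfamiliar with the pattern, here is a minimal sketch of "temporary file plus atomic replace". It is not Meson's code; the helper name atomic_write is hypothetical, and it assumes the temp file can live in the same directory as the target so the rename stays on one filesystem.

```python
import os
import tempfile


def atomic_write(path, text):
    """Write text to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename
    # does not cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory,
                                    prefix=os.path.basename(path) + '.')
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # BaseException also covers KeyboardInterrupt (Ctrl-C): drop the
        # partial temp file and leave the original untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this shape, an interrupt before os.replace leaves the previous file in place, and an interrupt after it leaves the complete new file.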

nirbheek commented 6 years ago

> Using a temporary build.ninja file and atomic replace would easily solve the problem.

We already do this, see generate() in backends/ninjabackend.py. The issue seems to be something else.

jpakkane commented 6 years ago

A strong candidate is the save function in coredata.py.

Salamandar commented 6 years ago

I can confirm that the build.ninja file is intact. When testing today, I got:

Found ninja-1.8.2.git at /usr/bin/ninja
^CTraceback (most recent call last):
  File "/usr/bin/meson", line 4, in <module>
    __import__('pkg_resources').run_script('meson==0.47.0.dev1', 'meson')
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 658, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1438, in run_script
    exec(code, namespace, namespace)
  File "/usr/lib/python3.6/site-packages/meson-0.47.0.dev1-py3.6.egg/EGG-INFO/scripts/meson", line 26, in <module>
    sys.exit(main())
  File "/usr/lib/python3.6/site-packages/meson-0.47.0.dev1-py3.6.egg/EGG-INFO/scripts/meson", line 23, in main
    return mesonmain.run(sys.argv[1:], launcher)
  File "/usr/lib/python3.6/site-packages/meson-0.47.0.dev1-py3.6.egg/mesonbuild/mesonmain.py", line 352, in run
    app.generate()
  File "/usr/lib/python3.6/site-packages/meson-0.47.0.dev1-py3.6.egg/mesonbuild/mesonmain.py", line 132, in generate
    self._generate(env)
  File "/usr/lib/python3.6/site-packages/meson-0.47.0.dev1-py3.6.egg/mesonbuild/mesonmain.py", line 189, in _generate
    g.generate(intr)
  File "/usr/lib/python3.6/site-packages/meson-0.47.0.dev1-py3.6.egg/mesonbuild/backend/ninjabackend.py", line 220, in generate
    self.generate_target(t, outfile)
  File "/usr/lib/python3.6/site-packages/meson-0.47.0.dev1-py3.6.egg/mesonbuild/backend/ninjabackend.py", line 476, in generate_target
    elem.write(outfile)
  File "/usr/lib/python3.6/site-packages/meson-0.47.0.dev1-py3.6.egg/mesonbuild/backend/ninjabackend.py", line 120, in write
    for i in elems:
KeyboardInterrupt
ninja: error: rebuilding 'build.ninja': interrupted by user

and then, listing the files modified today:

$ find . -mtime -1 -ls
   803178      4 drwxr-xr-x  35  felix    felix        4096 mai 15 16:49 .
   801577     72 -rw-r--r--   1  felix    felix       71554 mai 15 16:49 ./meson-logs/meson-log.txt
   803204      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./etc
   803216      4 drwxr-xr-x   3  felix    felix        4096 mai 15 16:49 ./libgimpmodule
   801324      0 lrwxrwxrwx   1  felix    felix          26 mai 15 16:49 ./libgimpmodule/libgimpmodule-3.0.so.0 -> libgimpmodule-3.0.so.0.0.0
   801325      0 lrwxrwxrwx   1  felix    felix          22 mai 15 16:49 ./libgimpmodule/libgimpmodule-3.0.so -> libgimpmodule-3.0.so.0
   803217      4 drwxr-xr-x   4  felix    felix        4096 mai 15 16:49 ./libgimpthumb
   801326      0 lrwxrwxrwx   1  felix    felix          25 mai 15 16:49 ./libgimpthumb/libgimpthumb-3.0.so.0 -> libgimpthumb-3.0.so.0.0.0
   801327      0 lrwxrwxrwx   1  felix    felix          21 mai 15 16:49 ./libgimpthumb/libgimpthumb-3.0.so -> libgimpthumb-3.0.so.0
   803203      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./desktop
   803213      4 drwxr-xr-x   4  felix    felix        4096 mai 15 16:49 ./libgimpcolor
   799349      0 lrwxrwxrwx   1  felix    felix          25 mai 15 16:49 ./libgimpcolor/libgimpcolor-3.0.so.0 -> libgimpcolor-3.0.so.0.0.0
   799407      0 lrwxrwxrwx   1  felix    felix          21 mai 15 16:49 ./libgimpcolor/libgimpcolor-3.0.so -> libgimpcolor-3.0.so.0
   803190      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./cursors
   803179      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./meson-private
   801607      4 -rw-r--r--   1  felix    felix         648 mai 15 16:49 ./meson-private/gimp-3.0.pc
   801608      4 -rw-r--r--   1  felix    felix         319 mai 15 16:49 ./meson-private/gimpthumb-3.0.pc
   801613      4 -rw-r--r--   1  felix    felix        1499 mai 15 16:49 ./meson-private/meson_exe_gimp-mkenums_2401bae0f990441b88529927c24c72c5fa4dd61e.dat
   801631      4 -rw-r--r--   1  felix    felix        1568 mai 15 16:49 ./meson-private/meson_exe_gimp-mkenums_4912b054e2b3cb16052545d6cea02b8577b887e1.dat
   801625      4 -rw-r--r--   1  felix    felix         911 mai 15 16:49 ./meson-private/meson_exe_gimp-mkenums_7bf97f7151604e56f72ea7d7f2bbf598b8ad0201.dat
   801578      0 -rw-r--r--   1  felix    felix           0 mai 15 16:49 ./meson-private/meson.lock
   801609      4 -rw-r--r--   1  felix    felix         417 mai 15 16:49 ./meson-private/gimpui-3.0.pc
   801620      4 -rw-r--r--   1  felix    felix        1508 mai 15 16:49 ./meson-private/meson_exe_gimp-mkenums_330998687017fe9aa74d6f18f15c5c48c54d10c1.dat
   801612      4 -rw-r--r--   1  felix    felix        1482 mai 15 16:49 ./meson-private/meson_exe_gimp-mkenums_c476306a7a09330cb637802bca45a36830e968f5.dat
   801628      4 -rw-r--r--   1  felix    felix        1514 mai 15 16:49 ./meson-private/meson_exe_gimp-mkenums_b6a4ada126fbe6a22a31c885a2bf4a7d91acd531.dat
   801632      4 -rw-r--r--   1  felix    felix         362 mai 15 16:49 ./meson-private/meson_exe_cat_8143c3cb4c21a0d9159e13f22d17e2e452576d12.dat
   803219      4 drwxr-xr-x   4  felix    felix        4096 mai 15 16:49 ./libgimp
   801338      0 lrwxrwxrwx   1  felix    felix          18 mai 15 16:49 ./libgimp/libgimpui-3.0.so -> libgimpui-3.0.so.0
   801333      0 lrwxrwxrwx   1  felix    felix          16 mai 15 16:49 ./libgimp/libgimp-3.0.so -> libgimp-3.0.so.0
   801332      0 lrwxrwxrwx   1  felix    felix          20 mai 15 16:49 ./libgimp/libgimp-3.0.so.0 -> libgimp-3.0.so.0.0.0
   801336      0 lrwxrwxrwx   1  felix    felix          22 mai 15 16:49 ./libgimp/libgimpui-3.0.so.0 -> libgimpui-3.0.so.0.0.0
   803215      4 drwxr-xr-x   3  felix    felix        4096 mai 15 16:49 ./libgimpconfig
   801286      0 lrwxrwxrwx   1  felix    felix          26 mai 15 16:49 ./libgimpconfig/libgimpconfig-3.0.so.0 -> libgimpconfig-3.0.so.0.0.0
   801323      0 lrwxrwxrwx   1  felix    felix          22 mai 15 16:49 ./libgimpconfig/libgimpconfig-3.0.so -> libgimpconfig-3.0.so.0
   803214      4 drwxr-xr-x   3  felix    felix        4096 mai 15 16:49 ./libgimpmath
   801284      0 lrwxrwxrwx   1  felix    felix          24 mai 15 16:49 ./libgimpmath/libgimpmath-3.0.so.0 -> libgimpmath-3.0.so.0.0.0
   801285      0 lrwxrwxrwx   1  felix    felix          20 mai 15 16:49 ./libgimpmath/libgimpmath-3.0.so -> libgimpmath-3.0.so.0
   799336    492 -rw-r--r--   1  felix    felix      501338 mai 15 16:49 ./build.ninja~
   803218      4 drwxr-xr-x   4  felix    felix        4096 mai 15 16:49 ./libgimpwidgets
   801329      0 lrwxrwxrwx   1  felix    felix          23 mai 15 16:49 ./libgimpwidgets/libgimpwidgets-3.0.so -> libgimpwidgets-3.0.so.0
   801328      0 lrwxrwxrwx   1  felix    felix          27 mai 15 16:49 ./libgimpwidgets/libgimpwidgets-3.0.so.0 -> libgimpwidgets-3.0.so.0.0.0
   803251      4 drwxr-xr-x   3  felix    felix        4096 mai 15 16:49 ./plug-ins/pagecurl
   803209      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./icons/Symbolic-Inverted
   803206      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./icons/Color
   803208      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./icons/Symbolic
   803207      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./icons/Legacy
   811011      4 drwxr-xr-x   2  felix    felix        4096 mai 15 16:49 ./docs
   803181      4 drwxr-xr-x   4  felix    felix        4096 mai 15 16:49 ./libgimpbase
   799337      0 lrwxrwxrwx   1  felix    felix          24 mai 15 16:49 ./libgimpbase/libgimpbase-3.0.so.0 -> libgimpbase-3.0.so.0.0.0
   799345      0 lrwxrwxrwx   1  felix    felix          20 mai 15 16:49 ./libgimpbase/libgimpbase-3.0.so -> libgimpbase-3.0.so.0

Salamandar commented 6 years ago

Okay, a bit more investigation shows that coredata.dat no longer exists if meson exits during generate().

jpakkane commented 6 years ago

Can you test the linked MR?

Salamandar commented 6 years ago

It doesn't seem to help. coredata.dat still doesn't exist after a Ctrl-C.

Salamandar commented 6 years ago

That's strange. I tried to track when the file is deleted with

$ while true; do
> if [[ ! -f coredata.dat ]]; then echo "coredata.dat does not exist"; fi
> done

The message never appears during a normal ninja reconfigure, only when the reconfigure is aborted. I'll look for some kind of "scope guard" that deletes the file.

Salamandar commented 6 years ago

Yeah, it was right under my nose, in mesonmain.py:

        try:
            dumpfile = os.path.join(env.get_scratch_dir(), 'build.dat')
            # We would like to write coredata as late as possible since we use the existence of
            # this file to check if we generated the build file successfully. Since coredata
            # includes settings, the build files must depend on it and appear newer. However, due
            # to various kernel caches, we cannot guarantee that any time in Python is exactly in
            # sync with the time that gets applied to any files. Thus, we dump this file as late as
            # possible, but before build files, and if any error occurs, delete it.
            cdf = env.dump_coredata()
            if self.options.profile:
                fname = 'profile-{}-backend.log'.format(self.options.backend)
                fname = os.path.join(self.build_dir, 'meson-private', fname)
                profile.runctx('g.generate(intr)', globals(), locals(), filename=fname)
            else:
                #exit()
                g.generate(intr)
            build.save(b, dumpfile)
            # Post-conf scripts must be run after writing coredata or else introspection fails.
            g.run_postconf_scripts()
        except:
            if 'cdf' in locals():
                os.unlink(cdf)
            raise

If I uncomment the exit() call, I suppose the except block is called and the file is deleted.
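
The mechanism can be checked in isolation. The following is a reduced sketch, not Meson's code: configure, interrupted and survives are hypothetical names, and only the try/except shape is copied from the snippet above. It relies on the fact that a bare except also catches SystemExit and KeyboardInterrupt.

```python
import os
import sys
import tempfile


def configure(scratch_dir, generate):
    """Same shape as the snippet above: dump a file, clean it up on any error."""
    try:
        cdf = os.path.join(scratch_dir, 'coredata.dat')
        with open(cdf, 'w') as f:
            f.write('settings')
        generate()
    except:  # bare except: also catches SystemExit and KeyboardInterrupt
        if 'cdf' in locals():
            os.unlink(cdf)
        raise


def interrupted():
    # Stands in for Ctrl-C arriving while generate() is running.
    raise KeyboardInterrupt


def survives(generate):
    """Return True if coredata.dat still exists after configure() returns or raises."""
    with tempfile.TemporaryDirectory() as d:
        try:
            configure(d, generate)
        except BaseException:
            pass  # swallowed only so the demo can inspect the directory
        return os.path.exists(os.path.join(d, 'coredata.dat'))
```

In this sketch, survives(lambda: None) reports that the file is kept, while survives(interrupted) and survives(sys.exit) both report that it is gone, which matches the missing coredata.dat seen after Ctrl-C.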

That behaviour is bad, because a failed build does not mean the build cannot be fixed later. We should be able to "try reconfigure" multiple times and not be blocked by the deletion of this file.

jpakkane commented 6 years ago

Can you test again with the current version?

Salamandar commented 6 years ago

Yes, it works fine.

But the internal logic is a bit flawed here:

else:
  os.unlink(cdf)

should never occur. The coredata.dat file should always exist (acting as a .prev file) and be replaced with the temp file only when the reconfigure succeeds.

jpakkane commented 6 years ago

> The coredata.dat file should just always exist

Except on the first run. We don't want to leave it around in case of failures because it is not guaranteed to work then.

Salamandar commented 6 years ago

Agreed, I guess. But once it exists, there's no reason to remove it, right? (That's exactly what your changes do anyway.)

jpakkane commented 6 years ago

Never leave stale data around if you can help it. Sooner or later it will bite you in the ass. :)

Salamandar commented 6 years ago

Okay, but that's what d1287ed321d78848a9d544497edb5c7b239680e5 does: it re-creates coredata.dat with the old content if the reconfigure fails. (Well, only if the meson process is still alive at that point.)
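
A rough sketch of the behaviour discussed above, written from the description in this thread rather than from the commit itself: on the first run a failure leaves no file behind, and on later runs a failure puts the previous content back. The function name reconfigure and its parameters are hypothetical.

```python
import os


def reconfigure(coredata_path, new_data, generate):
    """Write new_data, run generate(); on failure restore whatever was there before."""
    old_data = None
    if os.path.exists(coredata_path):
        with open(coredata_path, 'rb') as f:
            old_data = f.read()
    try:
        with open(coredata_path, 'wb') as f:
            f.write(new_data)
        generate()
    except BaseException:  # includes KeyboardInterrupt from Ctrl-C
        if old_data is None:
            # First run: nothing known-good to fall back to, so leave no file.
            if os.path.exists(coredata_path):
                os.unlink(coredata_path)
        else:
            # Later runs: re-create the file with its previous content.
            with open(coredata_path, 'wb') as f:
                f.write(old_data)
        raise
```

As noted above, the restore step only runs if the process is still alive to execute it; a hard kill in the middle of generate() would leave the new content in place.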