intel / murphy

Resource Policy

murphyd segfaults #3

Closed pohly closed 10 years ago

pohly commented 10 years ago

murphyd segfaults here while using the D-Bus API. "Here" is some SUSE flavor (I am not sure exactly which; ask if it matters). It has g++ (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012], and murphyd was compiled from source (latest as of today, fb72a9f1) with "-g -O2".

It looks like a jump to invalid memory, both in gdb and in valgrind:

Program received signal SIGSEGV, Segmentation fault.
0x000000000066c450 in ?? ()
(gdb) where
#0  0x000000000066c450 in ?? ()
#1  0x0000000000000000 in ?? ()

==25826== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
vex amd64->IR: unhandled instruction bytes: 0x6D 0x65 0x6D 0x6F 0x74 0x6F 0x6F 0x0
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==25833== valgrind: Unrecognised instruction at address 0x793dbc0.
==25833==    at 0x793DBC0: ???
==25833== Your program just tried to execute an instruction that Valgrind
==25833== did not recognise.  There are two possible reasons for this.
==25833== 1. Your program has a bug and erroneously jumped to a non-code
==25833==    location.  If you are running Memcheck and you just saw a
==25833==    warning about a bad jump, it's probably your program's fault.
==25833== 2. The instruction is legitimate but Valgrind doesn't handle it,
==25833==    i.e. it's Valgrind's fault.  If you think this is the case or
==25833==    you are not sure, please let us know and we'll try to fix it.
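The "unhandled instruction bytes" in the valgrind output are worth a second look, because they fall entirely in the printable ASCII range. The short Python check below decodes them. It only reinterprets the bytes quoted in the report; the variable names are illustrative.

```python
# Instruction bytes copied from the "unhandled instruction bytes" line
# of the valgrind output in this report.
raw = bytes([0x6D, 0x65, 0x6D, 0x6F, 0x74, 0x6F, 0x6F, 0x00])

# Drop the trailing NUL terminator and read the rest as ASCII text.
text = raw.rstrip(b"\x00").decode("ascii")
print(text)  # prints "memotoo"
```

The result matches the name of the last non-shareable resource class defined in murphy-ref.lua. That suggests, though it does not prove, that execution jumped into string data rather than code, which would fit valgrind's first explanation (a bad jump in the program).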

It works when I compile with "-g", i.e. without optimization.

To reproduce:

$ murphyd -f -c murphy-lua.conf &
$ kill-murphy.py

Setup:

$ cat testing/murphy-lua.conf
#set resolver-ruleset 'src/resolver/test-input'
load-plugin lua config="/home/nightly/testing/murphy-ref.lua"

nightly@syncev:~$ cat testing/murphy-ref.lua
m = murphy.get()

-- try loading console plugin
m:try_load_plugin('console')

-- m:try_load_plugin('console', 'webconsole', {
--                               address = 'wsck:127.0.0.1:3000/murphy',
--                               httpdir = 'src/plugins/console',
-- --                              sslcert = 'src/plugins/console/console.crt',
-- --                              sslpkey = 'src/plugins/console/console.key'
--          })

-- m:try_load_plugin('systemd')

-- load a test plugin
if m:plugin_exists('test.disabled') then
    m:load_plugin('test', {
                       string2  = 'this is now string2',
                       boolean2 = true,
                       int32 = -981,
                       double = 2.73,
                       object = {
                           foo = 1,
                           bar = 'bar',
                           foobar = 3.141,
                           barfoo = 'bar foo',
                           array = { 'one', 'two', 'three',
                                     { 1, 'two', 3, 'four' } },
                           yees = true,
                           noou = false
                       }
                 })
--    m:load_plugin('test', 'test2')
--    m:info("Successfully loaded two instances of test...")
end

-- load the dbus plugin if it exists
-- if m:plugin_exists('dbus') then
--     m:load_plugin('dbus')
-- end

-- load glib plugin, ignoring any errors
-- m:try_load_plugin('glib')

-- load the native resource plugin
if m:plugin_exists('resource-native') then
    m:load_plugin('resource-native')
    m:info("native resource plugin loaded")
else
    m:info("No native resource plugin found...")
end

-- load the dbus resource plugin
if m:plugin_exists('resource-dbus') then
    m:try_load_plugin('resource-dbus', {
        dbus_bus = "session",
        dbus_service = "org.Murphy",
        dbus_track = true,
        default_zone = "driver",
        default_class = "implicit"
      })
    m:info("dbus resource plugin loaded")
else
    m:info("No dbus resource plugin found...")
end

-- load the WRT resource plugin
if m:plugin_exists('resource-wrt') then
    m:try_load_plugin('resource-wrt', {
                          address = "wsck:127.0.0.1:4000/murphy",
                          httpdir = "src/plugins/resource-wrt",
--                          sslcert = 'src/plugins/resource-wrt/resource.crt',
--                          sslpkey = 'src/plugins/resource-wrt/resource.key'
                      })
else
    m:info("No WRT resource plugin found...")
end

-- load the domain control plugin if it exists
if m:plugin_exists('domain-control') then
    m:load_plugin('domain-control')
else
    m:info("No domain-control plugin found...")
end

-- -- load the domain control plugin if it exists
-- if m:plugin_exists('domain-control') then
--     m:try_load_plugin('domain-control', 'wrt-export', {
--         external_address = '',
--         internal_address = '',
--         wrt_address = "wsck:127.0.0.1:5000/murphy",
--         httpdir     = "src/plugins/domain-control"
--     })
-- else
--     m:info("No domain-control plugin found...")
-- end

-- define application classes
application_class { name="implicit" , priority=0 , modal=false, share=true , order="lifo" }

-- define zone attributes
zone.attributes {
}

-- define zones
zone {
     name = "driver"
}

-- define resource classes
resource.class {
     name = "audio_playback",
     shareable = true,
     attributes = {
         role = { mdb.string, "music", "rw" },
         pid = { mdb.string, "<unknown>", "rw" },
         policy = { mdb.string, "relaxed", "rw" }
     }
}

-- SyncEvolution resources: one per runtest.py
-- Some tests can run in parallel. Those resources are shareable.
for i,v in pairs {
    -- compiling the source on one platform
    "compile",

    -- checking out source
    "libsynthesis",
    "syncevolution",
    "activesyncd",

    -- local tests
    "evolution",
    "dbus",
    "pim",
    } do
    resource.class {
        name = v,
        shareable = true
    }
end

-- TODO (in runtests.py): some of these resources overlap
for i,v in pairs {
    -- tests involving unique peers
    "googlecalendar",
    "googlecontacts",
    "owndrive",
    "yahoo",
    "oracle",
    "davical",
    "apple",
    "googleeas",
    "exchange",
    "edsfile",
    "edseds",
    "edsxfile",
    "davfile",
    "edsdav",
    "mobical",
    "memotoo",
    } do
    resource.class {
        name = v,
        shareable = false
    }
end

-- test for creating selections: don't remove, murphyd won't start without it
-- (E: Failed to enable resolver autoupdate.)
mdb.select {
           name = "audio_owner",
           table = "audio_playback_owner",
           columns = {"application_class"},
           condition = "zone_name = 'driver'",
}

$ cat testing/kill-murphy.py
#! /usr/bin/python -u

gobject = None
try:
    import gobject
except ImportError:
    try:
         from gi.repository import GObject as gobject
    except ImportError:
         pass
import dbus
from dbus.mainloop.glib import DBusGMainLoop
DBusGMainLoop(set_as_default=True)
bus = dbus.SessionBus()
loop = gobject.MainLoop()
murphy = dbus.Interface(bus.get_object('org.Murphy', '/org/murphy/resource'), 'org.murphy.manager')
path = murphy.createResourceSet()
resourceset = dbus.Interface(bus.get_object('org.Murphy', path), 'org.murphy.resourceset')
resourceset.addResource('libsynthesis') # This resource must exist, otherwise this script hangs!
resourceset.request()
while resourceset.getProperties()['status'] == 'pending':
    loop.get_context().iteration(True)
resourceset.release()
resourceset.delete()

pohly commented 10 years ago

While I am at it: murphyd refuses to start when I remove the (useless) mdb.select { "audio_owner" } from murphy-ref.lua.

It prints "E: Failed to enable resolver autoupdate." and then quits.

pohly commented 10 years ago

And one more question related to the Python test script: if the resource name is something that hasn't been defined, the script just hangs waiting for a status change forever. Is that intentional? Wouldn't it be better to report that as an error, ideally by throwing an error in request()?
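Until the daemon reports unknown resource names as errors, a client can at least avoid hanging forever by bounding the wait. The following is a minimal sketch in Python 3, not part of Murphy or its D-Bus API: `wait_until_settled`, `get_status` and `iterate` are hypothetical names, and the only status value taken from the script above is 'pending'.

```python
import time


def wait_until_settled(get_status, iterate, timeout=10.0, clock=time.monotonic):
    """Poll get_status() until it stops returning 'pending'.

    get_status: callable returning the current status string.
    iterate:    callable that lets the main loop process pending events.
    timeout:    maximum time to wait, in seconds.
    Raises TimeoutError if the status is still 'pending' at the deadline.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status != 'pending':
            return status
        if clock() >= deadline:
            raise TimeoutError(
                "status still 'pending' after %.1f seconds" % timeout)
        iterate()
```

In the test script, `get_status` could be `lambda: resourceset.getProperties()['status']`. For `iterate`, a non-blocking `loop.get_context().iteration(False)` followed by a short `time.sleep()` is probably the safer choice, because `iteration(True)` can block indefinitely when no event arrives, and the deadline would then never be checked.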

ipuustin commented 10 years ago

Thanks for the bug report. I'll take a look at the D-Bus segfault. I think the autoupdate enabling failure would warrant a completely new issue!

klihub commented 10 years ago

Thanks, I'll take a look at that. My initial guess is that it bails out if the generated resolver ruleset is empty, but I'll take a closer look.

Cheers, kli

On 10 Dec 2013, at 22:02, Patrick Ohly notifications@github.com wrote:

While I am at it: murphyd refuses to start when I remove the (useless) mdb.select { "audio_owner" } from the murphy-ref.lua.

It says: E: Failed to enable resolver autoupdate. then quits.

ipuustin commented 10 years ago

Alright, I think I found the bug. Could you try the latest master to see if the problem persists? Thanks.

pohly commented 10 years ago

Yes, master seems to work now even when compiled with -O2.

ipuustin commented 10 years ago

Ok, I'm closing this issue.