OFFIS-DAI / Mango.jl

Modular Julia-based agent framework to implement multi-agent systems
https://offis-dai.github.io/Mango.jl/stable/
MIT License

send_message can hang when using kwargs for meta #29

Closed jsagerOffis closed 3 months ago

jsagerOffis commented 4 months ago

When send_message is called with keyword arguments from an agent or container, the function can block indefinitely for no obvious reason. This may be related to (or the same issue as) the indefinite hanging of TCP communication in the TCP example code on the mqtt branch.

The call works fine when send_message is invoked from the main function of the program but fails when an equivalent call is made from handle_message.

The whole approach of using kwargs... for meta fields should perhaps be revised anyway. It is probably quite prone to such errors, because kwargs... collects the keyword arguments keyed by Symbol, and we convert those keys to strings using the symbol names. This is already unwieldy when we want to pass a field whose name is given by some constant, and we never really want these fields as symbols anyway.

Maybe just switch to an explicit Dict{String, Any} field to pass meta fields into?
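To make the trade-off concrete, here is a minimal sketch of the two calling styles. Both helper functions (`meta_from_kwargs`, `meta_from_dict`) and the key name are hypothetical and written only for illustration; they are not Mango.jl's API and do not reflect how send_message is actually implemented.

```julia
# Hypothetical helpers for illustration only; neither is part of Mango.jl.

# Keyword-based variant: the meta keys arrive as Symbols and are converted to Strings.
function meta_from_kwargs(; kwargs...)
    return Dict{String,Any}(String(k) => v for (k, v) in kwargs)
end

# Explicit variant: the caller passes String keys directly.
function meta_from_dict(meta::Dict{String,Any}=Dict{String,Any}())
    return copy(meta)
end

# A field name held in a variable, standing in for "a name given by some constant".
PERFORMATIVE_KEY = "performative"

# With kwargs, the name must first be turned into a Symbol and passed as a pair:
m1 = meta_from_kwargs(; Symbol(PERFORMATIVE_KEY) => "inform")

# With an explicit Dict, the name is used as-is:
m2 = meta_from_dict(Dict{String,Any}(PERFORMATIVE_KEY => "inform"))

println(m1 == m2)
```

Both calls build the same String-keyed dictionary; the explicit variant simply skips the String to Symbol to String round trip.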

jsagerOffis commented 4 months ago

Addendum: The problem may be unrelated to kwargs. It may just have been random chance that adding or removing kwargs sometimes let a message pass through and sometimes did not. This seems to be a general issue with TCP send_message on the development branch.

jsagerOffis commented 4 months ago

I got a working example of the error happening without kwargs. It seems to be related to calling send_message from handle_message. See the example code below (it does not terminate, but Billy gets the reply message).

Note that the problem is not the wait on send_message in main: the program does not terminate without it either. Adding more threads (in case something was simply not releasing the single thread) did not make it terminate either.

using Mango
using Sockets: InetAddr, @ip_str
import Mango.AgentCore.handle_message

@agent struct ReplyAgent

end

@agent struct LazyAgent

end

function handle_message(agent::ReplyAgent, message::Any, meta::Any)
    reply_addr = AgentAddress(meta["sender_id"], meta["sender_addr"], nothing)
    println("$(agent.aid) got a message \nwith content $message \nand meta $meta")

    send_message(agent, "a reply!", reply_addr)
end

function handle_message(agent::LazyAgent, message::Any, meta::Any)
    println("$(agent.aid) got a message \nwith content $message \nand meta $meta")
    println("But will do nothing!")
end

function main()
    c1_addr = InetAddr(ip"127.0.0.1", 5555)
    c1 = Container()
    c1.protocol = TCPProtocol(address=c1_addr)

    c2_addr = InetAddr(ip"127.0.0.1", 5556)
    c2 = Container()
    c2.protocol = TCPProtocol(address=c2_addr)

    a1 = ReplyAgent()
    a2 = LazyAgent()

    register(c1, a1, "Timmy")
    register(c2, a2, "Billy")

    # start container loop
    wait(Threads.@spawn start(c1))
    wait(Threads.@spawn start(c2))

    timmy_addr = AgentAddress("Timmy", c1_addr, nothing)
    billy_addr = AgentAddress("Billy", c2_addr, nothing)

    # send some messages
    wait(send_message(a2, "hello", timmy_addr))

    # stop container loop
    wait(Threads.@spawn shutdown(c1))
    wait(Threads.@spawn shutdown(c2))
end

main()

jsagerOffis commented 4 months ago

With both agents moved to a single container, the program terminates! So the problem involves the multi-container setup combined with this send_message call inside handle_message.
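When narrowing down which wait call hangs, it can help to bound each wait with a timeout instead of blocking forever. The following is a rough sketch using only Julia's Base library. `wait_with_timeout` is a hypothetical helper, not part of Mango.jl, and the two stand-in tasks only simulate a call that finishes and one that hangs. It assumes the thing being waited on is a Task.

```julia
# Hypothetical helper, not part of Mango.jl:
# returns true if `task` finished within `timeout` seconds, false otherwise.
function wait_with_timeout(task::Task, timeout::Real)
    status = timedwait(() -> istaskdone(task), Float64(timeout); pollint=0.05)
    return status === :ok
end

# Stand-in tasks. In the reproduction above these would be the tasks created by
# Threads.@spawn or (assuming it returns a Task) by send_message.
quick = @async sleep(0.1)
stuck = @async wait(Channel{Nothing}(0))  # nothing is ever put, so this never finishes

finished_quick = wait_with_timeout(quick, 2.0)
finished_stuck = wait_with_timeout(stuck, 0.5)

println("quick finished: ", finished_quick)
println("stuck finished: ", finished_stuck)
```

Replacing each bare wait(...) in main with such a bounded wait would show which step never completes, without changing the behaviour being investigated.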