root-11 / maslite

A very fast multi agent messaging kernel in Python
MIT License
11 stars 4 forks source link

Revising throughput benchmarks after new subscribe system. #17

Open root-11 opened 9 months ago

root-11 commented 9 months ago

minimal test benchmark

# scratch1.py
from maslite import Agent, Scheduler, AgentMessage

class Msg(AgentMessage):
    def __init__(self, sender, receiver=None, topic=None):
        super().__init__(sender, receiver, topic)
        self.value = 0

class A(Agent):
    def __init__(self, uuid=None):
        super().__init__(uuid)

    def update(self):
        if self.messages:
            m = self.receive()
            assert isinstance(m, Msg)
            m.value += 1
            m.receiver, m.sender = m.sender, m.receiver
            self.send(m)

if __name__ == "__main__":

    s = Scheduler()
    a = A()
    b = A()
    s.add(a)
    s.add(b)
    m = Msg(a,b)
    a.send(m)
    s.run(seconds=10)
    print(f"{m.value/10:,}")
    # python3.10 = 323,611.8
    # pypy3.10 = 3,135,626.1

barebone benchmark

# scratch3.py
from itertools import count
from collections import deque
from time import time

class Msg:
    def __init__(self,s,r) -> None:
        self.sender = s
        self.receiver = r
        self.value = 0

class Agent:
    ids = count(start=1)
    def __init__(self) -> None:
        self.id = next(Agent.ids)
        self.messages = deque()
        self._scheduler = None
    def receive(self):
        return self.messages.popleft()
    def send(self, msg):
        self._scheduler.msg_q.append(msg)
    def update(self):
        if self.messages:
            m = self.receive()
            assert isinstance(m, Msg)
            m.value += 1
            m.receiver, m.sender = m.sender, m.receiver
            self.send(m)

class Scheduler:
    def __init__(self) -> None:
        self.msg_q = []
        self.agents = {}

    def add(self, agent):
        assert isinstance(agent, Agent)
        self.agents[agent.id] = agent
        agent._scheduler = self

    def run(self, seconds):
        start = time()

        loops = 0
        while True:
            loops +=1 
            active_agents = set()
            for m in self.msg_q[:]:
                receiver = self.agents[m.receiver]
                receiver.messages.append(m)
                active_agents.add(m.receiver)
            self.msg_q.clear()
            for id in active_agents:
                self.agents[id].update()

            if loops % 10 == 0:
                loops = 0
                if start + seconds > time():
                    continue
                else:
                    return

if __name__ == "__main__":

    s = Scheduler()
    a = Agent()
    b = Agent()
    s.add(a)
    s.add(b)
    m = Msg(a.id,b.id)
    a.send(m)
    s.run(seconds=10)
    print(f"{m.value/10:,}")   
    # python3.10 = 1,285,197.0
    # pypy3.10 = 15,908,693.0

Summarized:

maslite now lower bound diff
python310 323,611.8 1,285,197.0 4x
pypy310 3,135,626.1 15,908,693.0 5x

Based on this observation I would like the workgroup to consider changing the implementation to use at 3 message types

class Msg:  # used for 1:1 communication
    def __init__(self,s,r) -> None:
        self.sender = s
        self.receiver = r

class BroadCast:  # used for topic based broadcasts, e.g. one to many
    def __init__(self, s, topic) -> None:
        self.sender = s
        self.topic = topic

class Forum:  # used for narrow band broadcasting 
    def __init__(self,s,forum):
        self.sender = s
        self.forum = forum

as I believe that these three message types will cover all needs and not slow down maslite to the degree which the current patch does.

root-11 commented 8 months ago

Proposals: Disable line 755:

self.subscribe(subscriber=agent.uuid, receiver=None, topic=agent.__class__.__name__)
04t02 commented 8 months ago

After extensive testing, see results.

To benchmark a counter was placed on the distribution of the original message (in send_to_recipients in the Scheduler) - this avoids counting the message duplicates which are sent to subscribers.

Subscribers can subscribe to 2 of sender, receiver, topic. In the table below (True, True, False) indicates a subscriber to the sender receiver combination for any message topic.

image

Profiling the get_mail_recipients function showed that it was responsible for 14 - 16% of the total runtime.

I opted for updating the get_mail_recipients function to deliver an equivalent minimum functionality (akin to that shown in the bare bone benchmark). The function was as follows:

def get_mail_recipients(self, message): return [message.receiver]

The result for direct messages performance was 540, 311 msgs/second.

This is interesting for 2 reasons:

Targeting improvements to get_mail_recipients function:

  1. Currently on MailingList.add() the Agent is subscribed to itself and its class. The first part of this results in there being no distinction between direct messages and subscriptions - all messages are distributed as if they are subscriptions. However, we send significantly more direct messages than we have subscribers, so why not fast-track this? Secondly, why does the Scheduler decide that it is of value to have the agents subscribed to messages targeting their class? This shouldn't be the responsibility of the Scheduler. Experiment 1 removes this.
  2. When subscription is registered it is possible to flag the types of subscription present in the system. This flag can then be used during get_mail_recipients to determine if there is even a need to fetch the directory and potentially prevent the retrieval of the directory completely (for (False, False, False) cases). Experiment 2 introduces this alongside shortcutting the direct messages.
  3. Introduce a separate message type which cannot be subscribed to. This will prevent the directory being fetched for this message type. NOTE: This message type would serve for the DiscreteEventMessage and other "kick" type messages, potentially even the NAT messages. Experiment 3 uses only the DirectOnly message type - this can be seen in the subscription levels not having any influence on the system performance.
  4. Send 50% DirectOnly and 50% subscription-eligible messages. Experiment 4.

image

One final investigation I carried out to shed a little more light on the efficiency of the nested dict vs the flag system By counting the conditionals we can see how many calls are made using each of the methods and whether or not the flags add value.

{'results_no_flag': {'mean': 7.165354330708661, 'median': 5.0, 'mode': 5, 'min': 2, 'max': 13, 'complete_miss_mean': 4.267716535433071, 'complete_miss_median': 5, 'complete_miss_mode': 5, 'complete_miss_min': 2, 'complete_miss_max': 5, 'hit_mean': 10.062992125984252, 'hit_median': 11, 'hit_mode': 11, 'hit_min': 2, 'hit_max': 13, ('fff_fft', 'complete_miss'): 5, ('fff_fft', 'hit'): 5}, 'results_with_flag': {'mean': 13.05511811023622, 'median': 12.0, 'mode': 12, 'min': 6, 'max': 24, 'complete_miss_mean': 11.039370078740157, 'complete_miss_median': 11, 'complete_miss_mode': 10, 'complete_miss_min': 6, 'complete_miss_max': 16, 'hit_mean': 15.070866141732283, 'hit_median': 15, 'hit_mode': 15, 'hit_min': 6, 'hit_max': 24, ('fff_fft', 'complete_miss'): 9, ('fff_fft', 'hit'): 9}} The results show that in all cases we are processing more conditional statements with the flags than without.

04t02 commented 8 months ago

I have pushed a release which takes advantage of the direct message style (2023.1.0), the new benchmarked results are:

python3.9 benchmarks.py - 480,440.9 messages/second

pypy3 benchmarks.py - 6,441,251.9 messages/second