SeanezLab / BoMI-StartReact

https://seanezlab.github.io/BoMI-StartReact/
0 stars 0 forks source link

Use generalized Packet class #21

Closed seasonedfish closed 1 year ago

seasonedfish commented 1 year ago

In the 0.3.0 refactoring, I replaced the yost packet data type with dictionaries so that it was generalized to any device manager.

This was a bit of a hacky solution, as the buffer and scope widget expect certain fields, and there was no good way to document/specify this.

With a new, generalized Packet class, we specify that implementing device mangers should put Packet objects that have fields time, device_name, and channel_readings into the passed-in queue.

seasonedfish commented 1 year ago

@seasonedfish you forgor to add Packet to the type annotations in a couple of places 💀

kwsp commented 1 year ago

Did you benchmark the performance of the new Packet class with the old? I'm curious to see the CPU time and memory usage. The old Packet class was implemented as a NamedTuple, which is basically a tuple and extremely cheap. My concerns wrt the new Packet:

  1. @dataclass(frozen=True) is a very weak emulation of immutable datastructure. Specifically, it just raises an exception when assigning to a field in the DC. This gives the false sense that the object is immutable, while the sensor data is actually stored in a dictionary channel_readings, which is still mutable.
  2. Use should use @dataclass(slots=True). Using slots in a python class eliminates the per class attribute dict, and instead allows the class to store attributes in a tuple. Saves a ton of memory, especially for the Packet datastructure. See benchmark below.
  3. Storing attributes in a dict is a bit iffy to me. Dict's are hash tables that take up a lot of memory, and reading/writing to them is another layer of indirection, compared to a tuple.

I wrote some toy code to compare the memory usage of dataclasses and NamedTuples. Dataclasses in my test uses about 4 times more memory than NamedTuples. This may or may not matter in the end - only a real world benchmark will tell. If you think the abstraction is worth it

1 Dataclass:                    904 bytes
1 Dataclass (slots):            584 bytes
1 NamedTuple:                   208 bytes
Dataclass/NamedTuple            4.35x more
Dataclass (slots)/NamedTuple    2.81x more

1 Dataclass:                    393376 bytes
1 Dataclass (slots):            305144 bytes
1 NamedTuple:                   96976 bytes
Dataclass/NamedTuple            4.06x more
Dataclass (slots)/NamedTuple    3.15x more
from typing import Any, NamedTuple
from dataclasses import dataclass

@dataclass
class C_dc:
    x: int
    y: int
    z: int

    readings: dict[str, Any]

@dataclass(slots=True)
class C_dc_slots:
    x: int
    y: int
    z: int

    readings: dict[str, Any]

class C_nt(NamedTuple):
    x: int
    y: int
    z: int

    roll: float
    pitch: float
    yaw: float

dc = C_dc(1, 2, 3, readings={"roll": 1.0, "pitch": 1.0, "yaw": 1.0})
dc_slots = C_dc_slots(1, 2, 3, readings={"roll": 1.0, "pitch": 1.0, "yaw": 1.0})
nt = C_nt(1, 2, 3, roll=1.0, pitch=1.0, yaw=1.0)

from pympler import asizeof

nbytes_dc = asizeof.asizeof(dc)
nbytes_dc_slots = asizeof.asizeof(dc_slots)
nbytes_nt = asizeof.asizeof(nt)
print(f"1 Dataclass:                \t{nbytes_dc} bytes")
print(f"1 Dataclass (slots):        \t{nbytes_dc_slots} bytes")
print(f"1 NamedTuple:               \t{nbytes_nt} bytes")
print(f"Dataclass/NamedTuple        \t{nbytes_dc / nbytes_nt:.2f}x more")
print(f"Dataclass (slots)/NamedTuple \t{nbytes_dc_slots / nbytes_nt:.2f}x more")

l_dc = []
l_dc_slots = []
l_nt = []
for _ in range(1000):
    l_dc.append(C_dc(1, 2, 3, readings={"roll": 1.0, "pitch": 1.0, "yaw": 1.0}))
    l_dc_slots.append(C_dc_slots(1, 2, 3, readings={"roll": 1.0, "pitch": 1.0, "yaw": 1.0}))
    l_nt.append(C_nt(1, 2, 3, roll=1.0, pitch=1.0, yaw=1.0))

nbytes_dc = asizeof.asizeof(l_dc)
nbytes_dc_slots = asizeof.asizeof(l_dc_slots)
nbytes_nt = asizeof.asizeof(l_nt)
print("\n")
print(f"1 Dataclass:                \t{nbytes_dc} bytes")
print(f"1 Dataclass (slots):        \t{nbytes_dc_slots} bytes")
print(f"1 NamedTuple:               \t{nbytes_nt} bytes")
print(f"Dataclass/NamedTuple        \t{nbytes_dc / nbytes_nt:.2f}x more")
print(f"Dataclass (slots)/NamedTuple \t{nbytes_dc_slots / nbytes_nt:.2f}x more")
seasonedfish commented 1 year ago

I did briefly compare the the throughput and framerate before and after the PR (there was no significant regression), but not between the old and new Packet. That's a good idea, I'll get back to you on that.

The reason I went with a Data Class over a NamedTuple is that, in my opinion, it doesn't make sense for the Packet class to be a tuple. It should never be iterated over, and it should never be unpacked using tuple unpacking.

This gives the false sense that the object is immutable, while the sensor data is actually stored in a dictionary channel_readings, which is still mutable.

That's a good point, and I'll keep that in mind. Hopefully though, if I'm about to write packet.channel_readings[channel] = x, it'd set off warning bells loud enough to prevent this from being an issue haha.

should use @dataclass(slots=True)

I'll look into that, thanks.

Storing attributes in a dict is a bit iffy to me. Dict's are hash tables that take up a lot of memory, and reading/writing to them is another layer of indirection, compared to a tuple.

True, but I'm not sure how we could store the channel readings such that each device manager can set its own attributes, and the scope widget can find the readings no matter which device manager the packets come from. Maybe the device manager could return a class object to the scope widget? Seems overly complicated.

seasonedfish commented 1 year ago

Comparing the old (d566268f8dc1aecf170a0405c106ea0617e149ac) and current (ecd71d3db737cabeb5a9a6894aadeb83a7fdca36) versions, the average packet throughput is nearly the same, while the average framerate is slightly lower (156 fps vs 149 fps) on the current version.

Curiously, on the current version, changing Packet to inherit from typing.NamedTuple (and removing the @dataclass annotation) also yields an average framerate of 149 fps. So, using Data Classes seems to be alright–the bottleneck should be somewhere else.