klemie opened 1 month ago
Chatting with @jjderooy, we think that geeksforgeeks.org/python-os-pipe-method might be a good solution.
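For reference, here is a minimal sketch of what `os.pipe` gives us: an in-memory channel buffered by the kernel, so a reader blocks until data arrives instead of polling a file. This is a self-contained toy, not the instrumentation code; the payload is made up.

```python
import json
import os

# os.pipe() returns a connected (read_fd, write_fd) pair. Data written to
# write_fd is buffered by the kernel until it is read from read_fd.
read_fd, write_fd = os.pipe()

# Writer side: send one JSON line (small writes fit in the pipe buffer).
os.write(write_fd, (json.dumps({"PT_01": 1234}) + '\n').encode())
os.close(write_fd)

# Reader side: read until the writer closes its end of the pipe.
with os.fdopen(read_fd) as r:
    line = r.readline()

print(json.loads(line))  # {'PT_01': 1234}
```

In the real setup the generator and reader would hold opposite ends of the pipe in separate processes (e.g. via `subprocess` with the fd inherited), which removes the overwrite/re-read race on `tmp.txt` entirely.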
I did some testing; it turns out that it is not a file bottleneck. In my experiment, `generator.py` produces random JSON strings similar to the instrumentation setup and writes them to a log file, `instrumentation_data.txt`. It also continuously overwrites a `tmp.txt` file with the latest JSON data, which is continuously read by a separate script called `reader.py`. `reader.py` logs the data read from `tmp.txt` to `read_data.txt`.

When calling `reader.py`, then `generator.py`, from separate Python instances, with `generator.py` writing at about 1000 Hz, `reader.py` does not miss a single line written to `tmp.txt`. This was checked by comparing `instrumentation_data.txt` with `read_data.txt` using diff. They both match.
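The same check can be scripted with the standard library, which is handy if we want to rerun it automatically. This is a self-contained sketch; the two tiny sample files stand in for `instrumentation_data.txt` and `read_data.txt`.

```python
import filecmp

# Stand-in inputs so this snippet runs on its own; in the real check these
# would be instrumentation_data.txt and read_data.txt.
with open('a.txt', 'w') as f:
    f.write('{"PT_01": 1}\n{"PT_01": 2}\n')
with open('b.txt', 'w') as f:
    f.write('{"PT_01": 1}\n{"PT_01": 2}\n')

# shallow=False compares file contents, not just os.stat() metadata.
same = filecmp.cmp('a.txt', 'b.txt', shallow=False)
print('files match' if same else 'files differ')  # files match
```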
I think the real bottleneck is either websocket related or plotting related on the UI frontend. I don't know much about either of them, but we can at least test the websocket throughput by just logging the data to a file rather than trying to display it.

That isn't to say that we shouldn't use pipes as Kris suggested, just that to solve our bottleneck we need to look elsewhere, and that should be the priority before refining the file reading spaghetti.
```python
import json
import random
import time

# generator.py: writes random JSON at ~1000 Hz, appending every line to
# instrumentation_data.txt and overwriting tmp.txt with only the latest line.
with open('instrumentation_data.txt', 'a') as file:
    for _ in range(10000):
        data = {f"PT_{i:02}": random.randint(0, 10000) for i in range(1, 11)}
        json.dump(data, file)
        file.write('\n')
        with open('tmp.txt', 'w') as tmp:
            json.dump(data, tmp)
            tmp.write('\n!')  # sentinel second line marks a completed write
        time.sleep(0.001)
```
```python
import json
import time

# reader.py: polls tmp.txt and appends each new line to read_data.txt.
# read_data.txt should be a copy of instrumentation_data.txt.
with open('read_data.txt', 'w') as file:
    last_line = ""
    while True:
        with open('tmp.txt', 'r') as tmp:
            lines = tmp.readlines()
        # Only accept a completed write (the '!' sentinel line is present)
        # and skip lines that were already logged.
        if len(lines) > 1 and lines[0] != last_line:
            last_line = lines[0]
            file.write(lines[0])
```
@klemie could you test to see if it is a websocket, file IO, or async bottleneck by commenting out different parts of this code in `wss.py`?
```python
# Excerpt from wss.py (json and asyncio are imported at module level).
async def __instrumentation_handler(self, websocket):
    while True:
        with open('instrumentation/tmp.txt', 'r') as file:
            lines = file.readlines()
        if len(lines) > 1:
            await websocket.send(json.dumps({
                "identifier": "INSTRUMENTATION",
                "data": json.loads(lines[0])
            }))
        await asyncio.sleep(0.1)
```
- Comment out `websocket.send` and replace it with a print statement.
- Comment out `with open('instrumentation/tmp.txt', 'r') as file:` and replace `lines` with a dummy string of JSON of approximately equivalent length.
- Comment out the `async` function and replace it with a print statement and see how fast it'll rip.

These aren't perfect tests, but maybe they'll give enough of an idea of where the slowness is coming from.
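The second test (dummy data, no file read) can also be sketched standalone, outside `wss.py`. The handler name and `DUMMY` payload below are assumptions, not the actual code; `await asyncio.sleep(0)` stands in for the real send so we time only serialization and loop overhead.

```python
import asyncio
import json
import time

# Dummy payload of roughly the same shape/length as one instrumentation line.
DUMMY = {f"PT_{i:02}": 9999 for i in range(1, 11)}

async def fake_handler(n_iterations=1000):
    """Time the handler loop with no file IO and no real socket."""
    start = time.perf_counter()
    for _ in range(n_iterations):
        payload = json.dumps({"identifier": "INSTRUMENTATION", "data": DUMMY})
        await asyncio.sleep(0)  # yield to the event loop instead of sending
    elapsed = time.perf_counter() - start
    return n_iterations / elapsed  # messages per second

rate = asyncio.run(fake_handler())
print(f"~{rate:.0f} dummy messages/s without file IO or a real socket")
```

One thing worth noting independently of any test: the `await asyncio.sleep(0.1)` in the real handler caps it at roughly 10 messages per second, while the generator writes at ~1000 Hz, so that sleep alone may account for much of the apparent bottleneck.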
Description

Currently we have a bottleneck in our infrastructure. We are taking advantage of files, like an OS does, to get info from the LabJack to the websocket. We've tried async Python and it made things much worse and more complicated, so we simplified to file reading and writing. Unfortunately we seem to have a deadlock issue: we try to read from the file at a rate higher than it can be decoded/written. This ticket goes into one of two solutions:
Acceptance Criteria
QA Notes
Linked issues