leanprover-community / lean-client-python

Python talking to the Lean theorem prover
Apache License 2.0
39 stars 6 forks source link

Sometimes only half a message is in the buffer #1

Closed jasonrute closed 4 years ago

jasonrute commented 4 years ago

If a message is large, the stdout buffer of the lean server may only contain half a message. Here is code to reproduce the problem:

from pathlib import Path

import trio # type: ignore
from lean_client.trio_server import TrioLeanServer

async def main():
    async with trio.open_nursery() as nursery:
        server = TrioLeanServer(nursery, debug=False)
        await server.start()

        for i in range(1000):
            path = f'/tmp/lean_tmp_{i}.lean'
            with open(path, 'w') as f:
                f.write(f"example : {i}={i} :=\nbegin end")

            await server.full_sync(path)
            state = await server.state(path, 2, 0)
            print(state)
        nursery.cancel_scope.cancel()

if __name__ == '__main__':
    trio.run(main)

The fix is to change the loops in receiver as follows:

    async def receiver(self):
        """This task waits for Lean responses, updating the server state
        (tasks and messages) and triggering events when a response comes."""
        if not self.process:
            raise ValueError('No Lean server')
        unfinished_message = b''
        async for data in self.process.stdout:
            lines = (unfinished_message + data).split(b'\n')
            unfinished_message = lines.pop()  # usually empty, but can be half a message
            for line in lines:
                resp = parse_response(line.decode())
                if self.debug:
                    print(f'Received {resp}')
                if isinstance(resp, CurrentTasksResponse):
                    self.current_tasks = resp.tasks
                    if not resp.is_running:
                        self.is_fully_ready.set()
                elif isinstance(resp, AllMessagesResponse):
                    self.messages = resp.msgs
                if hasattr(resp, 'seq_num'):
                    self.responses[resp.seq_num] = resp
                    self.response_events[resp.seq_num].set()
jasonrute commented 4 years ago

I would make a PR, but I'm not really sure how to do that when I don't have permission to the repo.

PatrickMassot commented 4 years ago

The standard way to PR on GitHub is to fork the repo and open the PR from your fork. But I'll move the repo and give you access anyway.

PatrickMassot commented 4 years ago

If we start making this experiment more robust, we should also add tests.

jasonrute commented 4 years ago

The standard way to PR on GitHub is to fork the repo and open the PR from your fork. But I'll move the repo and give you access anyway.

Yeah, at work we use GitHub but not forks, so I didn't realize that is how you do it. Now that I have permission, I'll just work directly in branches on this repo.