Opentrons / opentrons

Software for writing protocols and running them on the Opentrons Flex and Opentrons OT-2
https://opentrons.com
Apache License 2.0

bug: Repeatedly running protocols causes "can't start new thread" error #8087

Open SyntaxColoring opened 3 years ago

SyntaxColoring commented 3 years ago

[This issue was originally written in reference to the beta HTTP API, but it's probably an issue even if you're just uploading protocols through the v4.x Opentrons App like normal.]

Overview

If you upload and run protocols over and over again through the beta HTTP API, the OT-2 will eventually start returning internal errors that suggest it has run out of threads.

This ticket is adapted from a report from Declan Jones, an HTTP API beta tester. (Thank you!)

Steps to reproduce

Run an infinite loop based on the HTTP API beta example code that:

  1. creates a protocol
  2. creates a session
  3. waits and checks to see if the session is in the loaded state
  4. runs the protocol when loaded
  5. waits and checks if the protocol is in the finished state
  6. deletes the session
  7. deletes the protocol
  8. waits 5 minutes
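
For anyone trying to reproduce this, the loop above could be sketched roughly as follows. This is not the beta example code itself. `RobotClient` and its method names are hypothetical placeholders for whatever HTTP calls your script makes; only the "loaded" and "finished" state names and the 5-minute wait come from the steps above. The `sleep` parameter is injectable so the loop can be exercised without real delays.

```python
import time
from typing import Callable, Optional, Protocol


class RobotClient(Protocol):
    """Hypothetical wrapper around the HTTP calls the loop needs."""

    def create_protocol(self) -> str: ...
    def create_session(self, protocol_id: str) -> str: ...
    def session_state(self, session_id: str) -> str: ...
    def run(self, session_id: str) -> None: ...
    def delete_session(self, session_id: str) -> None: ...
    def delete_protocol(self, protocol_id: str) -> None: ...


def wait_for_state(
    client: RobotClient,
    session_id: str,
    target: str,
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    # Poll until the session reports the target state (steps 3 and 5).
    while client.session_state(session_id) != target:
        sleep(poll_s)


def run_once(client: RobotClient, sleep: Callable[[float], None] = time.sleep) -> None:
    protocol_id = client.create_protocol()           # step 1
    session_id = client.create_session(protocol_id)  # step 2
    try:
        wait_for_state(client, session_id, "loaded", sleep=sleep)    # step 3
        client.run(session_id)                                       # step 4
        wait_for_state(client, session_id, "finished", sleep=sleep)  # step 5
    finally:
        # Clean up even if a step fails, so the script itself leaks nothing.
        client.delete_session(session_id)    # step 6
        client.delete_protocol(protocol_id)  # step 7


def reproduce(
    client: RobotClient,
    runs: Optional[int] = None,
    pause_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    # runs=None loops forever, as in the original report.
    completed = 0
    while runs is None or completed < runs:
        run_once(client, sleep=sleep)
        completed += 1
        sleep(pause_s)  # step 8
    return completed
```

Passing a small `runs` value and a no-op `sleep` is a quick way to check the script's own logic before leaving it running against a robot.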

Current behavior

Around the 68th run, you will start to get RuntimeErrors like:

{
    "links": {
        "self": {
            "href": "/protocols/OT-sterility-test"
        },
        "protocols": {
            "href": "/protocols"
        },
        "protocolById": {
            "href": "/protocols/{protocolId}"
        }
    },
    "data": {
        "id": "OT-sterility-test",
        "protocolFile": {
            "basename": "OT-sterility-test.py"
        },
        "supportFiles": [],
        "lastModifiedAt": "2021-07-13T07:10:36.748014+00:00",
        "createdAt": "2021-07-13T07:10:36.748053+00:00",
        "requiredEquipment": {
            "pipettes": [],
            "labware": [],
            "modules": []
        },
        "metadata": {
            "name": "None",
            "author": "None",
            "apiLevel": "2.9"
        },
        "errors": [
            {
                "type": "RuntimeError",
                "description": "“can't start new thread”"
            }
        ]
    }
}
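
If you want a reproduction script to stop as soon as this happens, something like the following could pick the error out of a response body. It is a minimal sketch that assumes the body has the same shape as the example above (`data.errors` entries with `type` and `description`). `thread_errors` is a made-up helper name, and it matches on a substring because the description arrives wrapped in extra quote characters.

```python
import json
from typing import Any, Dict, List


def thread_errors(body: str) -> List[Dict[str, Any]]:
    """Return the error entries that look like the thread exhaustion error."""
    doc = json.loads(body)
    errors = doc.get("data", {}).get("errors", [])
    return [
        err
        for err in errors
        if err.get("type") == "RuntimeError"
        # Substring match: the description is wrapped in curly quotes.
        and "can't start new thread" in err.get("description", "")
    ]
```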

Expected behavior

Since you clean up the protocol and session resources properly, you should be able to run protocols indefinitely without hitting this threading error.

amitlissack commented 3 years ago

I am pretty sure that this is caused by a ThreadManager thread leak as a result of robot_server.service.protocol.analyze._simulate_protocol.
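
To illustrate the kind of leak I mean, here is a standalone sketch. It is not the robot-server or ThreadManager code: `LeakyWorker`, `simulate`, and `thread_growth` are invented for the example, and the only assumption is that each simulation creates an object that owns a background thread. If nothing stops that thread, the process gains one live thread per run until the OS refuses to create more.

```python
import threading


class LeakyWorker:
    """Hypothetical stand-in for an object that owns a background thread."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        # The thread blocks until told to stop, like a long-lived worker loop.
        self._thread = threading.Thread(target=self._stop.wait, daemon=True)
        self._thread.start()

    def clean_up(self) -> None:
        # Signal the thread to exit and wait for it to finish.
        self._stop.set()
        self._thread.join()


def simulate(clean_up: bool) -> None:
    worker = LeakyWorker()
    # ... the simulation work would happen here ...
    if clean_up:
        worker.clean_up()
    # Without clean_up, the worker object goes away but its thread keeps running.


def thread_growth(runs: int, clean_up: bool) -> int:
    """Return how many extra live threads remain after `runs` simulations."""
    before = threading.active_count()
    for _ in range(runs):
        simulate(clean_up)
    return threading.active_count() - before
```

In this sketch, skipping the clean-up leaves one extra thread alive per run, while calling it leaves none. Logging `threading.active_count()` between runs in the real process would be one way to confirm or rule out the same pattern.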

Relates to #7302

SyntaxColoring commented 3 years ago

Broadening the scope of this issue since it doesn't seem specific to the HTTP API.