guidance-ai / guidance

A guidance language for controlling large language models.

attribute caching not thread-safe #1031

Open mmoskal opened 5 days ago

mmoskal commented 5 days ago

The bug

When I run guidance on multiple schemas in multiple threads, I get an exception.

To Reproduce

import guidance
import json
from concurrent.futures import ThreadPoolExecutor, as_completed

schema = """
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://json.schemastore.org/bpkg.json",
  "properties": {
    "name": {
      "description": "Where the dependency is located in `deps/`.\\n\\nSee more: https://github.com/bpkg/bpkg#name",
      "type": "string",
      "default": ""
    },
    "version": {
      "description": "The current version of the dependency.\\n\\nSee more: https://github.com/bpkg/bpkg#version-optional",
      "type": "string",
      "default": "v0.1.0"
    },
    "description": {
      "description": "Human-readable description of the functionality of the package.\\n\\nSee more: https://github.com/bpkg/bpkg#description",
      "type": "string",
      "examples": ["Terminal utility functions"]
    },
    "global": {
      "type": "string",
      "default": "",
      "description": "Whether the package is only intended be installed as a global script. Allows the omission of the `--global` flag when installing.\\n\\nSee more: https://github.com/bpkg/bpkg#global",
      "examples": ["true"]
    },
    "install": {
      "type": "string",
      "description": "Shell script used to invoke in the install script. Required if package is being installed as a global script.\\n\\nSee more: https://github.com/bpkg/bpkg#install-1",
      "default": "make install",
      "examples": ["make install"]
    },
    "scripts": {
      "description": "An array of scripts to install into a project. See more: https://github.com/bpkg/bpkg#scripts",
      "type": "array",
      "items": {
        "type": "string",
        "examples": ["script.sh"]
      }
    },
    "files": {
      "description": "An array of non-script files to install into a project. See more: https://github.com/bpkg/bpkg#files-optional",
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "dependencies": {
      "description": "Hash of dependencies of this project. Use either a tagged release identifier or `master`.\\n\\nSee more: https://github.com/bpkg/bpkg#dependencies-optional",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "dependencies-dev": {
      "description": "Hash of development dependencies of this project. Use either a tagged release identifier or `master`.\\n\\nSee more: https://github.com/bpkg/bpkg#dependencies-dev-optional",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "commands": {
      "description": "A hash of runnable commands for `bpkg run`.\\n\\nSee more: https://github.com/bpkg/bpkg#commands-optional",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "commands-description": {
      "description": "A hash of descriptions for each command in `commands`.\\n\\nSee more: https://github.com/bpkg/bpkg#commands-description-optional",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    }
  },
  "required": ["name", "description", "global", "install", "scripts"],
  "type": "object"
}
"""

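def run_one(schema):
    # Build the JSON grammar and serialize it; this is the call that fails under threads.
    return guidance.json("hi", schema=schema).ll_serialize()

def main():
    print("Starting")
    n_workers = 10
    j = json.loads(schema)
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        # Submit the same parsed schema 100 times across the worker threads.
        all_schemas = [j] * 100
        futures = [executor.submit(run_one, schema) for schema in all_schemas]

        for future in as_completed(futures):
            # result() re-raises any exception raised in the worker thread.
            res = future.result()
            print(len(json.dumps(res)))

main()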

The result is:

Starting
9775
9775
9775
...
9775
9775
Traceback (most recent call last):
  File "/Users/mimoskal/ai/guidance-ws/tmp/fail.py", line 102, in <module>
    main()
  File "/Users/mimoskal/ai/guidance-ws/tmp/fail.py", line 98, in main
    res = future.result()
          ^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/llg/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/llg/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/opt/homebrew/Caskroom/miniconda/base/envs/llg/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mimoskal/ai/guidance-ws/tmp/fail.py", line 87, in run_one
    return guidance.json("hi", schema=schema).ll_serialize()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mimoskal/ai/guidance-ws/guidance/guidance/_grammar.py", line 228, in ll_serialize
    return {"grammars": LLSerializer().run(self)}
                        ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mimoskal/ai/guidance-ws/guidance/guidance/_grammar.py", line 1178, in run
    self.run_grammar(grammar.body)
  File "/Users/mimoskal/ai/guidance-ws/guidance/guidance/_grammar.py", line 1163, in run_grammar
    self.process(node)
  File "/Users/mimoskal/ai/guidance-ws/guidance/guidance/_grammar.py", line 1137, in process
    if node.value is None:
       ^^^^^^^^^^
  File "/Users/mimoskal/ai/guidance-ws/guidance/guidance/_grammar.py", line 262, in value
    raise ValueError("DeferredReference does not have a value yet")
ValueError: DeferredReference does not have a value yet
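
A possible stopgap until this is fixed, sketched under the assumption that the race sits in shared state touched while the grammar is built and serialized (this thread does not confirm that): run the non-thread-safe call under a single lock. `run_serialized` and `FakeBuilder` are hypothetical names for illustration; `FakeBuilder` stands in for `guidance.json("hi", schema=schema).ll_serialize()` so the sketch runs without guidance installed.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helper: one process-wide lock around the non-thread-safe call.
_build_lock = threading.Lock()

def run_serialized(fn, *args, **kwargs):
    """Call fn while holding the lock, so at most one thread is inside it."""
    with _build_lock:
        return fn(*args, **kwargs)

class FakeBuilder:
    """Stand-in for the grammar build; tracks how many threads are inside it."""

    def __init__(self):
        self.active = 0
        self.max_active = 0

    def __call__(self, schema):
        # These counters are only safe because callers hold _build_lock.
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        time.sleep(0.001)  # widen the window in which threads could overlap
        self.active -= 1
        return len(schema)

def demo(n_workers=10, n_tasks=50):
    builder = FakeBuilder()
    schema = {"type": "object"}
    with ThreadPoolExecutor(max_workers=n_workers) as executor:
        futures = [
            executor.submit(run_serialized, builder, schema)
            for _ in range(n_tasks)
        ]
        results = [f.result() for f in futures]
    return results, builder.max_active

if __name__ == "__main__":
    results, max_active = demo()
    print(len(results), max_active)
```

In the repro above, this would mean submitting `run_serialized` with `run_one` and the schema as arguments instead of submitting `run_one` directly. It gives up parallelism for that call, so it only works around the problem and does not fix the caching itself.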

cc @hudson-ai

hudson-ai commented 5 days ago

Thanks @mmoskal, will look into it