openai / openai-python

The official Python library for the OpenAI API
https://pypi.org/project/openai/
Apache License 2.0
22.49k stars, 3.13k forks

Set `jiter` as optional dependency to support `pyodide` (~3 lines diff) #1782

Open CNSeniorious000 opened 1 week ago

CNSeniorious000 commented 1 week ago

Confirm this is a feature request for the Python library and not the underlying OpenAI API.

Describe the feature or improvement you're requesting

Pyodide currently doesn't support jiter. openai-python uses it for partial JSON parsing, but only in 2 lines.

If we move jiter into optional-dependencies, we will be able to use openai-python in the Pyodide runtime.

In the past, httpx also blocked openai from running in Pyodide, but that issue has already been resolved. jiter is now the only barrier.
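To make the idea concrete, here is a minimal sketch of how an optional jiter could be guarded at the call site. `parse_partial_json` is a hypothetical helper name, not something that exists in openai-python, and falling back to the standard `json` module (which only accepts complete documents) is an assumption made purely for illustration.

```python
import json
from typing import Any


def parse_partial_json(data: bytes) -> Any:
    """Parse possibly incomplete JSON, using jiter only when it is installed.

    Hypothetical helper: the name and the fallback are assumptions.
    """
    try:
        # Imported lazily so that merely importing this module never needs jiter.
        import jiter
    except ImportError:
        # Assumed fallback: json.loads handles complete documents only and
        # raises json.JSONDecodeError on truncated input.
        return json.loads(data)
    return jiter.from_json(data, partial_mode=True)
```

With something like this, environments without jiter would lose partial parsing but could still import and use the rest of the library.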

About openai, pyodide and httpx

I've checked these issues:

- #815
- #960

At that time, openai was not compatible with Pyodide because of `httpx`. Now there is even a [`pyodide-httpx`](https://pypi.org/project/pyodide-httpx/) package to patch httpx in Pyodide.

If we can use openai in Pyodide, it will be possible to provide interactive Python demos in the browser for prompt engineering frameworks, which I think would be a great feature to have.

Additional context

Another option is to use a different package to parse partial JSON. There is a package called partial-json-parser that does almost the same job as jiter.from_json, but also offers more flexibility in specifying which types are allowed to be incomplete. It keeps types too. For the latter, let me present an example:

from jiter import from_json
from_json(b'{"a": [1', partial_mode=True)  # {'a': [1]}
from_json(b'{"a": [1.', partial_mode=True)  # {'a': []}

In the example above, more tokens have arrived, yet the previously parsed value has disappeared.
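For illustration, here is a stdlib-only sketch of a prefix parser that never drops an already-parsed value when more characters arrive. It is not how jiter or partial-json-parser work internally; `close_open_json` and `parse_prefix` are hypothetical names, and the retry-by-trimming strategy is a simplification chosen for brevity, not performance.

```python
import json
from typing import Any


def close_open_json(prefix: str) -> str:
    """Append the quote and brackets needed to close whatever `prefix` left open."""
    closers = []
    in_string = False
    escaped = False
    for ch in prefix:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()
    tail = '"' if in_string else ""
    return prefix + tail + "".join(reversed(closers))


def parse_prefix(prefix: str) -> Any:
    """Parse the longest leading part of `prefix` that can be closed into valid JSON."""
    while prefix:
        try:
            return json.loads(close_open_json(prefix))
        except json.JSONDecodeError:
            # Drop the last character (e.g. a dangling '.') and retry.
            prefix = prefix[:-1]
    return None


print(parse_prefix('{"a": [1'))   # {'a': [1]}
print(parse_prefix('{"a": [1.'))  # {'a': [1]}
```

Here the dangling `.` is trimmed rather than causing the whole number to be discarded, so the result for the longer input still contains the `1`.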

Plus, partial-json-parser's API is consistent among its Python/JavaScript/Go implementations.

I made a quick attempt at replacing jiter with partial_json_parser:

https://github.com/openai/openai-python/commit/7419b7059f5e024aa9b87942b47ecff80a6b32b5#diff-08dc4c3c3e8e145eec1fd0b6a4577f0bce73567d4da3460e08dd4c2d34b27915

RobertCraigie commented 1 week ago

Thanks for the report. Would it be enough to just lazily import jiter instead? Or does simply listing it in dependencies cause issues?

Additionally, have you opened an issue with jiter to see if the pydantic team can do anything to make it Pyodide compatible? I'm sure they'd be interested in making that work.

CNSeniorious000 commented 1 week ago

Listing it in dependencies would still cause issues, because installing openai also tries to install its dependencies, and jiter is a non-optional dependency of openai. Pyodide only supports pure-Python wheels and Emscripten wheels, and jiter provides neither, so resolving jiter fails, which in turn causes resolving openai to fail.

Thanks for the advice. I've opened an issue with jiter:
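As a rough illustration of the wheel constraint, the sketch below inspects only the platform tag at the end of a wheel filename. `is_pyodide_installable` is a hypothetical helper, the filenames are made up, and treating `any`, `emscripten_*`, and `pyodide_*` as the acceptable platform tags is a simplifying assumption; a real resolver also checks the Python and ABI tags.

```python
def is_pyodide_installable(wheel_filename: str) -> bool:
    """Return True if the wheel's platform tag looks usable in Pyodide.

    Simplified sketch: only the platform tag is inspected.
    """
    stem = wheel_filename.removesuffix(".whl")
    # The platform tag is the last dash-separated field of a wheel filename;
    # compressed tag sets are joined with dots.
    platform_tags = stem.rsplit("-", 1)[-1].split(".")
    return any(
        tag == "any" or tag.startswith(("emscripten_", "pyodide_"))
        for tag in platform_tags
    )


# Hypothetical filenames, for illustration only.
print(is_pyodide_installable("example_pkg-1.0-py3-none-any.whl"))  # True
print(is_pyodide_installable(
    "example_pkg-1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
))  # False
```

A package that publishes only platform-specific native wheels, with no source-free pure-Python or Emscripten build, fails this check, which is the situation described above for jiter.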

anointingmayami commented 1 week ago

This is great.

Integrating with Pyodide would allow the OpenAI library to be used in web applications without needing a backend server. This could open up new opportunities for educational tools, interactive demos, and user-driven applications that leverage the OpenAI API.

The innovation could provide significant benefits, especially for web-based applications, but it requires a thoughtful approach to assess compatibility, potential costs, and benefits. The total cost will vary based on the project's scope, the existing codebase's complexity, and the resources available for development. Planning and phased implementation may be beneficial to manage these efforts effectively.

Give us some time to review this update.

Furthermore, could you specify how you would like to use Pyodide with OpenAI?

CNSeniorious000 commented 1 week ago

There are some personal factors: I am more familiar with Python, so most of my prompt engineering is written in Python. I am working on a platform for educational tools and interactive demos, so I integrate with Pyodide a lot. Another reason is the ability to use Pyodide as a code interpreter, providing it as a tool for the LLM. (Here is a prototype)

I am also planning to provide interactive docs for every supported Python library, with an LLM copilot built in, which is a long-term goal.

anointingmayami commented 1 week ago

This happens to be the engagement we are looking for in this AI generation. According to the report released by IBM Seven Bets, we are considering Gen AI as the optimal breakthrough in AI commercialization to build sustainability and profitability using the Data to People (D2P) model.