cs-util-com / cscore

cscore is a minimal-footprint library providing commonly used helpers & patterns for your C# projects. It can be used in both pure C# and Unity projects.
https://cs-util-com.github.io/cscore/
Apache License 2.0

OpenAI API Extensions #107

Closed cs-util closed 2 months ago

cs-util commented 11 months ago

Add support for OpenAI's latest features, e.g.:

GPT4V Image Upload support

  1. First use the already existing DALL·E API to generate an image (e.g. “of a dog”)
  2. Then use the GPT4V API ( https://platform.openai.com/docs/guides/vision ) to upload that image again and assert that the word “dog” is contained in the response
  3. The already added JSON response support (see usage in ExampleUsage4_ChatGptJsonResponses) should be used for a few extended demo unit tests where the response from the image analysis is put into structured JSON, e.g. an image analysis class that contains labels, maybe even bounding boxes etc.
     3.1. E.g.: "Generate a dog with a cowboy hat" and the structured response should contain both the labels dog and cowboy hat
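A structured image analysis response as described in step 3 could be sketched as follows. All class and property names here (`ImageAnalysis`, `DetectedObject`, `BoundingBox`, `ImageAnalysisParser`) are hypothetical and not part of cscore, the JSON shape is an assumption, and `System.Text.Json` is used only to keep the sketch self-contained.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical target class for the structured JSON response (not a cscore type).
public class ImageAnalysis {
    public List<DetectedObject> Objects { get; set; } = new List<DetectedObject>();
}

public class DetectedObject {
    public string Label { get; set; } = "";
    // Optional box in relative image coordinates (0..1); null if the model returns none.
    public BoundingBox? Box { get; set; }
}

public class BoundingBox {
    public double X { get; set; }
    public double Y { get; set; }
    public double Width { get; set; }
    public double Height { get; set; }
}

public static class ImageAnalysisParser {
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    // Parses the JSON text returned by the model into the typed class.
    public static ImageAnalysis Parse(string json) {
        return JsonSerializer.Deserialize<ImageAnalysis>(json, Options)
            ?? throw new JsonException("Response was JSON null");
    }
}
```

A demo test for the "dog with a cowboy hat" prompt could then assert that the `Label` values of the parsed `Objects` include both `dog` and `cowboy hat`.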

TTS & STT Combined in a single xUnit test to make use of each other:

  1. Text to speech: See https://platform.openai.com/docs/guides/text-to-speech and https://platform.openai.com/docs/models/tts
     1.1. Stores the audio as a file on disk, logs the path to it (for the developer to take a look) and uses it for the next step
  2. Speech to text via https://platform.openai.com/docs/guides/speech-to-text
     2.1. Takes the audio file from step 1 as input and converts it back to text
     2.2. Asserts that the original input text from step 1 is the same as the result (this verifies that upload, download, storage etc. all work correctly)
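The round trip above could be sketched roughly like this. The two delegates stand in for the real TTS and STT calls, whose actual signatures are not specified in this issue, and `SpeechRoundTrip`, `RoundTripMatches` and `Normalize` are hypothetical names. The normalization step is an assumption added here in case the transcript differs from the input in casing or punctuation.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class SpeechRoundTrip {
    // Lowercases and drops punctuation and extra whitespace,
    // so that "Hello, World!" and "hello world" compare as equal.
    public static string Normalize(string text) {
        var kept = text.ToLowerInvariant().Select(c => char.IsLetterOrDigit(c) ? c : ' ');
        var words = new string(kept.ToArray())
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", words);
    }

    // textToSpeech: input text -> audio bytes (placeholder for the TTS request).
    // speechToText: path of audio file -> transcript (placeholder for the STT request).
    public static async Task<bool> RoundTripMatches(
        string inputText,
        Func<string, Task<byte[]>> textToSpeech,
        Func<string, Task<string>> speechToText,
        string audioFilePath) {
        byte[] audio = await textToSpeech(inputText);
        await File.WriteAllBytesAsync(audioFilePath, audio);
        // Log the path so the developer can listen to the generated file.
        Console.WriteLine("Audio stored at: " + Path.GetFullPath(audioFilePath));
        string transcript = await speechToText(audioFilePath);
        return Normalize(transcript) == Normalize(inputText);
    }
}
```

In an xUnit test the result of `RoundTripMatches` would be passed to `Assert.True`. If an exact match is required as stated in step 2.2, the `Normalize` calls can be dropped.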

Implementing handling of Streaming responses

Not only for the TTS endpoint but also for the normal chat completions endpoints (that are already implemented, see e.g. test ExampleUsage3_ChatGpt), this would be useful for real-time user interactions. The documentation on streaming seems to be easy to understand, see e.g.: https://platform.openai.com/docs/api-reference/chat/create
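As a minimal sketch of the parsing side: with `stream: true`, the chat completions endpoint sends server-sent events whose `data:` lines each carry a JSON chunk and which end with `data: [DONE]`. The helper below extracts the text deltas from such lines. `ChatStreamParser` and `ReadContentDeltas` are hypothetical names, and reading the lines from the HTTP response stream is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ChatStreamParser {
    private const string DataPrefix = "data:";

    // Yields the content fragments of a streamed chat completion, in order.
    public static IEnumerable<string> ReadContentDeltas(IEnumerable<string> lines) {
        foreach (var line in lines) {
            // Skip blank separator lines and anything that is not a data field.
            if (!line.StartsWith(DataPrefix, StringComparison.Ordinal)) { continue; }
            var payload = line.Substring(DataPrefix.Length).Trim();
            if (payload == "[DONE]") { yield break; } // end-of-stream marker
            using (var doc = JsonDocument.Parse(payload)) {
                var choices = doc.RootElement.GetProperty("choices");
                if (choices.GetArrayLength() == 0) { continue; }
                var delta = choices[0].GetProperty("delta");
                // The first chunk usually only carries the role, so content may be missing.
                if (delta.TryGetProperty("content", out var content)
                    && content.ValueKind == JsonValueKind.String) {
                    yield return content.GetString() ?? "";
                }
            }
        }
    }
}
```

Because the method is a lazy iterator, a UI could append each fragment to the visible answer as soon as the corresponding line arrives.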

Extending the json response examples

The structured responses now make more advanced agents possible. For example, build a code generator demo based on the following structured JSON responses:

  1. An action class AddCSharpCodeFile that contains fields such as projectFilePath, fileContent
  2. An action class EditCSharpCodeFile that also contains references to the parts of the file that should be replaced (maybe in the form of line numbers or method names?)
  3. An action class AddUnitTestFile that contains fields such as CodeFileContentToTest
  4. Actions to generate the needed .csproj files and also the outer .sln file
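The first three actions could be sketched as plain classes like the ones below. Only the class names and the fields `projectFilePath`, `fileContent` and `CodeFileContentToTest` come from the list above. The line-number fields, the `ApplyTo` helper, the `ActionParser` class and the `{"action": ..., "args": ...}` envelope are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public class AddCSharpCodeFile {
    public string ProjectFilePath { get; set; } = "";
    public string FileContent { get; set; } = "";
}

public class EditCSharpCodeFile {
    public string ProjectFilePath { get; set; } = "";
    // Assumption: the part to replace is referenced by 1-based, inclusive line numbers.
    public int FirstLine { get; set; }
    public int LastLine { get; set; }
    public string ReplacementContent { get; set; } = "";

    // Returns the file content with lines FirstLine..LastLine replaced.
    public string ApplyTo(string originalContent) {
        var lines = new List<string>(originalContent.Split('\n'));
        lines.RemoveRange(FirstLine - 1, LastLine - FirstLine + 1);
        lines.Insert(FirstLine - 1, ReplacementContent);
        return string.Join("\n", lines);
    }
}

public class AddUnitTestFile {
    public string ProjectFilePath { get; set; } = "";
    public string CodeFileContentToTest { get; set; } = "";
}

public static class ActionParser {
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    // Assumed envelope: { "action": "<class name>", "args": { ...fields... } }
    public static object Parse(string json) {
        using (var doc = JsonDocument.Parse(json)) {
            string name = doc.RootElement.GetProperty("action").GetString() ?? "";
            string args = doc.RootElement.GetProperty("args").GetRawText();
            switch (name) {
                case nameof(AddCSharpCodeFile): return Deserialize<AddCSharpCodeFile>(args);
                case nameof(EditCSharpCodeFile): return Deserialize<EditCSharpCodeFile>(args);
                case nameof(AddUnitTestFile): return Deserialize<AddUnitTestFile>(args);
                default: throw new NotSupportedException("Unknown action: " + name);
            }
        }
    }

    private static T Deserialize<T>(string json) where T : class {
        return JsonSerializer.Deserialize<T>(json, Options)
            ?? throw new JsonException("Action arguments were JSON null");
    }
}
```

Referencing method names instead of line numbers would need a syntax-aware lookup and is not covered by this sketch.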

Here it will be important to ensure that the compilation of the generated code happens in a sandbox, so that the generated code can't accidentally delete all data on the developer's disk etc. Automatic compilation of this code could be another interesting action to add, but running the code or the generated xUnit tests should only be done if such a sandbox is possible somehow. Maybe by wrapping everything into a virtual container?
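One possible shape for the container idea, assuming Docker is installed on the developer machine: mount the generated project read-only, copy it into the container and run `dotnet test` there. `SandboxedTestRun` and `BuildDockerCommand` are hypothetical names, the default image tag is an assumption, and this sketch only builds the command without starting it.

```csharp
using System.Diagnostics;

public static class SandboxedTestRun {
    // Builds a "docker run" invocation that runs the generated tests inside a container.
    public static ProcessStartInfo BuildDockerCommand(
        string hostProjectDir,
        string image = "mcr.microsoft.com/dotnet/sdk:8.0") {
        var info = new ProcessStartInfo("docker") { UseShellExecute = false };
        var args = new[] {
            "run",
            "--rm",                             // remove the container after the run
            "--memory", "1g",                   // cap memory used by generated code
            "-v", hostProjectDir + ":/src:ro",  // host files are mounted read-only
            "-w", "/work",                      // build in a directory inside the container
            image,
            "sh", "-c", "cp -r /src/. /work && dotnet test"
        };
        foreach (var arg in args) {
            info.ArgumentList.Add(arg);
        }
        return info;
    }
}
```

The returned `ProcessStartInfo` can be passed to `Process.Start`. The container keeps network access here because restoring NuGet packages needs it, so restricting the network further would require pre-restored packages and is left open.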

If the automatic execution of the generated unit tests in such a container becomes possible, then the next step would be to run Stryker.NET (mutation testing) to check how completely and consistently these tests cover the generated code. This should help ensure that the code and xUnit tests converge to a result that is correct in the sense that it fulfills the input requirements.