RageAgainstThePixel / com.openai.unity

A Non-Official OpenAI Rest Client for Unity (UPM)
https://openai.com
MIT License

WebGL Build Local Cache #130

Closed SebastianBlandon closed 10 months ago

SebastianBlandon commented 11 months ago

Bug Report

Overview

In WebGL builds, file storage does not work, and it is not possible to debug the process.

Test

System.IO.DirectoryNotFoundException: Could not find a part of the path "/tmp/download_cache/Alloy-20231122T105455.mp3".
  at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share, System.Int32 bufferSize, System.Boolean anonymous, System.IO.FileOptions options) [0x00000] in <00000000000000000000000000000000>:0 
--- End of stack trace from previous location where exception was thrown ---

StephenHodgson commented 11 months ago

https://github.com/RageAgainstThePixel/com.utilities.rest/issues/51

StephenHodgson commented 11 months ago

Due to the CORS policy of the image storage, a local WebGL build will receive the generated image's URL, but UnityWebRequest will not download it until the build is served from a real server rather than from localhost.

SebastianBlandon commented 11 months ago

I deployed it on a server and I get the same error. Is the problem in the WebGL build configuration?

raphik12 commented 11 months ago

I've had the same issue and solved it by streaming the audio instead of saving a local temporary file. Here's my downloading coroutine, modeled on this package's. The most important line is ((DownloadHandlerAudioClip)www.downloadHandler).streamAudio = true;

Of course, my code does extra work that this package already handles, so a fix in the package would be much shorter.

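// Needs System, System.Collections, System.Collections.Generic, System.Threading,
// UnityEngine, UnityEngine.Networking, Newtonsoft.Json, Newtonsoft.Json.Converters,
// Newtonsoft.Json.Serialization, plus this package's own namespaces.
// Note: cancellationToken is accepted but never observed in this version.
public IEnumerator CreateSpeechAsync(SpeechRequest request, Action<AudioClip> callBack, CancellationToken cancellationToken = default)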
{
    var audioFormat = request.ResponseFormat switch
    {
        SpeechResponseFormat.MP3 => AudioType.MPEG,
        _ => throw new NotSupportedException(request.ResponseFormat.ToString())
    };
    var ext = request.ResponseFormat switch
    {
        SpeechResponseFormat.MP3 => "mp3",
        _ => throw new NotSupportedException(request.ResponseFormat.ToString())
    };

    // HH is the 24-hour clock; hh (12-hour) would produce identical names 12 hours apart.
    string fileName = $"{request.Voice}-{DateTime.UtcNow:yyyyMMddTHHmmss}.{ext}";
    string speechEndpoint = "https://api.openai.com/v1/audio/speech";

    using UnityWebRequest www = new UnityWebRequest(
        speechEndpoint, "POST", (DownloadHandler) new DownloadHandlerAudioClip(speechEndpoint, audioFormat), (UploadHandler) null);
    ((DownloadHandlerAudioClip)www.downloadHandler).streamAudio = true;

    // json handling
    www.SetRequestHeader("Content-Type", "application/json");
    JsonSerializerSettings jsonSerializationOptions = new JsonSerializerSettings
    {
        NullValueHandling = NullValueHandling.Ignore,
        DefaultValueHandling = DefaultValueHandling.Ignore,
        Converters = new List<JsonConverter>
        {
            new StringEnumConverter(new SnakeCaseNamingStrategy())
        },
        ContractResolver = new EmptyToNullStringContractResolver()
    };
    string payload = JsonConvert.SerializeObject(request, jsonSerializationOptions);
    byte[] jsonBytes = System.Text.Encoding.UTF8.GetBytes(payload);
    www.uploadHandler = new UploadHandlerRaw(jsonBytes);

    // headers & parameters
    RestParameters restParameters = new(API.DefaultRequestHeaders);
    foreach (var (k,v) in restParameters.Headers)
    {
        www.SetRequestHeader(k,v);
    }
    yield return www.SendWebRequest();

    if (www.result == UnityWebRequest.Result.ConnectionError || www.result == UnityWebRequest.Result.ProtocolError)
    {
        callBack(null);
    }
    else
    {
        AudioClip audioClip = null;
        try
        {
            audioClip = DownloadHandlerAudioClip.GetContent(www);
        }
        catch (Exception ex)
        {
            Debug.LogError($"Failed to get AudioClip content: {ex.Message}");
            audioClip = null;
        }

        if (audioClip != null)
        {
            audioClip.name = fileName;

            callBack(audioClip);

            // The caller must call UnityEngine.Object.Destroy(audioClip) at some point after the callback to release the asset.
        }
        else
        {
            callBack(null);
        }
    }
}
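As a usage note for the coroutine above, here is a minimal sketch of a caller that plays the clip and later releases it, as the comment near the callback asks. The class name SpeechPlayerSketch and the speechRoutine delegate are hypothetical and not part of this package; only standard UnityEngine calls are used.

```csharp
using System;
using System.Collections;
using UnityEngine;

// Hypothetical helper (not part of the package): plays a clip delivered by a
// callback-based coroutine and destroys it once it is no longer needed.
[RequireComponent(typeof(AudioSource))]
public class SpeechPlayerSketch : MonoBehaviour
{
    private AudioClip currentClip;

    // speechRoutine wraps a coroutine such as CreateSpeechAsync; it receives
    // the callback that should be invoked with the resulting clip (or null).
    public void Play(Func<Action<AudioClip>, IEnumerator> speechRoutine)
    {
        StartCoroutine(speechRoutine(OnClipReady));
    }

    private void OnClipReady(AudioClip clip)
    {
        if (clip == null)
        {
            Debug.LogWarning("Speech request returned no audio clip.");
            return;
        }

        ReleaseClip(); // free the previous clip before replacing it
        currentClip = clip;

        var source = GetComponent<AudioSource>();
        source.clip = clip;
        source.Play();
    }

    private void ReleaseClip()
    {
        if (currentClip == null) { return; }

        GetComponent<AudioSource>().Stop();
        Destroy(currentClip); // releases the native audio asset
        currentClip = null;
    }

    private void OnDestroy() => ReleaseClip();
}
```

From the class that holds the coroutine, this could be called as player.Play(cb => CreateSpeechAsync(request, cb)); where player is a reference to the component.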

StephenHodgson commented 11 months ago

I would encourage you to update this to async/await instead of a coroutine, since the rest of the package uses it.
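For illustration, here is a rough sketch of what an async/await version of the streaming request might look like. It is not the package's implementation: the class and method names, and the url, jsonBody and apiKey parameters, are assumptions made for the example. It bridges UnityWebRequest to a Task through the operation's completed event, which avoids depending on any awaiter extension.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using UnityEngine;
using UnityEngine.Networking;

// Hypothetical sketch, not the package's API.
public static class SpeechStreamingSketch
{
    public static async Task<AudioClip> PostForAudioClipAsync(
        string url, byte[] jsonBody, string apiKey, CancellationToken cancellationToken = default)
    {
        cancellationToken.ThrowIfCancellationRequested();

        // Stream the audio instead of writing it to a local cache file.
        var downloadHandler = new DownloadHandlerAudioClip(url, AudioType.MPEG)
        {
            streamAudio = true
        };

        using var www = new UnityWebRequest(
            url, UnityWebRequest.kHttpVerbPOST, downloadHandler, new UploadHandlerRaw(jsonBody));
        www.SetRequestHeader("Content-Type", "application/json");
        www.SetRequestHeader("Authorization", $"Bearer {apiKey}");

        // Complete a Task when Unity reports that the request has finished.
        var completion = new TaskCompletionSource<bool>();
        www.SendWebRequest().completed += _ => completion.TrySetResult(true);
        await completion.Task;

        // Cancellation is only observed before and after the request here;
        // the request itself is not aborted in this sketch.
        cancellationToken.ThrowIfCancellationRequested();

        if (www.result != UnityWebRequest.Result.Success)
        {
            throw new Exception($"Speech request failed: {www.error}");
        }

        return DownloadHandlerAudioClip.GetContent(www);
    }
}
```

The caller still owns the returned AudioClip and should destroy it when it is no longer needed.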

StephenHodgson commented 11 months ago

Also worth pointing out that the base Rest library already has support for streaming MP3s in this manner.

raphik12 commented 11 months ago

In addition, this coroutine only copies part of the restParameters: the headers. It works, but it's not really robust. Again, that's because this is a quick fix modeled on this package's CreateSpeechAsync method.

raphik12 commented 11 months ago

I see that in Rest, the StreamAudioAsync method has this comment on the downloadHandler.streamAudio = true; line:

// BUG: Due to a Unity bug this is actually totally non-functional... https://forum.unity.com/threads/downloadhandleraudioclip-streamaudio-is-ignored.699908/

So I guess Rest's StreamAudioAsync doesn't work, maybe because the downloadHandler isn't initialized the same way?

StephenHodgson commented 11 months ago

It does work for MP3s, and the implementation is, for the most part, the same as your example above.

It does NOT work for MP3s of unknown length, which was impacting the ElevenLabs plugin.

StephenHodgson commented 11 months ago

Anyway, the real fix needs to be done in the base Rest package, as the cache mechanism implementation is currently a no-op.

This impacts image generation as well.