Open joshuaabenazer opened 1 year ago
I did some digging into batch synthesis and have one concern with how it works.
Creating a batch synthesis is straightforward: call the /batchsynthesis endpoint with the input data, and that starts it. However, this endpoint does not return the audio data. It returns a JSON object with some information about the batch; one of its properties is the batch ID.
To get the audio data, we are required to call /batchsynthesis/batchId periodically to check whether the audio has been created for that batch. This endpoint returns the URL of the ZIP file that will contain our audio file.
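The create-then-poll flow described above could be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: `poll_batch` and `fetch_status` are hypothetical names, and the `status` values and the `outputs.result` field holding the ZIP URL are assumptions about the response shape that should be checked against the API documentation. The HTTP call is injected so the loop can be exercised without a network.

```python
import time
from typing import Callable, Optional


def poll_batch(
    fetch_status: Callable[[str], dict],
    batch_id: str,
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll a batch until it finishes; return the ZIP URL, or None on timeout.

    fetch_status is a hypothetical callable that performs the
    GET /batchsynthesis/<batch id> request and returns the decoded JSON body.
    """
    for attempt in range(max_attempts):
        body = fetch_status(batch_id)
        status = body.get("status")  # assumed field name
        if status == "Succeeded":  # assumed status value
            return body["outputs"]["result"]  # assumed location of the ZIP URL
        if status == "Failed":  # assumed status value
            raise RuntimeError(f"batch {batch_id} failed")
        # Still in progress: wait before asking again, except after the last try.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return None
```

A caller would pass a function that wraps the real HTTP request; returning `None` on timeout leaves it to the caller to decide whether to retry later or report an error.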
A few approaches that we can take:
Going by @joshuaabenazer's points, we could do (1) and, depending on the estimated duration of the audio file, conditionally use the existing TTS REST API or the Batch Synthesis API.
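As a rough sketch of that conditional choice, the snippet below estimates audio duration from the word count and picks an API accordingly. The function names, the 150 words-per-minute speaking rate, and the 600-second cutoff are all placeholder assumptions for illustration; the real threshold would need to come from the service's documented limits and from testing.

```python
def estimate_audio_seconds(text: str, words_per_minute: float = 150.0) -> float:
    """Very rough duration estimate based on whitespace-separated word count."""
    words = len(text.split())
    return words / words_per_minute * 60.0


def choose_api(
    text: str,
    max_realtime_seconds: float = 600.0,  # placeholder cutoff, not a confirmed limit
    words_per_minute: float = 150.0,      # placeholder speaking rate
) -> str:
    """Return "rest" for short content and "batch" for content estimated to run long."""
    if estimate_audio_seconds(text, words_per_minute) <= max_realtime_seconds:
        return "rest"
    return "batch"
```

With these placeholder values, a post of about 1500 words is estimated at 600 seconds, which sits right at the cutoff, so the threshold probably wants some safety margin in practice.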
@10up/open-source-practice any thoughts on this?
I don't think we should build something that requires a user to stay on a particular page until the process completes. Option 3 sounds the best to me, where the audio process is kicked off once the content is published and the processing happens behind the scenes, so it doesn't matter whether someone leaves the post or not. It would be great if the UI still updates automatically if someone happens to stay on the post screen the entire time.
I believe we do something similar right now with our PDF read functionality, so it may be worth looking at that. I also know I've seen other APIs that require polling an endpoint to see if the process is complete, so having a solution that can be reused would be great.
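One way a reusable, background-friendly poller might look is a single "check once and reschedule" step that a scheduled task calls repeatedly, so nothing blocks while waiting. The sketch below is only an illustration under assumptions: `PollJob`, `run_check`, the delay values, and the `check` callback are hypothetical, and the scheduler that invokes it and stores the job between runs is out of scope here.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class PollJob:
    job_id: str
    attempts: int = 0
    next_run_at: float = 0.0       # timestamp before which the job should not be checked
    state: str = "pending"         # "pending", "done", or "failed"
    result: Optional[str] = None   # e.g. the URL of the finished file


def run_check(
    job: PollJob,
    check: Callable[[str], Tuple[bool, Optional[str]]],
    now: float,
    base_delay: float = 30.0,
    max_delay: float = 600.0,
    max_attempts: int = 20,
) -> PollJob:
    """Perform at most one remote check and return the updated job record.

    check is a hypothetical callback returning (finished, result) for a job id,
    which keeps this step independent of any particular API.
    """
    if job.state != "pending" or now < job.next_run_at:
        return job  # nothing to do yet
    finished, result = check(job.job_id)
    attempts = job.attempts + 1
    if finished:
        return replace(job, attempts=attempts, state="done", result=result)
    if attempts >= max_attempts:
        return replace(job, attempts=attempts, state="failed")
    # Exponential backoff, capped at max_delay, before the next check.
    delay = min(base_delay * 2 ** (attempts - 1), max_delay)
    return replace(job, attempts=attempts, next_run_at=now + delay)
```

Because the job record carries all of its own state, it can be persisted between runs, and the same step could serve any API that requires polling. The post screen could read the stored state to update the UI if the user is still there.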
I agree with the 3rd point, as users do not have to wait with that approach (as @dkotter mentioned). I would also suggest one additional aspect to consider: we should notify users via email once their file is ready and attached to the post. This eliminates the need for users to keep returning to the post to check whether the file is ready.
Describe the bug
Possible solutions:
Steps to Reproduce
Test audio generation on a post with lengthy content, at least around 1300-1500 words, and notice that the audio generation process starts but then ends without any notification or error, and without generating the audio file.
Screenshots, screen recording, code snippet
No response
Environment information
No response
WordPress information
No response