10up / classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence.
https://classifaiplugin.com
GNU General Public License v2.0

Update the Azure APIs to their latest versions #553

Closed by dkotter 8 months ago

dkotter commented 1 year ago

Is your enhancement related to a problem? Please describe.

Ideally we should be looking to update any APIs we use to their latest versions on a regular basis. This issue is focused on any Azure APIs we use. The following is a list of the APIs we are using and the version.

For the Personalizer API, v1.0 is the latest (though there is a v1.1 in preview), so nothing is needed there. The same goes for our Text to Speech API: we are already using the latest version.

The Analyze Image, OCR, Read and Generate Thumbnail APIs are all under the same service (previously known as Cognitive Services Computer Vision, since renamed to Azure AI Vision). The latest released version of this API is v3.2, while there is a v4.0 public preview API.

Azure is pushing for everyone to use the new v4.0 public preview API, but in researching this, I found some limitations that may hold us back. For instance, image captioning and smart cropping are only available in a small set of regions in v4.0 (East US, France Central, Korea Central, North Europe, Southeast Asia, West Europe, West US, and East Asia).
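To make the region limitation concrete, here is a minimal sketch of a check we could run before offering v4.0 captions or smart cropping. The region list is just the snapshot quoted above and may well have changed since; the function names and the normalization of display names to short identifiers are assumptions for illustration, not existing ClassifAI code.

```python
# Regions listed in this issue as supporting captions and smart cropping in v4.0.
# This is a snapshot taken from the issue text, not an authoritative list.
V4_CAPTION_REGIONS = frozenset({
    "eastus", "francecentral", "koreacentral", "northeurope",
    "southeastasia", "westeurope", "westus", "eastasia",
})


def normalize_region(region: str) -> str:
    """Turn 'East US' or 'east-us' into the short form 'eastus' (hypothetical helper)."""
    return region.replace(" ", "").replace("-", "").lower()


def supports_v4_captions(region: str) -> bool:
    """Return True if the region is in the snapshot list above."""
    return normalize_region(region) in V4_CAPTION_REGIONS


if __name__ == "__main__":
    for name in ("East US", "westeurope", "Central US"):
        print(name, supports_v4_captions(name))
```

A check like this would let us fall back to v3.2 (or hide the feature) for sites whose Azure resource lives outside the supported regions.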

There have also been quite a few changes to these APIs in v4.0, so it will take some refactoring if we pursue these updates. For instance, all existing features we use, outside of reading content from PDFs, are now under a single Analyze API in v4.0. This will require some changes to how our code works.

That said, assuming we're okay with the region limitations, I'd like to pursue updating all of those to v4.0. If we're not okay with that, I think it would be ideal to get all of those on v3.2 (so just Analyze Image and Generate Thumbnail).

I tried updating to v3.2 of the Analyze Image API. The results seem good, but the confidence scores, at least for image captions, are lower, so we would need to decide how best to handle that (judging by the Vision Studio tool, this seems to have been fixed in v4.0). Their docs even mention:

In general, we advise a confidence threshold of 0.4 for the Image Analysis 3.2 API and of 0.0 for the Image Analysis 4.0 API (preview).
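As a rough sketch of how the advice quoted above could translate into code, the snippet below picks a caption threshold per API version and filters captions against it. The two threshold values come from the quoted docs; everything else (the function names, captions as `(text, confidence)` tuples, scores on a 0 to 1 scale) is an assumption for illustration and not how ClassifAI currently stores things.

```python
from typing import List, Optional, Tuple

# Thresholds from the Azure docs quote above; other versions are not covered.
RECOMMENDED_THRESHOLDS = {"3.2": 0.4, "4.0": 0.0}

Caption = Tuple[str, float]  # (text, confidence), a hypothetical shape


def recommended_threshold(api_version: str) -> float:
    """Look up the advised threshold, accepting '3.2' or 'v3.2'."""
    key = api_version.lstrip("v")
    if key not in RECOMMENDED_THRESHOLDS:
        raise ValueError(f"No recommended threshold for version {api_version!r}")
    return RECOMMENDED_THRESHOLDS[key]


def filter_captions(
    captions: List[Caption],
    api_version: str,
    user_threshold: Optional[float] = None,
) -> List[Caption]:
    """Keep captions at or above the user's threshold, else the advised one."""
    threshold = (
        user_threshold
        if user_threshold is not None
        else recommended_threshold(api_version)
    )
    return [c for c in captions if c[1] >= threshold]


if __name__ == "__main__":
    sample = [("a cat on a sofa", 0.35), ("a dog in a park", 0.62)]
    print(filter_captions(sample, "3.2"))
    print(filter_captions(sample, "v4.0"))
```

Keeping the user's own setting as an override means existing sites would not silently change behaviour; the advised value only applies when nothing has been configured.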

If we decide to update to v4.0, here are the tasks as I see them:

If we stick with v3.2, here's what we'll want to do:

kmgalanakis commented 1 year ago

@jeffpaul what should be our decision here? Move to v4.0 or stick to v3.2?

cc @dkotter

kmgalanakis commented 1 year ago

I've created a draft PR for this at https://github.com/10up/classifai/pull/559.

I verified that the confidence scores are lower. Judging by the tests I did, a threshold between 0.5 and 0.55 worked best for me. As far as the lower confidence scores are concerned, I mostly see it as a matter of personal preference.

As a consequence, I would suggest that we leave the default value of the threshold option as is and display a dismissible notification when we detect that the API version is greater than or equal to 3.2 and the selected confidence threshold is above 0.5-0.55.
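The condition for that notification could look something like the sketch below. It assumes thresholds are stored on a 0 to 1 scale, that versions are plain dotted numbers such as "3.2", and it picks 0.5 as the cutoff from the 0.5-0.55 range mentioned above; all names here are hypothetical and the real check would live in the plugin's settings code.

```python
from typing import Tuple

# Cutoff picked from the 0.5-0.55 range suggested above (an assumption).
NOTICE_CUTOFF = 0.5
MIN_VERSION = (3, 2)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse '3.2' or 'v3.2' into (3, 2). Suffixes like '-preview' are not handled."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def should_show_threshold_notice(
    api_version: str,
    threshold: float,
    dismissed: bool = False,
    cutoff: float = NOTICE_CUTOFF,
) -> bool:
    """Show the notice only on v3.2+ with a threshold above the cutoff,
    and only if the user has not already dismissed it."""
    if dismissed:
        return False
    return parse_version(api_version) >= MIN_VERSION and threshold > cutoff


if __name__ == "__main__":
    print(should_show_threshold_notice("3.2", 0.75))
    print(should_show_threshold_notice("3.1", 0.75))
    print(should_show_threshold_notice("3.2", 0.4))
```

Comparing parsed tuples rather than raw strings avoids surprises such as "3.10" sorting before "3.2".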

I tried to create another PR that updates the APIs to version 4.0, but I found it too difficult, since I'm not that familiar with the codebase and, from what I saw, the endpoints have changed.

jeffpaul commented 1 year ago

I received an email from Microsoft Azure saying that the Computer Vision 3.1 API will be retired on 13 September 2026 and that we should migrate our computer vision workloads to the Computer Vision 3.2 API, with these benefits:

Seems like we're well along on that path, but we should keep on top of the APIs ClassifAI uses and update their versions more regularly, so we stay as current as feasibly possible.