Azure / azure-rest-api-specs

The source for REST API specifications for Microsoft Azure.

[Speech Services - Speech Analytics] API Review #28522

Open azure-sdk opened 6 months ago

azure-sdk commented 6 months ago

New API Review meeting has been requested.

Service Name: Speech Services - Speech Analytics
Review Created By: Nate Ko
Review Date: 04/02/2024 01:00 PM PT
Release Plan:
PR: https://github.com/Azure/azure-rest-api-specs/pull/28521
Hero Scenarios Link: Not Provided
Core Concepts Doc Link: here

Description: The Ingestion Service is a new feature of Speech Services. Customers can use it to register a storage account so that files are processed automatically as they are added to their blob storage account. Processing currently includes transcription of the audio files and a post-analytics call via webhook (e.g., a PromptFlow online endpoint).

The service adds a /registrations API where the customer configures their storage account, transcription behavior, and the webhook endpoint for post-analytics. For authentication, the API supports a Cognitive Services key and token. Before registering, the customer needs to enable a managed identity (MI) on the Cognitive Services resource and assign it the roles Storage Blob Data Contributor, Cognitive Services User, and AzureML Data Scientist, so that the service can access the customer's storage, make a batch transcription call, and call the (PromptFlow) analytics endpoint.

Detailed meeting information and documents provided can be accessed here. For more information that will help prepare you for this review, the requirements, and office hours, visit the documentation here.

azure-sdk commented 5 months ago

Meeting updated by Nate Ko

Service Name: Speech Services - Speech Analytics
Review Created By: Nate Ko
Review Date: 04/02/2024 01:00 PM PT
Release Plan: 1257
PR: https://github.com/Azure/azure-rest-api-specs/pull/28521
Hero Scenarios Link: Not Provided
Core Concepts Doc Link: here

mikekistler commented 5 months ago

Notes from API Review 4/2/24

Recommend coming to API Stewardship office hours to work out a plan for this.

mhko commented 5 months ago

We used OpenAPI spec.

Changed to v0.2-preview. This versioning scheme is consistent with the rest of Speech Services (batch), e.g., https://eastus.ingestion.speech.microsoft.com/v0.2-preview/registrations

Speech Services uses POST for create (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription-create?pivots=rest-api#create-a-transcription-job). We opted for consistency.
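For readers following along, here is a minimal Python sketch of what a create call against the endpoint above might look like. It only builds the request and does not send it. The URL shape and the v0.2-preview version come from the example in this thread; the `Ocp-Apim-Subscription-Key` header is the usual Cognitive Services key header and is assumed to apply here; the helper name and the request body layout are hypothetical, since the full schema is not shown in this issue.

```python
import json
import urllib.request

# URL shape taken from the example in this thread; region and version are parameters.
URL_TEMPLATE = "https://{region}.ingestion.speech.microsoft.com/{version}/registrations"


def build_create_registration_request(region, api_version, subscription_key, body):
    """Build (but do not send) a POST request that would create a registration.

    Hypothetical helper: the real request schema is defined in the linked PR.
    """
    url = URL_TEMPLATE.format(region=region, version=api_version)
    headers = {
        "Content-Type": "application/json",
        # Assumption: key auth uses the standard Cognitive Services header.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    data = json.dumps(body).encode("utf-8")
    # POST is used for create, consistent with Batch Transcription.
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


request = build_create_registration_request(
    region="eastus",
    api_version="v0.2-preview",
    subscription_key="<your-key>",
    # Assumption: "trigger" sits at the top level of the body.
    body={"trigger": {"kind": "Polling", "filter": None}},
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out because it needs a real key and resource.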

Updated to:

"trigger": {
  "kind": "EventGrid|Polling",
  "filter": null,
  "systemTopicResourceId": "/subscriptions/2c2e6d10-4e48-40fd-8f4d-d9fb770d0c6d/resourceGroups/speechingestiontest/providers/Microsoft.EventGrid/systemTopics/systemtopicbyos"
},
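As an illustration of the trigger shape above, the following Python sketch builds and validates a trigger object on the client side. The two allowed values for `kind` are taken from the snippet in this comment; the function name, the optional arguments, and the choice not to require `systemTopicResourceId` for any particular kind are assumptions, not part of the published spec.

```python
import json

# The two kinds shown in the snippet above.
ALLOWED_TRIGGER_KINDS = ("EventGrid", "Polling")


def make_trigger(kind, system_topic_resource_id=None, filter_value=None):
    """Return a trigger dict, rejecting any kind not listed in this thread.

    Hypothetical helper; which fields are required per kind is not stated here.
    """
    if kind not in ALLOWED_TRIGGER_KINDS:
        raise ValueError(
            f"kind must be one of {ALLOWED_TRIGGER_KINDS}, got {kind!r}"
        )
    return {
        "kind": kind,
        "filter": filter_value,  # serialized as JSON null when None
        "systemTopicResourceId": system_topic_resource_id,
    }


trigger = make_trigger(
    "EventGrid",
    # Placeholder resource ID, not a real one.
    system_topic_resource_id="/subscriptions/<subscription-id>/resourceGroups/<group>/providers/Microsoft.EventGrid/systemTopics/<topic>",
)
print(json.dumps({"trigger": trigger}, indent=2))
```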

Speech needs to get to a consistent versioning scheme and version all APIs together

We opted for a versioning scheme consistent with the Batch Transcription Service (GA).

Is there just one SDK or multiple SDKs for Speech?

AFAIK, there's a handcrafted SDK (Carbon - Rob Chambers) and a generated one (azure-rest-api-specs\specification\cognitiveservices\data-plane\Speech\BatchTextToSpeech) for Batch Transcription, but I haven't seen the latter. I started an email thread with Oliver, who owns Speech Services all up.

Are the SDKs Track 1 or Track 2?

Rob Chambers

Need to make this consistent with the rest of the GA service

Choices are made based on consistency.

Pull Request https://github.com/Azure/azure-rest-api-specs/pull/28888

mikekistler commented 4 months ago

Notes from follow-up review meeting w/ Mike Kistler & Jeff Richter:

The changes to address comments from the prior review look fine, but there needs to be uniform versioning across the service to comply with Azure's Versioning Policy. We'll let this go in this PR, but it should be fixed as soon as possible, and definitely before any new GA.

azure-sdk commented 4 months ago

Meeting updated by Nate Ko

Service Name: Speech Services - Speech Analytics
Review Created By: Nate Ko
Review Date: 04/02/2024 01:00 PM PT
Release Plan: 1257
PR: https://github.com/Azure/azure-rest-api-specs/pull/28888
Hero Scenarios Link: Not Provided
Core Concepts Doc Link: here