MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

PDF support in Azure Translator synchronous document translation API #122088

Closed micander closed 3 months ago

micander commented 4 months ago

Hello, I am trying to use the synchronous document translation API to translate PDF files. My requests never complete because the API always responds with {"code":"InvalidFormat","message":"The format parameter is not valid."}. I have tried various source and target languages and various PDF files, with no success. I can translate other formats, such as DOCX, successfully.

The API documentation doesn't explicitly confirm that the synchronous API supports PDF translation. Is the feature unimplemented, or just currently broken? I saw another person ask about the same issue on the community support forum a month ago. Thank you.

Example API call: curl -i -X POST "https://eelec-translation-uswest.cognitiveservices.azure.com/translator/document:translate?sourceLanguage=en&targetLanguage=hi&api-version=2023-11-01-preview" -H "Ocp-Apim-Subscription-Key:<omitted>" -H "X-ClientTraceId:pdf-translation-issue-b4e3ae21" -F "document=@blank.pdf;type=application/pdf" -o "out.pdf"
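
For anyone scripting this call rather than using curl, the request URL can be assembled as in the minimal sketch below. The helper name `build_translate_url` is hypothetical; the endpoint, path, and query parameter names are copied from the curl command above, not taken from the API reference, so treat them as assumptions about this one preview version.

```python
from urllib.parse import urlencode


def build_translate_url(endpoint, source_language, target_language,
                        api_version="2023-11-01-preview"):
    """Build the synchronous document translation URL (hypothetical helper).

    The path and parameter names mirror the curl example in this issue.
    """
    query = urlencode({
        "sourceLanguage": source_language,
        "targetLanguage": target_language,
        "api-version": api_version,
    })
    # Strip a trailing slash so the path is not doubled up.
    return f"{endpoint.rstrip('/')}/translator/document:translate?{query}"


url = build_translate_url(
    "https://eelec-translation-uswest.cognitiveservices.azure.com",
    source_language="en",
    target_language="hi",
)
print(url)
```

The document itself would still be sent as a multipart form field named `document`, as in the curl command.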

Response:

```
content-type: application/json; charset=utf-8
access-control-expose-headers: X-RequestId,x-ms-request-id,X-RequestId-Forwarded,X-Metered-Usage
x-requestid: 944d44da-dfca-4ed3-9a93-11a68fd787f5
x-ms-request-id: 944d44da-dfca-4ed3-9a93-11a68fd787f5
x-ms-error-code: InvalidRequest
strict-transport-security: max-age=31536000; includeSubDomains; preload
apim-request-id: 944d44da-dfca-4ed3-9a93-11a68fd787f5
x-content-type-options: nosniff
x-ms-region: West US
date: Mon, 29 Apr 2024 04:38:09 GMT
{"error":{"code":"InvalidRequest","message":"The format parameter is not valid.","target":"ContentType","innerError":{"code":"InvalidFormat","message":"The format parameter is not valid."}}}
```

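The error body above nests the more specific code under `innerError`, and its `target` field points at the content type. A minimal sketch for pulling those fields out of such a response follows. The function name `summarize_error` is hypothetical, and the field layout is assumed only from the single response shown in this issue.

```python
import json


def summarize_error(body):
    """Extract outer code, target, and inner code from an error body.

    Assumes the JSON shape seen in this issue; missing fields become None.
    """
    error = json.loads(body).get("error", {})
    inner = error.get("innerError") or {}
    return {
        "code": error.get("code"),
        "target": error.get("target"),
        "inner_code": inner.get("code"),
        "message": inner.get("message") or error.get("message"),
    }


# Response body copied from this issue.
sample = (
    '{"error":{"code":"InvalidRequest",'
    '"message":"The format parameter is not valid.",'
    '"target":"ContentType",'
    '"innerError":{"code":"InvalidFormat",'
    '"message":"The format parameter is not valid."}}}'
)
summary = summarize_error(sample)
print(summary)
```

Here the `target` of `ContentType` together with the inner `InvalidFormat` code suggests the service rejected the uploaded file's format rather than the language parameters.
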
---
#### Document Details

⚠ *Do not edit this section. It is required for learn.microsoft.com ➟ GitHub issue linking.*

* ID: d3054f8a-0f23-5a0d-0bff-e89364cb71f1
* Version Independent ID: 0566409f-75a3-6a4c-5076-2cd971fde994
* Content: [Synchronous translation REST API guide - Azure AI services](https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/reference/synchronous-rest-api-guide)
* Content Source: [articles/ai-services/translator/document-translation/reference/synchronous-rest-api-guide.md](https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/ai-services/translator/document-translation/reference/synchronous-rest-api-guide.md)
* Service: **azure-ai-translator**
* GitHub Login: @laujan
* Microsoft Alias: **lajanuar**

micander commented 4 months ago

It appears that PDF is not a supported format in the synchronous API after all: https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/overview#synchronous-supported-document-formats

So this ticket becomes a report that the link to the list of supported formats does not go to the correct page. See here: https://learn.microsoft.com/en-us/azure/ai-services/translator/document-translation/reference/synchronous-rest-api-guide#request-body

The "supported document formats" link there points to https://learn.microsoft.com/en-us/azure/ai-services/translator/language-support, but it should point to the supported-formats page linked above. I was not able to find that page on my own despite searching. This incorrect link should be fixed.

Also, is it possible to know if PDF support is a planned feature for the synchronous API?

Thanks

PesalaPavan commented 4 months ago

@micander Thanks for your feedback! We will investigate and update as appropriate.

Naveenommi-MSFT commented 4 months ago

@micander Thank you for bringing this to our attention. I've delegated this to content author @laujan, who will review it and offer their insightful opinions.