Imperial-EE-Microsoft / co_op_translator

Easily automate multilingual translations for your projects with co_op_translator, powered by advanced LLM technology.
https://techcommunity.microsoft.com/t5/educator-developer-blog/localizing-github-repositories-with-llms/ba-p/4216434
MIT License

Azure OpenAI misinterprets markdown files with many links as XML #28

Status: Closed (skytin1004 closed this issue 1 month ago)

skytin1004 commented 2 months ago

Issue: Azure OpenAI misinterprets markdown files with many links as XML

When translating markdown documents, especially those dense with markdown syntax such as links (e.g., README files), Azure OpenAI sometimes interprets the file content as XML rather than markdown. This leads to translation errors and incomplete output.

Example:

In the Phi-3CookBook README file, which contains numerous markdown links, the Azure OpenAI model mistakenly processes the file as XML, resulting in translation issues.

Proposed Solution:

  1. Add a prompt: Introduce a more specific prompt for Azure OpenAI to better handle markdown syntax, ensuring that the model correctly interprets the content as markdown, not XML.

  2. Improve logic: Add logic to detect markdown files with heavy syntax usage (e.g., links, images) and adjust the processing method to prevent such misinterpretation.

By implementing these adjustments, we can prevent misinterpretation and ensure smoother translations of markdown documents with complex syntax.
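
As a rough illustration of the two proposals, here is a minimal sketch in Python. Everything in it is hypothetical and not taken from the co_op_translator code base: the names `link_density`, `is_link_heavy`, and `build_prompt`, the 0.5 links-per-line threshold, and the prompt wording are all assumptions. It only counts inline links and images and builds a prompt string; it does not call Azure OpenAI.

```python
import re

# Matches inline markdown links [text](url) and images ![alt](url "title").
# Reference-style links and autolinks are deliberately ignored in this sketch.
LINK_OR_IMAGE = re.compile(r'!?\[[^\]]*\]\([^)\s]+(?:\s+"[^"]*")?\)')


def link_density(markdown: str) -> float:
    """Return the number of inline links/images per non-empty line."""
    lines = [line for line in markdown.splitlines() if line.strip()]
    if not lines:
        return 0.0
    return len(LINK_OR_IMAGE.findall(markdown)) / len(lines)


def is_link_heavy(markdown: str, threshold: float = 0.5) -> bool:
    """Heuristic: treat a document as link-heavy above an assumed threshold."""
    return link_density(markdown) >= threshold


def build_prompt(markdown: str, target_language: str) -> str:
    """Build a translation prompt that states the input format explicitly."""
    rules = [
        f"Translate the following Markdown document into {target_language}.",
        "The input is Markdown, not XML or HTML. Do not parse it as XML.",
        "Return Markdown only and keep the original structure.",
    ]
    if is_link_heavy(markdown):
        # Extra guidance only for documents dominated by links and images.
        rules.append(
            "Keep every [text](url) and ![alt](url) construct intact: "
            "translate the text or alt part, leave the url unchanged."
        )
    return "\n".join(rules) + "\n\n" + markdown
```

The threshold would need tuning against real files such as the Phi-3CookBook README; a document that falls below it still gets the explicit "Markdown, not XML" instruction, so the detection step only adds the link-specific rule.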

This issue is related to #27.