MicrosoftTranslator / DocumentTranslation

Command Line tool and Windows application for document translation, a local interface to the Azure Document Translation service for Windows, macOS and Linux.

Is it possible to add an .ipynb translation feature? #131

Open johnfelipe opened 2 months ago

johnfelipe commented 2 months ago

04_Energy_Meters_Recognition_CNN.zip

I uploaded this example. I think this feature would be useful for all data scientists with multilingual needs. Please tell me how I can help.

chriswendt1 commented 2 months ago

Hi @johnfelipe, this would be a "local file format". You can add a converter from .ipynb to Markdown or HTML, have the service translate the Markdown or HTML, and then convert back to .ipynb in your code. Follow the example for SRT files in the LocalFormats folder.

chriswendt1 commented 2 months ago

Hi @johnfelipe, if you could provide the logic for extracting the translatable elements of the .ipynb file format, that would help.

The currently implemented logic in local file formats is this:

  1. Determine what is translatable text inside the original file. Make sure you keep translatable sentences together.
  2. Pack the relevant non-translatable data into a structure from which we can restore the original, and encode that structure as a comment in the Markdown.
  3. Translate the Markdown.
  4. Unpack non-translatable data after translation, restoring the original format.
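
As a rough illustration of these four steps applied to a notebook, here is a minimal Python sketch. It is not the converter in the LocalFormats folder; the function names, the `ipynb-notebook` / `ipynb-cell` comment markers, and the choice of base64-encoded JSON are all assumptions made for this example. It treats only Markdown cells as translatable, and it assumes the translation step leaves HTML comments untouched.

```python
import base64
import json
import re

# Hypothetical marker format: an HTML comment carrying a base64 JSON payload.
# Base64 is used so the payload can never contain "-->" or translatable words.
TOKEN_RE = re.compile(r"<!-- ipynb-(notebook|cell):([A-Za-z0-9+/=]+) -->\n")


def _pack(obj):
    """Encode a JSON-serializable object as base64 text."""
    return base64.b64encode(json.dumps(obj).encode("utf-8")).decode("ascii")


def _unpack(payload):
    """Decode base64 text produced by _pack."""
    return json.loads(base64.b64decode(payload).decode("utf-8"))


def notebook_to_markdown(nb):
    """Steps 1 and 2: keep Markdown cell text visible, pack the rest in comments."""
    shell = {key: value for key, value in nb.items() if key != "cells"}
    chunks = [f"<!-- ipynb-notebook:{_pack(shell)} -->\n"]
    for cell in nb["cells"]:
        if cell["cell_type"] == "markdown":
            source = cell["source"]
            # The notebook format allows a string or a list of strings here;
            # this sketch normalizes to a single string.
            text = "".join(source) if isinstance(source, list) else source
            rest = {key: value for key, value in cell.items() if key != "source"}
            chunks.append(f"<!-- ipynb-cell:{_pack(rest)} -->\n{text}\n")
        else:
            # Code and raw cells are packed whole and never shown to the translator.
            chunks.append(f"<!-- ipynb-cell:{_pack(cell)} -->\n")
    return "".join(chunks)


def markdown_to_notebook(markdown):
    """Step 4: unpack the comments and rebuild the notebook dictionary."""
    parts = TOKEN_RE.split(markdown)
    # parts is [prefix, kind, payload, text, kind, payload, text, ...]
    notebook = {}
    cells = []
    for kind, payload, text in zip(parts[1::3], parts[2::3], parts[3::3]):
        data = _unpack(payload)
        if kind == "notebook":
            notebook = data
            continue
        if data.get("cell_type") == "markdown":
            # Drop the single newline that notebook_to_markdown appended.
            data["source"] = text[:-1] if text.endswith("\n") else text
        cells.append(data)
    notebook["cells"] = cells
    return notebook
```

Step 3 sits between the two calls: the string returned by `notebook_to_markdown` would be written to a `.md` file and sent for translation, and the translated file would be passed to `markdown_to_notebook`.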

If the .ipynb format contains more markup than translatable text, I would probably use a different logic:

  1. Extract the translatable text, replacing it with an identifier in the original file, and save that modified file locally.
  2. Save the translatable text as Markdown, with the identifiers as comments.
  3. Translate the Markdown.
  4. In the file saved in step 1, replace the identifiers with the translated text.
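
A minimal Python sketch of this identifier-based variant might look as follows. The function names and the `ipynb-text-N` identifier scheme are hypothetical, chosen only for illustration; the sketch again assumes that only Markdown cells are translatable and that the translation step preserves HTML comments.

```python
import copy
import re

# Hypothetical identifier comment written in front of each translatable chunk.
ID_RE = re.compile(r"<!-- id:(ipynb-text-\d+) -->\n")


def extract_translatable(nb):
    """Steps 1 and 2: return (skeleton with identifiers, Markdown to translate)."""
    skeleton = copy.deepcopy(nb)  # leave the caller's notebook untouched
    chunks = []
    for index, cell in enumerate(skeleton["cells"]):
        if cell["cell_type"] != "markdown":
            continue
        source = cell["source"]
        # "source" may be a string or a list of strings; normalize to a string.
        text = "".join(source) if isinstance(source, list) else source
        identifier = f"ipynb-text-{index}"
        cell["source"] = identifier  # placeholder kept in the skeleton
        chunks.append(f"<!-- id:{identifier} -->\n{text}\n")
    return skeleton, "".join(chunks)


def merge_translations(skeleton, translated_markdown):
    """Step 4: replace each identifier in the skeleton with its translated text."""
    parts = ID_RE.split(translated_markdown)
    # parts is [prefix, identifier, text, identifier, text, ...]
    translations = {
        identifier: text[:-1] if text.endswith("\n") else text
        for identifier, text in zip(parts[1::2], parts[2::2])
    }
    result = copy.deepcopy(skeleton)
    for cell in result["cells"]:
        source = cell["source"]
        if (cell["cell_type"] == "markdown"
                and isinstance(source, str)
                and source in translations):
            cell["source"] = translations[source]
    return result
```

Compared with packing everything into comments, the Markdown sent for translation stays small here, because code cells, outputs, and metadata never leave the locally saved skeleton.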

johnfelipe commented 2 months ago

It would be great if we had an option for notebooks here:

[image: SNAG-0140.png]

Markdown is working well.

On Mon, Sep 2, 2024 at 11:56, Chris Wendt wrote:

Hi @johnfelipe, if you could provide the logic for extracting the translatable elements of the .ipynb file format, that would help.

The currently implemented logic in local file formats is this:

  1. Determine what is translatable text inside the original file. Make sure you keep translatable sentences together.
  2. Pack the relevant non-translatable data into a structure from which we can restore the original, and encode that structure as a comment in the Markdown.
  3. Translate the Markdown.
  4. Unpack non-translatable data after translation, restoring the original format.

If the .ipynb format contains more markup than translatable text, I would probably use a different logic:

  1. Extract the translatable text, replacing it with an identifier in the original file, and save that modified file locally.
  2. Save the translatable text as Markdown, with the identifiers as comments.
  3. Translate the Markdown.
  4. In the file saved in step 1, replace the identifiers with the translated text.


chriswendt1 commented 2 months ago

Hi @johnfelipe, if you could provide the logic for extracting the translatable elements of the .ipynb file format, that would help.