browsermt / bergamot-translator

Cross-platform C++ library focusing on optimized machine translation on consumer-grade devices.
http://browser.mt
Mozilla Public License 2.0

Ignore elements with `translate=no` attribute #401

Open jelmervdl opened 2 years ago

jelmervdl commented 2 years ago

See https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/translate

I can route elements with this attribute through the skip-element function, which is already used for ignoring `<code>` etc. But that might mess up sentences, since this attribute can also be applied to named entities in the middle of a sentence. With the skip-element trick, those would disappear completely from the sentence that is submitted to the translator, and the translator might not cope well with missing words (or names).
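To make the concern concrete, here is a minimal sketch of what dropping such elements does to a sentence. `dropNoTranslateSpans` is a hypothetical helper written for illustration, not the skip-element function in bergamot-translator, and it assumes non-nested `<span translate="no">` elements with exactly that attribute spelling.

```cpp
#include <string>

// Hypothetical helper (not bergamot-translator code): removes every
// <span translate="no">...</span> region, content included, the way a
// skip-element rule would. Assumes non-nested spans and this exact spelling.
std::string dropNoTranslateSpans(const std::string &html) {
  const std::string open = "<span translate=\"no\">";
  const std::string close = "</span>";
  std::string out;
  std::size_t pos = 0;
  while (true) {
    std::size_t start = html.find(open, pos);
    if (start == std::string::npos) break;
    std::size_t end = html.find(close, start + open.size());
    if (end == std::string::npos) break;  // unterminated span: keep the rest as-is
    out.append(html, pos, start - pos);   // text before the span
    pos = end + close.size();             // jump past the span and its content
  }
  out.append(html, pos, std::string::npos);  // remaining tail
  return out;
}
```

For the input `Ask <span translate="no">Ada</span> for help.` this yields `Ask  for help.`, so the translator sees a sentence with the name missing.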

kpu commented 2 years ago

Yeah, in general the MT system would need to be trained with a placeholder to do this correctly.
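A rough sketch of what the placeholder idea could look like on the pre- and post-processing side. Everything here is an assumption for illustration: the `__NT0__` token format, the `Masked` struct, and the helper names are hypothetical, none of it is bergamot-translator API, and it only works if the model has been trained to copy such tokens through unchanged.

```cpp
#include <string>
#include <vector>

// Hypothetical result type: masked sentence plus the protected contents.
struct Masked {
  std::string text;                    // sentence with placeholder tokens
  std::vector<std::string> originals;  // protected contents, indexed by token number
};

// Hypothetical token format; a real system would use whatever token the
// model was trained to pass through untouched.
std::string placeholderFor(std::size_t index) {
  return "__NT" + std::to_string(index) + "__";
}

// Replaces each <span translate="no">...</span> with a placeholder token.
// Assumes non-nested spans and this exact attribute spelling.
Masked maskNoTranslateSpans(const std::string &html) {
  const std::string open = "<span translate=\"no\">";
  const std::string close = "</span>";
  Masked result;
  std::size_t pos = 0;
  while (true) {
    std::size_t start = html.find(open, pos);
    if (start == std::string::npos) break;
    std::size_t contentStart = start + open.size();
    std::size_t end = html.find(close, contentStart);
    if (end == std::string::npos) break;  // unterminated span: leave untouched
    result.text.append(html, pos, start - pos);
    result.text += placeholderFor(result.originals.size());
    result.originals.push_back(html.substr(contentStart, end - contentStart));
    pos = end + close.size();
  }
  result.text.append(html, pos, std::string::npos);
  return result;
}

// Puts the protected contents back into the translated sentence.
// Tokens the translator dropped or altered are silently left unrestored.
std::string restorePlaceholders(std::string translated,
                                const std::vector<std::string> &originals) {
  for (std::size_t i = 0; i < originals.size(); ++i) {
    const std::string token = placeholderFor(i);
    std::size_t at = translated.find(token);
    if (at != std::string::npos) translated.replace(at, token.size(), originals[i]);
  }
  return translated;
}
```

With this, `Ask <span translate="no">Ada</span> for help.` is submitted as `Ask __NT0__ for help.`, so the sentence keeps a slot where the name was, and the name is restored after translation wherever the token ends up.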