jelmervdl / translatelocally-web-ext

TranslateLocally for the Browser is a web-extension that enables client side in-page translations for web browsers.
https://addons.mozilla.org/en-GB/firefox/addon/translatelocally-for-firefox/
Mozilla Public License 2.0

Custom elements with shadow roots #38

Open jelmervdl opened 2 years ago

jelmervdl commented 2 years ago

The page https://www.suewag.de/privatkunden uses custom elements to an extreme. Almost no text is visible from the DOM root; it is all hidden inside the shadow roots of the custom elements.

Example script to actually get all the text on the page (and a bit more, because it doesn't differentiate between things like <div>, <custom-element> and <script>):

// Yield the data of every text node under `node`, descending into open shadow roots.
function* extractText(node) {
    const walker = document.createTreeWalker(node, NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT);

    while (walker.nextNode()) {
        const current = walker.currentNode;
        if (current.nodeType === Node.ELEMENT_NODE) {
            // A TreeWalker does not enter shadow trees on its own, so recurse into them.
            if (current.shadowRoot)
                yield* extractText(current.shadowRoot);
        } else if (current.nodeType === Node.TEXT_NODE) {
            yield current.data;
        } else {
            // Unreachable: whatToShow only admits elements and text nodes.
            throw new TypeError(`Unexpected node type ${current.nodeType}`);
        }
    }
}

const texts = Array.from(extractText(document.body))
    .map(text => text.trim())
    .filter(text => text.length > 0);

Similarly, the InPageTranslation class would need to be made aware of shadowRoot. I'm not sure whether DOM mutations propagate from a shadowRoot to the parent fragment, or eventually to the document.body tree.
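
On the mutation question: as far as I know, a MutationObserver registered on document.body with subtree: true is not notified of changes inside shadow trees, so each shadow root would need its own observe() call. A minimal sketch of that idea follows. observeDeep is a hypothetical helper name, not something taken from InPageTranslation, and the default options are an assumption.

```javascript
// Hypothetical helper: register `observer` on `root` and on every open
// shadow root reachable from it. `observer` only needs an observe() method,
// so a real MutationObserver fits.
function observeDeep(root, observer, options = {childList: true, characterData: true, subtree: true}) {
    observer.observe(root, options);

    const visit = (node) => {
        for (const child of node.childNodes) {
            if (child.nodeType !== 1) // 1 === Node.ELEMENT_NODE
                continue;
            // Shadow roots are not part of childNodes, so handle them explicitly.
            if (child.shadowRoot)
                observeDeep(child.shadowRoot, observer, options);
            visit(child);
        }
    };
    visit(root);
}
```

In a page this would be called as observeDeep(document.body, new MutationObserver(callback)). Shadow roots attached after the call are not covered by this sketch.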

Ideally, we could use the accessibility tree directly to figure out text and translate that. That should contain all text that is relevant to translate, if pages are properly accessible. It is also something creators of custom elements can easily design for. But unfortunately nothing like that is being shipped yet.

There is the Accessibility Object Model, Phase 4 Draft, which could be promising some day.

jelmervdl commented 2 years ago

Oh oh we might even be able to support closed shadow roots! https://developer.mozilla.org/en-US/docs/Web/API/Element/openOrClosedShadowRoot
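
A sketch of how that lookup could work. Element.openOrClosedShadowRoot is only exposed to WebExtension code in Firefox; Chrome offers chrome.dom.openOrClosedShadowRoot(element) to extensions instead. getShadowRoot is a hypothetical helper name, and the fallback order is an assumption.

```javascript
// Hypothetical helper: return the shadow root of `element`, closed or open,
// using whichever extension API is available, or null if there is none.
function getShadowRoot(element) {
    // Firefox, extension contexts only: sees open and closed roots.
    if ('openOrClosedShadowRoot' in element)
        return element.openOrClosedShadowRoot;

    // Chrome, extension contexts only.
    if (typeof chrome !== 'undefined' && chrome.dom
        && typeof chrome.dom.openOrClosedShadowRoot === 'function')
        return chrome.dom.openOrClosedShadowRoot(element);

    // Plain page context: only open roots are reachable.
    return element.shadowRoot ?? null;
}
```

Any place that currently reads element.shadowRoot could call getShadowRoot(element) instead.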

jelmervdl commented 1 year ago

With #64 added, we already have our own tree walker. Adding shadow DOM support to that is now a lot easier 🎉
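
To illustrate why a hand-written walker makes this easier: the recursion can treat a shadow root as just another set of children. The sketch below is not the walker from #64. walkText and SKIPPED_TAGS are hypothetical names, and the list of skipped elements is an assumption about what should be left untranslated.

```javascript
// Elements whose text is not meant for readers (assumed list).
const SKIPPED_TAGS = new Set(['script', 'style', 'noscript', 'template']);

// Yield every text node below `node`, descending into open shadow roots.
function* walkText(node) {
    for (const child of node.childNodes) {
        if (child.nodeType === 3) { // Node.TEXT_NODE
            yield child;
        } else if (child.nodeType === 1) { // Node.ELEMENT_NODE
            if (SKIPPED_TAGS.has(child.localName))
                continue;
            if (child.shadowRoot)
                yield* walkText(child.shadowRoot);
            // Light DOM children are still visited, since they may be slotted.
            yield* walkText(child);
        }
    }
}
```

Array.from(walkText(document.body), node => node.data) would then give the same kind of list as the earlier script, minus script and style contents.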