CanCLID / inject-jyutping

A browser extension that adds Cantonese pronunciation (Jyutping) on Chinese characters
https://chromewebstore.google.com/detail/inject-jyutping/lfgpgjkjglogbndlkikjgbbfoiofbdjp
BSD 2-Clause "Simplified" License
69 stars, 9 forks

feat: listen content changes of nodes #1

Closed ambar closed 2 years ago

ambar commented 2 years ago

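Listens for newly added nodes and for content changes (via MutationObserver). Removes the check on a node's lang attribute: I suspect it is not useful, because even when a page's primary language is Japanese, it can still embed Chinese content (for study purposes, for example). This is better discussed in a new issue.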
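A minimal sketch of what listening for added nodes and content changes through MutationObserver might look like. The helper name `collectTextTargets` and the logging callback are hypothetical and are not taken from this PR; only `MutationObserver`, its `childList`, `characterData`, and `subtree` options, and the record fields `type`, `addedNodes`, and `target` are standard DOM API.

```javascript
// Hypothetical helper: gather the nodes that a batch of mutation records
// says need to be (re)processed.
function collectTextTargets(records) {
    const targets = [];
    for (const record of records) {
        if (record.type === "childList") {
            // Newly inserted nodes (elements or text nodes).
            for (const node of record.addedNodes) targets.push(node);
        } else if (record.type === "characterData") {
            // The text of an existing text node changed.
            targets.push(record.target);
        }
    }
    return targets;
}

// Only register the observer where a DOM exists (for example, in a content script).
if (typeof MutationObserver !== "undefined" && typeof document !== "undefined") {
    const observer = new MutationObserver((records) => {
        const targets = collectTextTargets(records);
        // Placeholder: the extension's real conversion step would go here.
        console.log(`${targets.length} node(s) to process`);
    });
    observer.observe(document.body, { childList: true, characterData: true, subtree: true });
}
```

An annotator that inserts elements will itself produce `childList` records, so a real implementation would need some guard against reprocessing its own output; that is outside the scope of this sketch.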

laubonghaudoi commented 2 years ago

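@ambar I don't quite understand why the lang check was removed. My understanding is that on a Japanese page there is no need to add Jyutping, because nobody reads Japanese kanji with Cantonese pronunciation. @ayaka14732 what do you think?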

ayaka14732 commented 2 years ago

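> Thanks for the contribution, but could you also bump the version number? If this gets merged, it would be version 0.2.6.

The version number shouldn't be updated inside a PR. I'll update it when I publish a release.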

graphemecluster commented 2 years ago

However, @ambar's concern seems to be that many websites mislabel or omit the lang attribute.

laubonghaudoi commented 2 years ago

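> > Thanks for the contribution, but could you also bump the version number? If this gets merged, it would be version 0.2.6.
>
> The version number shouldn't be updated inside a PR. I'll update it when I publish a release.

Got it, so now the changes need to be split into several PRs.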

graphemecluster commented 2 years ago

We should run a benchmark to see which approach is faster.

graphemecluster commented 2 years ago

Benchmark:

// Recursively collect text nodes under `current`, skipping the subtrees of excluded elements.
function traverse(array, current) {
    if (["RUBY", "OPTION", "TEXTAREA", "SCRIPT", "STYLE"].includes(current.tagName)) {
        return;
    }
    for (const node of current.childNodes) {
        if (node.nodeType === Node.TEXT_NODE) {
            array.push(node);
        } else {
            traverse(array, node);
        }
    }
}

// Assumes Benchmark.js is already loaded and exposes the global `Benchmark`.
new Benchmark.Suite()
    .add("Custom traverse function", function () {
        const array = [];
        traverse(array, document.body);
    })
    .add("NodeIterator with filter function", function () {
        const nodeIterator = document.createNodeIterator(document.body, NodeFilter.SHOW_TEXT, {
            acceptNode(node) {
                if (["RUBY", "OPTION", "TEXTAREA", "SCRIPT", "STYLE"].includes(node.tagName)) {
                    return NodeFilter.FILTER_REJECT;
                }
                return NodeFilter.FILTER_ACCEPT;
            },
        });
        const array = [];
        let current;
        while ((current = nodeIterator.nextNode())) array.push(current);
    })
    .add("TreeWalker with filter function", function () {
        const treeWalker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT, {
            acceptNode(node) {
                if (["RUBY", "OPTION", "TEXTAREA", "SCRIPT", "STYLE"].includes(node.tagName)) {
                    return NodeFilter.FILTER_REJECT;
                }
                return NodeFilter.FILTER_ACCEPT;
            },
        });
        const array = [];
        let current;
        while ((current = treeWalker.nextNode())) array.push(current);
    })
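    .add("TreeWalker traverse function", function () {
        // No whatToShow or filter: every node is visited and excluded subtrees are skipped by hand.
        const treeWalker = document.createTreeWalker(document.body);
        const array = [];
        let current = treeWalker.nextNode();
        // A plain `while` avoids dereferencing null when document.body has no children.
        while (current) {
            if (["RUBY", "OPTION", "TEXTAREA", "SCRIPT", "STYLE"].includes(current.tagName)) {
                // Skip the whole subtree: go to the next sibling, climbing to ancestors as needed.
                do {
                    current = treeWalker.nextSibling();
                    if (current) break;
                    current = treeWalker.parentNode();
                } while (current);
            } else {
                if (current.nodeType === Node.TEXT_NODE) {
                    array.push(current);
                }
                current = treeWalker.nextNode();
            }
        }
    })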
    .on("cycle", function (event) {
        console.log(event.target + "");
    })
    .on("complete", function () {
        console.log("The fastest is " + this.filter("fastest").map("name"));
    })
    .run({ async: true });

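The first three run at almost the same speed; the last one is roughly 3.5 times as fast as the others. However, the results of the two traverse functions seem to differ somewhat from those of the two filter functions...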
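One possible contributor to the difference (an assumption, not something confirmed in this thread): when `whatToShow` is `NodeFilter.SHOW_TEXT`, the filter callback is only invoked for text nodes, and text nodes have no `tagName`, so the `includes(node.tagName)` check never matches and nothing is excluded. A sketch of a filter that inspects the text node's ancestors instead is shown below; `EXCLUDED_TAGS`, `hasExcludedAncestor`, and `collectTextNodes` are hypothetical names.

```javascript
const EXCLUDED_TAGS = new Set(["RUBY", "OPTION", "TEXTAREA", "SCRIPT", "STYLE"]);

// Hypothetical helper: true if `node` sits inside an excluded element,
// looking no higher than `root` (root itself is checked too).
function hasExcludedAncestor(node, root) {
    for (let p = node.parentNode; p; p = p.parentNode) {
        if (EXCLUDED_TAGS.has(p.tagName)) return true;
        if (p === root) break;
    }
    return false;
}

// Collect text nodes with a TreeWalker whose filter inspects the ancestors
// rather than the text node itself. Requires a DOM when called.
function collectTextNodes(root) {
    const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT, {
        acceptNode(node) {
            return hasExcludedAncestor(node, root)
                ? NodeFilter.FILTER_REJECT
                : NodeFilter.FILTER_ACCEPT;
        },
    });
    const result = [];
    let current;
    while ((current = walker.nextNode())) result.push(current);
    return result;
}
```

This adds an ancestor walk for every text node, so it may well be slower than the hand-written traversal; it has not been measured here.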

laubonghaudoi commented 2 years ago

@graphemecluster I couldn't follow the benchmark. What does it show?

graphemecluster commented 2 years ago

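@laubonghaudoi It shows that using TreeWalker is indeed faster than the original code, but it makes something simple more complicated. Using a filter function does simplify the code, but it brings no performance advantage and is even slower (and, for reasons I don't understand, the results differ). I really don't know how to weigh the trade-off.

The original poster's code does use TreeWalker, but the way it is written means that even when an element is filtered out, its children are not.

We also haven't considered the lang attribute yet. Whether it would ultimately be best to go back to the original approach and leave the code untouched is something we should ask @ayaka14732 about.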

ambar commented 2 years ago

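Since TreeWalker ultimately showed no clear advantage, I have reverted that change. The lang handling is also retained, but it comes with a drawback: inside the MutationObserver callback, we need to search upward through the ancestors.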
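The upward search mentioned here could look roughly like the sketch below. A text node reported by MutationObserver carries no `lang` of its own, so the nearest ancestor with a `lang` attribute has to be found. `nearestLang` and `shouldAnnotate` are hypothetical names, and the rule "skip when the nearest lang is Japanese" is only an assumption based on the discussion above, not the extension's actual logic.

```javascript
// Hypothetical helper: return the closest `lang` value found on the node's
// ancestors, or "" if none of them sets one.
function nearestLang(node) {
    for (let el = node.parentNode; el; el = el.parentNode) {
        // Document nodes have no getAttribute, hence the typeof guard.
        const lang = typeof el.getAttribute === "function" ? el.getAttribute("lang") : null;
        if (lang) return lang;
    }
    return "";
}

// Assumed rule, for illustration only: skip content whose nearest lang is Japanese.
function shouldAnnotate(node) {
    const lang = nearestLang(node).toLowerCase();
    return !(lang === "ja" || lang.startsWith("ja-"));
}
```

In a real DOM, `node.parentElement?.closest("[lang]")` performs the same lookup; the explicit loop is used here so the sketch also runs on plain objects.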

ayaka14732 commented 2 years ago

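> We also haven't considered the lang attribute yet. Whether it would ultimately be best to go back to the original approach and leave the code untouched is something we should ask @ayaka14732 about.

I have no opinion. How about I leave the decision to you?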