SnowySailor commented 9 months ago

Problem

If a website wants to provide a specific piece of context data to Yomitan's template renderer, it can only use the url or title which is not ideal for all types of data.

As an example, for this book reading website I am writing with a few other people we have specific data that we would like to make accessible to Yomitan templates (information about the source of the target word, the volume, etc.)

Solution

Allow a website to define an object in global scope (such as YomitanCustomContext) that Yomitan will look for and inject into the context variable that gets fed into the template renderer. This will allow a website to provide arbitrary data to Yomitan that can be made available in the template under the context variable. As an example, a website could define:

const YomitanCustomContext = {
    author: "Bob",
    editor: "Jane",
    article: {
        category: "grammar",
        seeAlso: ["https://website.com/grammar/1", "https://website.com/grammar/2"]
    }
}

And Yomitan would check to see if that variable is defined, then add it as context.custom before rendering the requested template.

I am willing to work on this if it's approved and if someone could give me a few files to start looking at just so I don't have to dig too deep into things since this seems like a relatively simple addition.

djahandarie commented 9 months ago

Hmm, I worry a bit about the security perspective. From two angles: 1. how do we protect against malicious input, 2. how do we avoid surprising the user that suddenly the extension is behaving differently on some website

Do you have thoughts regarding either of these?

SnowySailor commented 9 months ago

Since the data would only be available through the new context.custom variable, existing template functionality should not change. Someone would have to specifically write a template to make use of the new data, so they would be expecting for there to be different behavior on some sites.

From a security standpoint, I don't think there's that much risk since handlebars.js can't actually execute any code and including a reference to a variable shouldn't allow anything to execute either. Do you have a specific example of some input that could target a user and cause problems?

djahandarie commented 9 months ago

Got it, that makes sense overall, thanks. We probably need to do some very basic input sanitization (like have a max length etc) but probably not much.

The "problems" I'm imagining would be more along the lines of someone just trolling and injecting some really long, explicit, or layout-breaking text. Not sure what Anki would render either, would someone be able to inject some HTML, which could then do something like reference an external image, which would then leak the users IP to the attacker?

SnowySailor commented 9 months ago

Yeah I did consider the trolling thing where someone could just put like 50MB of text into it, but like you said a max length check could mitigate that. As for the HTML injection, we could run some sort of html stripper on all the string values in the object so that only raw strings can be allowed through. Maybe something like this?

djahandarie commented 9 months ago

Seems like a good start! But I don't really know how Anki works in detail in order to be sure that is sufficient, are there other special types of syntax or something that might be exploitable by an attacker?

Casheeew commented 9 months ago

Does any existing extensions allow websites to inject their own context into them?

SnowySailor commented 9 months ago

As far as I know Anki only parses html in its fields and can't execute Javascript. The only potential issue I could think of is with [sound: ] references. Usually these reference files in your media directory, but it's possible that they could be formatted like [sound:https://badwebsite.com/] and it might make a request to that website. However that same thing could be done with context.document.title nowadays, so it's not opening up a new attack vector that doesn't already exist. And you could probably just add a value.replace('[sound:', '') or something in there to break it.

djahandarie commented 9 months ago

I wonder if we should load the custom content into a value keyed by the domain name which is presenting them. Like context.custom['some.domain']. So then at least one malicious websites can't start injecting stuff in the custom fields that another website uses and a user has in their template. That way, when the user is modifying their templates, they are making the explicit decision to 'trust' a website, without needing to trust all websites.

SnowySailor commented 9 months ago

I actually had the same thought. My only hesitation was maybe it wouldn't be great from a UX perspective, but it would definitely help since you'd basically be required to enumerate the sites that you trust as a sort of whitelist. That would kinda be expected anyway since each site will provide a (probably) unique custom context, so the user will have to write unique template code for each specific website regardless.

djahandarie commented 9 months ago

Okay, in that case I think I'm fine with this idea, as it does indeed seem useful for websites that want to build custom "integrations" with yomitan, as long as we do the santization/scoping we discussed! :+1:

SnowySailor commented 9 months ago

Great, thanks! I'll look at getting started on that sometime this weekend.

adamjsturge commented 9 months ago

I'm not exactly sure how JavaScript would handle this because in the example I provide. It outputs what looks like a script tag to me. It might be sanitized but it will be worth testing with this payload in mind

function strip(html){
   let doc = new DOMParser().parseFromString(html, 'text/html');
   return doc.body.textContent || "";
}
var thing = strip("Just text <<a>script>alert('ciao');<</a>/script>");
console.log(thing)

SnowySailor commented 9 months ago

It's worth noting that this html injection is currently an issue with the context.document.title as well, so coming up with a solution is probably worth doing. I tested it by setting the page title to <script>alert(1)</script> and then adding a card to anki with the {document-title} field and I got a popup with 1 in Anki when I opened the card.

As a solution to the above, we could just loop until there's no more html to strip. It's potentially O(n) but if we have a limit on the total context length then it's not that bad.

function strip(html){
   let doc = new DOMParser().parseFromString(html, 'text/html');
   return doc.body.textContent || "";
}
var thing = "Just text <<a>script>alert('ciao');<</a>/script>"
while (strip(thing) !== thing) {
    thing = strip(thing);
}
console.log(thing);

This will just iterate until there's no more html to remove.

toasted-nutbread commented 9 months ago

Something that is potentially easier and less susceptible to bad actors is to read an attribute from an element. For example, document.documentElement.dataset.yomitanCustomData, and treat it as JSON that has to be decoded. This removes the need for cross-JS-execution-world detection, which must be handled differently on different browsers, and guarantees serializability. It can also be made synchronous.

yomitanCustomData is also just an example, but I wonder if given the nature of how people have all sorts of forks with differing names, if it should be more generalized like ankiExportCustomData. The structure of the object should also be something that has a JSON schema that we may be able to update/extend in the future. (There is a similar idea with the custom audio sources, where it can return a JSON object which must match a schema.)

yomidevs / yomitan

Allow webpages to provide custom context data in addition to "title" #519

Problem

Solution