alcinnz / Odysseus

Bridging the elementary OS and Web user experiences
https://odysseus.adrian.geek.nz
GNU General Public License v3.0

Add personalized recommendations #116

Closed alcinnz closed 5 years ago

alcinnz commented 6 years ago

One thing that frequently leads people to central silos is the wish to discover interesting or entertaining videos and the like to idle away their time with. However, those silos do not have people's best interests at heart, because they're advertising companies. As such they inherently favour certain brands over others, and they want to manipulate people.

It would be better for this sort of discovery to be implemented in a web browser (whose job it is to help you find, read, and control webpages), in part because a browser doesn't need to invade your privacy to do so. But it does take some cleverness to pull off.

A good blogpost on this topic is https://jamey.thesharps.us/2018/07/10/private-secure-multiparty-histograms/.

The recommendations offered by Odysseus should not already be in your browser history.


UI-wise I'd add a single link to the newtab page with a red shadow. And maybe offer a webpage of further recommendations.

alcinnz commented 6 years ago

While there are UI aspects specific to Odysseus, it'd be best for the logic to be its own project. And ideally one that already exists, as there's a lot of self-contained complexity around recommendations even when we're not trying to preserve privacy.

But if someone needs a good name to help them start a project, I like Siren: the creatures who lure sailors to their deaths through their beautiful songs of knowledge.

alcinnz commented 6 years ago

Thinking more about it, the histogram algorithm @jameysharp described isn't enough. But it does address weaknesses in the algorithm I've previously sketched out for this based on MinHash. That is:

  1. Each participant selects the minimum hash from their bookmarked URLs.
  2. (NEW) The participants agree on a length for their hashes and construct a privacy-respecting histogram as described in the link above. This addresses weak points in the rest of this algorithm where people are alone in their cliques.
  3. (NEW) Each participant selects the smallest hash that's both in the histogram and their bookmarks.
  4. Each participant sends a small sampling of their visited pages to others in their clique (indicated by their minhash).
  5. These links sent by others that aren't already in their browser history are presented as personalized recommendations.
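To make the steps above concrete, here is a minimal single-process sketch in Python. It is only an illustration under several assumptions of mine: the histogram is a plain `Counter` standing in for the privacy-respecting multiparty histogram (so it protects nothing), SHA-256 truncated to an agreed number of hex characters stands in for the hash, "in the histogram" is read as "shared by at least `min_members` participants", and all function and parameter names are hypothetical.

```python
import hashlib
import random
from collections import Counter


def url_hash(url, length):
    """Hash a URL and truncate it to the agreed length (in hex characters)."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:length]


def build_histogram(all_bookmarks, length):
    """Step 2 stand-in: count how many participants hold each bookmark hash.

    ASSUMPTION: a plain Counter, NOT the privacy-respecting protocol.
    """
    counts = Counter()
    for bookmarks in all_bookmarks:
        counts.update({url_hash(u, length) for u in bookmarks})
    return counts


def pick_clique(bookmarks, histogram, length, min_members=2):
    """Steps 1 and 3: smallest bookmark hash that enough participants share."""
    shared = [h for h in (url_hash(u, length) for u in bookmarks)
              if histogram[h] >= min_members]
    return min(shared, default=None)  # None: this participant would be alone


def recommend(participants, length=16, sample_size=3, seed=0):
    """Steps 4 and 5: exchange history samples within each clique."""
    rng = random.Random(seed)
    histogram = build_histogram(
        [p["bookmarks"] for p in participants.values()], length)
    cliques = {name: pick_clique(p["bookmarks"], histogram, length)
               for name, p in participants.items()}
    recs = {name: set() for name in participants}
    for sender, p in participants.items():
        if cliques[sender] is None:
            continue
        history = sorted(p["history"])
        sample = rng.sample(history, min(sample_size, len(history)))
        for receiver, q in participants.items():
            if receiver != sender and cliques[receiver] == cliques[sender]:
                # Only links the receiver has not visited count as new.
                recs[receiver].update(
                    u for u in sample if u not in q["history"])
    return cliques, recs
```

Calling `recommend` with a dict mapping each participant name to `{"bookmarks": set_of_urls, "history": set_of_urls}` returns the chosen clique per participant and the set of unseen links each one would be shown. Anyone whose bookmarks match nobody else's gets no clique and no recommendations.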

I think this algorithm does a good job of protecting privacy whilst probabilistically approximating the recommendation engines used by sites like YouTube. And I think that probabilistic nature is vital: I want to avoid filter bubbles.

But I cannot vouch for it, and would love to find a cryptographer who can indicate whether I have something or not.

jameysharp commented 6 years ago

Ooh, now I see where you're going with histograms+minhash and I think there's something there! You're using the histogram for something resembling "private set intersection" between pairs of participants, or kind of a "private set union" with a count of the number of participants in each bucket. It's possible that pairwise private set intersection gets closer to what you want, although I'd also want anonymity in the last steps, which might suggest looking at "private information retrieval" literature… I'm still thinking about this.

alcinnz commented 6 years ago

My thought was that privacy could be addressed there by using flood networking to remove anything identifying where the recommendations come from.
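As a rough illustration of the flooding idea, the sketch below simulates peers relaying a payload to all of their neighbours, with the message carrying no sender field. The `flood` function and the shape of the `neighbours` mapping are hypothetical, and this is a toy simulation rather than a networking implementation. It makes no claim about how much anonymity flooding actually provides against an observer of the network.

```python
from collections import deque


def flood(neighbours, origin, payload):
    """Relay `payload` to every peer reachable from `origin`.

    `neighbours` maps each peer to the peers it is connected to.
    The relayed message is just the payload: no origin is attached.
    """
    seen = {origin}          # peers that already hold the message
    queue = deque([origin])
    delivered = {}
    while queue:
        node = queue.popleft()
        for peer in neighbours[node]:
            if peer not in seen:
                seen.add(peer)
                delivered[peer] = payload  # the peer receives only the payload
                queue.append(peer)
    return delivered
```

For example, with peers `a`, `b` and `c` connected in a line, `flood(neighbours, "a", url)` delivers the same bare URL to both `b` and `c`, and `c` receives it from `b` with nothing in the message pointing back at `a`.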

alcinnz commented 5 years ago

Duplicate of #137 (or vice versa).