OPSN / MVP-discuss

A place to design a proof of concept implementation of Overlaid Personal Semantic Networks.
Apache License 2.0

People need a reason to want to use our initial implementation #4

Open oresmus opened 7 years ago

oresmus commented 7 years ago

... or they won't.

Possibilities include:

jmichelz commented 7 years ago

I'd like to try the Simplicity-Oriented Design process described here (and quoted below): http://zguide.zeromq.org/cs:chapter6#Simplicity-Oriented-Design Can we come up with a list of problems people have in this area, and then pick the simplest, most dramatic one? As an example: I've read a million things online, but I don't have a good record of them that shows which were the most important.

"We collect a set of interesting problems (by looking at how people use technology or other products) and we line these up from simple to complex, looking for and identifying patterns of use.

We take the simplest, most dramatic problem and we solve this with a minimal plausible solution, or "patch". Each patch solves exactly a genuine and agreed-upon problem in a brutally minimal fashion.

We apply one measure of quality to patches, namely "Can this be done any simpler while still solving the stated problem?" We can measure complexity in terms of concepts and models that the user has to learn or guess in order to use the patch. The fewer, the better. A perfect patch solves a problem with zero learning required by the user.

Our product development consists of a patch that solves the problem "we need a proof of concept" and then evolves in an unbroken line to a mature series of products, through hundreds or thousands of patches piled on top of each other."

jmichelz commented 7 years ago

Maybe something that sits and watches what you read online, then guesses which things are most important (most surprising to you?) and records that as well. Maybe by time on a page relative to its length (on the theory that it takes you longer to process something important). Maybe by semantic analysis of the pages: what they have in common, what you're trying to figure out. It could take note when you share things too, but there are many things I don't remember to share.
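A minimal sketch of the "time on a page relative to its length" heuristic, under stated assumptions: the `PageVisit` record, the `dwell_ratio` name, and the 200 words-per-minute reading speed are all hypothetical choices made for illustration, not part of any existing OPSN code.

```python
from dataclasses import dataclass

# Assumed typical reading speed; a real tool would calibrate this per user.
WORDS_PER_MINUTE = 200


@dataclass
class PageVisit:
    url: str
    word_count: int         # length of the page's main text, in words
    seconds_on_page: float  # how long the page was actually open and active


def dwell_ratio(visit: PageVisit) -> float:
    """Time actually spent divided by the time expected for the page's length.

    A ratio well above 1.0 suggests the reader lingered longer than the
    length alone would explain.
    """
    if visit.word_count <= 0:
        return 0.0
    expected_seconds = visit.word_count / WORDS_PER_MINUTE * 60
    return visit.seconds_on_page / expected_seconds


def rank_by_dwell(visits):
    """Return visits ordered from most to least lingered-over."""
    return sorted(visits, key=dwell_ratio, reverse=True)
```

This only captures the first of the signals mentioned; idle tabs and skimmed pages would distort it, so it is a starting guess that the user's corrections could later refine.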

When you go look over the OPSN it generated, you could tell it where it made mistakes and it could start to learn how to do a better job. Eventually when it's got an accurate picture of your interests, it could start showing things to you it predicts you would find important. Once the OPSN has a good amount of interesting content in it, you could share those things with other people in one shot instead of piecemeal.

oresmus commented 7 years ago

I'm responding just to the app idea in the prior comment. First let me paraphrase it to make sure I get the basic idea:

I think that's a great idea for an app, but very complex compared to a "minimal useful app that demonstrates OPSN".

To be specific, I know there are plenty of times when I explicitly indicate my interest in something (by emailing it, posting about it, making a bookmark, etc). Some of them are fast (make a bookmark). Whereas code to watch me and guess my interests is very complex. So I suspect that for a first useful app, it would be ok for the user to have to make explicit bookmarks or posts.

(And even if I'm wrong, the suggested app would need to be factored into something that comes up with the items to publish, and something that publishes them. I'm suggesting starting with a much simpler version of the "coming up with items" part. If it's too simple we can always improve it, and meanwhile it will have let us test the publishing part.)

oresmus commented 7 years ago

Here is another idea that is also more complex (so probably not the first thing to try), but this reminded me to mention it. It might be called "scan a blog and its comments": the user could point the software at an interesting website or set of them; the software could scan everything from those sites (and maybe things they link to, to some small depth) and make all this into a graph of related stuff; and the user could later find interesting portions of that graph (it would directly help them browse interesting things) and publish them. (For example, I could point it to a few WordPress blogs, and it would know how to scan their posts and comments, and use phrase commonality (and explicit tags by the blog author) to guess an initial graph. Or I could point it to my own private notes files.)
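The "scan those sites, and things they link to, to some small depth" step could be sketched as a breadth-first walk with a depth limit. This is only an illustration: `scan_to_depth` and `get_links` are hypothetical names, and `get_links` stands in for whatever code would actually fetch a page and extract its links.

```python
from collections import deque


def scan_to_depth(seeds, get_links, max_depth=2):
    """Breadth-first scan outward from the seed pages.

    get_links(page) must return the pages that `page` links to.
    Returns (depth, edges): depth maps each reached page to its distance
    from the nearest seed, and edges lists the (source, target) links seen.
    """
    depth = {seed: 0 for seed in seeds}
    edges = []
    queue = deque(depth)  # iterating the dict de-duplicates repeated seeds
    while queue:
        page = queue.popleft()
        if depth[page] >= max_depth:
            continue  # record the page itself, but do not follow its links
        for target in get_links(page):
            edges.append((page, target))
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth, edges
```

Keeping the link extraction behind `get_links` means the same walk would work for blogs, for private notes files, or for an in-memory test fixture.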

My sense is, this is much harder than "minimal", but still easier than "watch me and infer what I'm interested in". But even if it's equally hard, it's an interesting use case to keep in mind for the future.

oresmus commented 7 years ago

My reasoning on the idea I just mentioned is that an OPSN graph is only really interesting to browse once it has a bunch of stuff in it, but for initial users it won't be big enough to be interesting in that way, unless there is some way to generate an OPSN graph from some large amount of preexisting data they might already be interested in.

I often find new blogs and want to read some of their posts and lots of the comments on those posts -- having the posts and comments of the entire blog pre-organized in a way that approximated a good topic-categorized OPSN graph (even if half the links were bad guesses) would make my initial browsing of it more efficient, as well as allowing me to make my own annotations and categorization improvements and publish those (assuming the blog license terms allowed this) as a reasonably large chunk of OPSN data, of somewhat better quality than the raw data made by the tool first scanning the blog.

oresmus commented 7 years ago

As for your original problem, which I paraphrase as "lack of a good record of what you've already read and found interesting", here are some thoughts on that:

oresmus commented 7 years ago

All these ideas suggest ways of coming up with graphs, UIs for letting us navigate and improve those, and publishing small portions of them. They all depend on something basic for working with that kind of data, showing it to us, letting us publish some of it.

jmichelz commented 7 years ago

Yeah, I have a lot of browsing history in Chrome. It gets saved in a SQLite database on your computer: http://superuser.com/questions/602252/can-chrome-browser-history-be-exported-to-an-html-file It's also on your phone: http://android.stackexchange.com/questions/110053/export-chrome-history-in-android
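A rough sketch of reading that database with Python's standard `sqlite3` module. It assumes the commonly observed layout of Chrome's `History` file: a `urls` table with `url`, `title`, `visit_count`, and `last_visit_time` columns, the last one counting microseconds since 1601-01-01 UTC. That layout is not a stable public interface and may differ between Chrome versions, and the location of the file varies by platform. `read_history` is a hypothetical helper name. Work on a copy of the file, since Chrome keeps the original locked while it is running.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumption: Chrome timestamps count microseconds from this epoch.
CHROME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def chrome_time_to_datetime(microseconds):
    """Convert a Chrome history timestamp to an aware UTC datetime."""
    return CHROME_EPOCH + timedelta(microseconds=microseconds)


def read_history(db_path, limit=100):
    """Return the most-visited URLs from a copy of Chrome's History file."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT url, title, visit_count, last_visit_time "
            "FROM urls ORDER BY visit_count DESC LIMIT ?",
            (limit,),
        ).fetchall()
    finally:
        conn.close()
    return [
        {
            "url": url,
            "title": title,
            "visits": visits,
            "last_visit": chrome_time_to_datetime(timestamp),
        }
        for url, title, visits, timestamp in rows
    ]
```

The resulting list of URLs would be the input to whatever downloads and classifies the pages.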

It's also supposed to be downloadable from this site: https://takeout.google.com/settings/takeout but when I tried it, the browser history was empty.

So, if I wrote a program to download all these pages, what would I do with them? I'm thinking of running them through a semantic classification and anomaly detection system like this one: http://www.cortical.io/demos/semantic-anomaly-detection/

jmichelz commented 7 years ago

At first you can tell the learning system (deep learning, HTM, etc.) explicitly which pages are important. Eventually it should start to figure out why, and be able to predict importance for new pages you haven't seen yet. See https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software and https://github.com/numenta/nupic

oresmus commented 7 years ago

"if I wrote a program to download all these pages"

or if you used one of the existing "web crawler" or "spider" frameworks (I recently went looking and found a few in Python, for example, as well as some robots.txt parsers)
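As one concrete example of the robots.txt side, Python's standard library includes `urllib.robotparser`. The sketch below wraps it in a hypothetical helper, `make_robots_checker`; the `opsn-scanner` user-agent string is likewise made up for illustration. It parses robots.txt text that has already been downloaded, so the example itself needs no network access.

```python
from urllib.robotparser import RobotFileParser


def make_robots_checker(robots_txt_text, user_agent="opsn-scanner"):
    """Build a function that says whether a URL may be fetched.

    robots_txt_text is the already-downloaded content of a site's
    robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt_text.splitlines())
    return lambda url: parser.can_fetch(user_agent, url)
```

A downloader would call the returned function before requesting each page from that site.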

"what would I do with them?"

Whatever you did, it would eventually need to end up with some kind of graph, with edges between pages (or pieces of them, like blog comments or single paragraphs) and topic phrases.

There is a lot of open-source software available to look for things like that. A first cut based on just using 2-3 adjacent words would probably be "more useful than nothing". (The open-source software can be smarter about which short sets of words to look for, and what counts as a match.)
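A minimal sketch of that "2-3 adjacent words" first cut, with the caveat that the function names and the `min_pages` threshold are invented for illustration. It links each phrase to the set of pages containing it and keeps only phrases shared by at least two pages, which gives the page-to-topic-phrase edges described above.

```python
import re
from collections import defaultdict


def phrases(text, sizes=(2, 3)):
    """Return the set of 2-word and 3-word runs of adjacent words in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {
        " ".join(words[i:i + n])
        for n in sizes
        for i in range(len(words) - n + 1)
    }


def build_phrase_graph(pages, min_pages=2):
    """Map each shared phrase to the set of page ids that contain it.

    pages maps a page id (a URL, a comment id, a paragraph id) to its text.
    Phrases found on fewer than min_pages pages are dropped, since they
    connect nothing.
    """
    phrase_to_pages = defaultdict(set)
    for page_id, text in pages.items():
        for phrase in phrases(text):
            phrase_to_pages[phrase].add(page_id)
    return {
        phrase: ids
        for phrase, ids in phrase_to_pages.items()
        if len(ids) >= min_pages
    }
```

There is no stop-word filtering here, so on real text common runs like "of the" would dominate; that is the kind of thing the smarter open-source tools handle.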

And then you'd need some way to view those graphs and try to make sense of them. And at that point, probably you'd need many cycles of trial and error to get better data from the same pages into the graph.

And once the data was decent, you'd want a graph-browsing viewer so you could move around between pages or paragraphs, topics in them, and other pages related to those. (This might be useful even if you couldn't use that viewer to edit any connections in the graph.)
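The core lookup behind such a read-only viewer might look like the sketch below, which assumes the graph is stored as a plain mapping from topic phrase to the set of pages mentioning it. That storage format and the `related_pages` name are assumptions made for this example only.

```python
from collections import Counter


def related_pages(topic_to_pages, page):
    """List the pages that share topics with `page`, most-shared first.

    topic_to_pages maps a topic phrase to the set of page ids containing it.
    Returns (other_page, shared_topic_count) pairs.
    """
    counts = Counter()
    for pages in topic_to_pages.values():
        if page in pages:
            for other in pages:
                if other != page:
                    counts[other] += 1
    return counts.most_common()
```

A viewer would call this each time the user moves to a new page or paragraph, then show the results as links to follow.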

None of this is yet "OPSN" -- it is the "PSN" part. (Personal semantic network (or a small part of one), but not "overlaid".) But it would be great data to put there. If this is what interests you most, it is a possible way to get started, even though the low-level publishing/sharing parts are also needed, could be done first, and could be useful without this.

BTW if we want to discuss this more (this idea about scanning existing data to make large graphs) we should put it into a new issue (perhaps actually deleting some of these comments and moving them into the new issue), since this issue is more general (a list of ways an app might be useful, not just a single such way).

oresmus commented 7 years ago

(Here is something you (John M) might know but some other readers might not: the first OPSN-like software I tried to design, prompted by trying, but not much liking, Google Wave circa 2009, was inspired most directly by exactly the problem that just arose in the last few comments above: the tendency for comment threads to wander into new topics which really ought to be classified separately, as different topics related to the main thread. I want an editing system that makes it easy (for me now, or any reader later) to select some of the comments above and reclassify them like that, so they don't clutter up this original, more general issue, but nothing about them is lost. That is still one of the early kinds of UIs I want to support, once an underlying OPSN data layer is working.)

jmichelz commented 7 years ago

I'm not trying to classify things by difficulty just yet, but by desirability. We were talking on the phone about the idea of building up an initial graph by picking a number of famous individuals (maybe in one niche) and making an OPSN of their public writings. This would generalize the "scan a blog and its comments" idea to more than one seed URL (a person's writings could be on any number of host URLs, so we would need a way to find them, perhaps using a Google search) and to more than one person. Then if there were existing discussion groups about those people/topics, we could present the resulting OPSN to them to get feedback on whether it was useful.

oresmus commented 7 years ago

Yes, I agree with all that, and especially with the value of that feedback. (And of course we could do it for communities (or "niches") one of us is already part of, so we could evaluate the results using our own understanding of those communities.)

Also, that graph could be made (and feedback gotten) entirely "offline", without making a server or web app -- just doing the scanning and graph-making and some kind of graph analysis and viewing. So that is one "parallel development thread" that could be worked on.

(It's possible other people have already tried things like that. If we find any examples we should record them here.)

oresmus commented 7 years ago

We should probably replace this thread with a wiki page pointing to a list of app ideas. It's getting too confusing here.

But first I want to add one more idea for an early application use-case to target: a replacement (partial semi-clone) for Google+ (but decentralized). I know of a bunch of people already looking for that, who would eagerly move (or at least copy) their existing and new G+ writings into that, if it worked decently and the move could be automated. (And Google already permits authors to copy their own post data out. I'm not sure about other people's comments on those posts, but maybe.)

oresmus commented 7 years ago

We should add the existing use cases above, and any new ones, to the wiki page: https://github.com/OPSN/MVP-discuss/wiki/Use-cases