WardCunningham / Smallest-Federated-Wiki

This wiki innovates by: 1. federated sharing, 2. drag refactoring and 3. data visualization.
http://wardcunningham.github.com/
GNU General Public License v2.0

Broken Federation and Discovery/Search Speed #417

Open almereyda opened 9 years ago

almereyda commented 9 years ago

Inspired by the broken federation between two already existing Wikis, as seen here and there, not even mentioning ageing wikis, I wonder how errors are to be handled in the future, i.e. with more understandable explanations and no debug output.

Then, I also have the impression that our Discovery and Retrieval process is not the quickest. I wonder which distributed indexes exist. I don't know why, but I think of Hadoop and sharded databases. Elasticsearch? @rynomad, how quickly do Telehash and NDN perform in this regard?

rynomad commented 9 years ago

Regarding HTTP RecursiveGet vs Te-NDN recursiveGet:

Let's say you are following a link from a page with many authors recorded in the journal, and the page you wish to find resides on only one of those servers. As currently implemented with HTTP, the client iterates through the domains present in the journal (stored in the 'localContext' var), moving on to the next only upon timeout of the previous request. If you are unlucky enough that the server you're looking for is at the bottom of that list, and the list is long, the link will take a significant amount of time to load.
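The worst case described above can be sketched as follows. This is a simulation rather than the client's actual code: `sequential_lookup`, `fake_fetch`, the `*.example` sites and the 5-second timeout are all hypothetical, and every miss is assumed to cost a full timeout.

```python
from typing import Callable, Optional

TIMEOUT_SECONDS = 5.0  # hypothetical per-request timeout, not the client's real value


def sequential_lookup(
    slug: str,
    local_context: list[str],
    fetch: Callable[[str, str], Optional[dict]],
) -> tuple[Optional[str], float]:
    """Try each site in order; return (site that had the page, seconds lost to misses)."""
    waited = 0.0
    for site in local_context:
        page = fetch(site, slug)
        if page is not None:
            return site, waited
        # Assumption: a miss is only noticed when the request times out.
        waited += TIMEOUT_SECONDS
    return None, waited


# Simulated federation: only the last site in the journal holds the page.
pages = {("c.example", "the-outpost"): {"title": "The Outpost"}}


def fake_fetch(site: str, slug: str) -> Optional[dict]:
    """Stand-in for an HTTP request; returns None when the site has no such page."""
    return pages.get((site, slug))


context = ["a.example", "b.example", "c.example"]
site, waited = sequential_lookup("the-outpost", context, fake_fetch)
```

Under these assumptions the wait grows linearly with the position of the right server in the list, which is the cost the NDN approach is meant to avoid.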

NDN issues one request to your server, and trusts that the server-mesh NDN network will route the request properly to wherever the data resides. Caching at each node here is a bonus, as is the fact that the network doesn't care what your localContext is; which means that if a link CAN resolve, it WILL resolve.

The tradeoff here is keeping more state on the server, and a bit more work to get the federated network in place when you initialize wiki (a large wiki with lots of peers/pages might take a while to reboot).

Telehash as a transport is generally both reliable and speedy, though there is the encryption overhead to take into account.

paul90 commented 9 years ago

Inspired by the broken federation between two already existing Wikis, as seen here and there

This is really an example of the lack of plugin discovery. @interstar (Phil Jones) has written the wikish plugin,

Wikish 0.1
Phil Jones

Wikish is a wiki markup, derived from UseMod's and adapted for my earlier personal wiki 
software : SdiDesk.

This is the beginning of my adaptation of it for Smallest Federated Wiki, so that I can 
port my existing wikis to SFW

As this plugin exists on thoughtstorms.info, but not on fed.wiki.org, you see the error TypeError: Can't find plugin for 'wikish' once for each wikish item on the page.

What is required is for the client to recognize when the origin server does not have a plugin that is required to render a page (do we also need to worry about plugin version?) and have some mechanism to reach out and find it. A question is whether this is best done by the client or the server. Having the server build a library/cache of non-core plugins used on pages it is serving would be an attractive idea, and protects against an original server becoming unavailable. Though this would only work for client-side-only plugins.

Oh, and the page load with the errors is slow. As you are using Ward's server as the origin, the requests for the missing plugin are made serially, so the speed is determined by the speed of the origin server. A plugin should only need to be loaded once, but there is currently no error handling/recognition, so the same failing request is made repeatedly. Something that needs fixing.
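One way to stop the repeated requests is to remember failures as well as successes. A minimal sketch, assuming a hypothetical `PluginLoader` wrapper around whatever function actually fetches a plugin (this is not the wiki client's real loader):

```python
from typing import Callable, Optional


class PluginLoader:
    """Remembers the outcome of each plugin fetch, including failures."""

    def __init__(self, fetch: Callable[[str], Optional[str]]):
        self._fetch = fetch
        self._cache: dict[str, Optional[str]] = {}
        self.requests_made = 0

    def load(self, name: str) -> Optional[str]:
        if name not in self._cache:
            self.requests_made += 1
            # A None result is cached too, so a missing plugin is asked for only once.
            self._cache[name] = self._fetch(name)
        return self._cache[name]


def fake_origin_fetch(name: str) -> Optional[str]:
    """Stand-in for a request to the origin server; only 'paragraph' is installed."""
    return "/* plugin source */" if name == "paragraph" else None


loader = PluginLoader(fake_origin_fetch)
story_types = ["wikish", "wikish", "paragraph", "wikish"]
results = [loader.load(item_type) for item_type in story_types]
```

With three wikish items on the page, the origin is asked for the wikish plugin once instead of three times.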

not even mentioning ageing wikis

Or just broken in a different and interesting way - a couple of 500 and 404 errors, one of those 500 errors being for the stylesheet. As long as the site is able to serve the page JSON, the way to view it is to use a different server as the origin, so something like http://wiki-paul90.rhcloud.com/theoutpost.io/the-outpost

Also worth reaching out to @hallahan so he knows there is an issue with his server.

almereyda commented 9 years ago

Hey @paul90, just as a sidenote : I always like your verbose and explanatory narratives. :tulip:

rynomad commented 9 years ago

Ah, I see I missed what was really going wrong vis a vis plugin fetching... For what it's worth, the pattern I use for federating page fetches works just as well fetching plugin JS and CSS (I fetch the JS and turn it into an Object URL and pass it to jQuery/css tag as appropriate). That said, security is a definite concern when fetching JS and injecting it into a page, but luckily NDN has signature and verification built in, so in this case we'd need to come up with some sort of cert/web-of-trust management strategy.
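To illustrate the verify-before-inject idea without an NDN stack, here is a much simpler stand-in: pinning a SHA-256 digest of a plugin's source. This is not NDN signature verification and says nothing about how trust in the pinned digest is established; `verify_plugin_source` and the sample bytes are hypothetical.

```python
import hashlib
import hmac


def verify_plugin_source(source: bytes, expected_sha256: str) -> bool:
    """Return True only if the fetched source matches the pinned digest."""
    actual = hashlib.sha256(source).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256)


trusted_source = b"/* hypothetical plugin source */"
pinned_digest = hashlib.sha256(trusted_source).hexdigest()

tampered_source = trusted_source + b" /* injected code */"
```

A client following this pattern would refuse to inject `tampered_source`, since its digest no longer matches the pinned one.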

almereyda commented 9 years ago

Then we need advice from @bblfish again.

interstar commented 9 years ago

I suppose a stop-gap to fully automatic federated plugin sharing would be to have a standard package-repository for plugins (like npm) where we could all contribute them. Then any SFW owner who found they'd pulled a paragraph from another wiki in an unknown format would have a standard place to look for the plugin.

BTW: that wikish plugin is here : https://github.com/interstar/ThoughtStorms/tree/master/plugins

almereyda commented 9 years ago

Yes, for sure. How blatantly overlooked. wik : wiki index keys. No, or maybe yes? So in fact the repository should already be distributed, I think. So a plugin repository page would then just be special places in the flat namespace that are peer-authenticated or are offered for peer authentication.

I also like the idea of a centralized, trusted registry, but as GitHub shows, it somewhat diverges from the proposed decentralization. Then, I'd love to find better solutions. Have we had a look at http://cjdns.info/ already? How does it compare to the NDN/Telehash couple for routing and authentication?

As Ryan mentioned, in fact everything should be encrypted and signed. Down to every commit to the journal, if you ask me. As the factory items remain the factors (sic) for any refactoring; be it plugins. [ < Does that sentence make any sense in English? Sometimes I have to reassure myself]. Slightly OT: Therefore a metadata provider for a wiki page is also just a factory with special content.

WardCunningham commented 9 years ago

We made the choice last summer to use npm as the preferred plugin registry. However, anyone operating a site is welcome to come by plugins any way they choose.

We ask that all plugins include documentation and that the documentation point to the source repository. This is a courtesy to readers. If one encounters a plugin in the wild they have the means to acquire it as their own.

Both the ruby and node servers will reveal their full complement of plugins in response to an api request.

http://fed.wiki.org/system/plugins.json
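A client could compare that response against the item types on a page to see which plugins it lacks. The sketch below assumes the endpoint returns a JSON array of plugin names and that a page's JSON carries a `story` list whose items have a `type` field; `missing_plugins` and the sample data are hypothetical.

```python
import json


def missing_plugins(page: dict, installed: list[str]) -> list[str]:
    """List item types used on the page that the server did not report, in page order."""
    available = set(installed)
    missing: list[str] = []
    for item in page.get("story", []):
        item_type = item.get("type")
        if item_type and item_type not in available and item_type not in missing:
            missing.append(item_type)
    return missing


# Sample data standing in for the two HTTP responses.
plugins_response = '["paragraph", "image", "factory"]'
page_response = """
{"title": "Example",
 "story": [{"type": "paragraph", "text": "Hello"},
           {"type": "wikish", "text": "* one"},
           {"type": "wikish", "text": "* two"}]}
"""

installed = json.loads(plugins_response)
page = json.loads(page_response)
needed = missing_plugins(page, installed)
```

Each missing type is reported once, however many items on the page use it.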

If a plugin author makes their plugin available via npm then we have the automation in place to retrieve that plugin with each install of wiki. A site operator has only to add the plugin's name to the package.json file.

We distribute a sample package.json. We would happily receive pull requests to add additional plugin names to this. We do feel some obligation to consider such requests with care as we don't want to become a distributor of malware. All readers here should register a watch on this repo and help vet pull requests for new plugins.

https://github.com/fedwiki/wiki-node/blob/master/package.json

WardCunningham commented 9 years ago

The page fetching code in lib/pageHandler applies the simplest logic that will correctly interpret a link in the context it is found. This code has more resources available to it that could be incorporated into optimizations. For example, the ajax requests could be performed in parallel and all results displayed as they arrive. Or the neighborhood's sitemaps could be examined to avoid fetches that will surely fail.
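Both optimizations can be sketched outside the client. This is not the code in lib/pageHandler: `sites_worth_asking`, `parallel_lookup`, `fake_fetch` and the `*.example` sites are hypothetical, and each sitemap is assumed to be reducible to a set of slugs (or `None` when it has not been fetched yet).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional


def sites_worth_asking(slug: str, sitemaps: dict[str, Optional[set[str]]]) -> list[str]:
    """Skip sites whose known sitemap lacks the slug; keep sites with no sitemap yet."""
    return [site for site, slugs in sitemaps.items() if slugs is None or slug in slugs]


def parallel_lookup(
    slug: str,
    sites: list[str],
    fetch: Callable[[str, str], Optional[dict]],
) -> dict[str, dict]:
    """Ask all sites at once and collect every page that comes back."""
    found: dict[str, dict] = {}
    if not sites:
        return found
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        futures = {pool.submit(fetch, site, slug): site for site in sites}
        for future in as_completed(futures):  # results are handled as they arrive
            page = future.result()
            if page is not None:
                found[futures[future]] = page
    return found


sitemaps = {
    "a.example": {"welcome-visitors"},  # known not to have the page
    "b.example": {"the-outpost"},       # known to have it
    "c.example": None,                  # sitemap not fetched yet
}


def fake_fetch(site: str, slug: str) -> Optional[dict]:
    """Stand-in for an ajax request; only b.example serves the page."""
    return {"title": "The Outpost"} if site == "b.example" else None


candidates = sites_worth_asking("the-outpost", sitemaps)
found = parallel_lookup("the-outpost", candidates, fake_fetch)
```

With parallel requests the total wait is roughly that of the slowest site rather than the sum of all of them, and the sitemap check removes a.example from the search entirely.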

Authors share some responsibility for slow links. When they cite a page that is important to their work and don't bother to fork that page into their own wiki they are leaving this task to their reader. When authors fork, the reader gets the page the author expects. The reader will be alerted if a newer version becomes available. Likewise a reader who has wandered far from the origin would be wise to click the flag of the page they now find interesting. This will reconfigure the reading context and will correspondingly shorten searches.

WardCunningham commented 9 years ago

This project was founded on the vision of a proliferation of servers exchanging and caching pages on our behalf. I described this in this repo's ReadMe three years ago.

https://github.com/WardCunningham/Smallest-Federated-Wiki/commit/21f4e7576fe555ae3eb2e1316655bb61236cdb0b#commitcomment-6670041

With this comment left this morning I admit that this much hasn't worked out. I have high hopes for IPv6 and even more for NDN and similar overlay networks. However, I can't see how these technologies become anything more than neighborhoods in a more comprehensive federation.

paul90 commented 9 years ago

Something feels wrong, I wonder...

Rather than have different plugins, and content types, to provide editing with different markup, wouldn't it be better to use a single content type paragraph for text content, with a common markup (let's say a sanitized sub-set of HTML), and have editor plugins that provide an author-selectable editor with their preferred markup?

This would mean that a separate sanitized HTML content type might not be needed, as we could have a raw editor for that. We would have to provide a framework to support this, and allow the selection of which editor the user prefers, but then 'Editor' plugins could be developed to support the different markups.

Also connected with #419, fedwiki/wiki-client#7, and very probably others...

WardCunningham commented 9 years ago

I'm understanding Paul's position better. Thank you. I took this issue to be more about offline servers and inconsistently configured sites. But having a zillion similar markups and plugins to render them doesn't sound attractive either. I fear if we choose a subset of html as the "native" markup we will forever disappoint those that want something we didn't choose. We had already given up on uniformity of markup by the time we published The Wiki Way. Instead the book suggested how one could add new markup that was useful to a community. Making a new markup is one of the joys, it seems.

interstar commented 9 years ago

@paul90 's suggestion seems to be based on the idea that content is static. But from the beginning @WardCunningham has had plugins that render live data (ie. would need to be executed at read-time, not merely at authoring-time) So I'm not sure how that would work. And anyway, it wouldn't resolve the problem because people will want to edit forked content too. So if you have special editors you'll still want to migrate them along with the data.

Unless a special editor is a one-shot thing, which wouldn't be particularly useful. For example, I have a (currently unreleased) "network-diagram" drawing plugin that renders a high-level representation of a bunch of connected nodes. At the moment I'm drawing the diagram with Canvas, but I've played with Raphaël.js and SVG. Or maybe I'll port to D3 at some point. Or even WebGL. Whichever, I want to keep the model in its abstract form both to keep the flexibility to change the rendering technology and so that the data can be edited.

interstar commented 9 years ago

BTW : I didn't know about the decision to use NPM for distributing plugins. Makes sense.

I'll certainly package my plugins in this form. Can anyone point me to some existing instructions for how to do this?

paul90 commented 9 years ago

No, nothing to do with content being static. I am suggesting that for 'text' content, rather than creating a different container for each authoring markup, we take a different approach: user-selectable editors which perform a round trip from a common markup to the markup that the user is editing in, and back.
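The round trip can be illustrated with a toy editor pair. This sketch assumes the common markup is a tiny HTML subset (only `<b>` and `<i>`) and the author's markup is Markdown-style emphasis; `html_to_markdown` and `markdown_to_html` are hypothetical names, and nested tags are deliberately not handled.

```python
import re


def html_to_markdown(text: str) -> str:
    """Stored form -> editing form, for the <b>/<i> subset only."""
    text = re.sub(r"<b>(.*?)</b>", r"**\1**", text)
    return re.sub(r"<i>(.*?)</i>", r"*\1*", text)


def markdown_to_html(text: str) -> str:
    """Editing form -> stored form; bold is converted first so ** is not read as two italics."""
    text = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", text)
    return re.sub(r"\*(.+?)\*", r"<i>\1</i>", text)


stored = "a <b>bold</b> and <i>soft</i> word"
editing = html_to_markdown(stored)      # what the author sees in the editor
saved = markdown_to_html(editing)       # what goes back into the page
```

The design question the thread raises is whether such a round trip can be made lossless for every markup feature; this toy version only shows it for the simplest case.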

There currently seem to be three, maybe four, different markups being talked about within the community. Imagine a page that ends up with text content marked up with a number of different markups and the effect that has on anybody wishing to contribute.

For the current notes on plugin development, see http://plugins.fed.wiki.org/view/make-a-new-plugin/view/make-plugin-script

WardCunningham commented 9 years ago

Is it possible to live without markup?

Here is what we already have in our markup.
Links -- must have.
Unicode -- must have.
Newline -- makes bulk pastes work.

Here is what I use from HTML.
Headings -- like to have, I only use h3.
Italic -- for separation that doesn't stand out.
Bold -- for separation that does stand out.
Breaks -- in rare circumstances that I don't recall.

Here is what I don't miss and wish others wouldn't use.
Bullets -- small paragraphs make better bullets and links handle nesting.
Tables -- this is almost page layout. Better to make a csv and drop it in.

Here are "figures" for which I'd love to see great support in plugins.
Images -- especially drag-and-drop of large and collected images.
Code -- with pretty printing and generally pre-formatted.
Videos -- new plugin, not with embed codes.
Maps -- with geocoded input and temporary scrolling.
Calendars -- for which there are three or four important use cases.

There are also data, computation, and visualization aspects which go beyond any sense of markup but will fit into our editing paradigm as domain specific languages.

WardCunningham commented 9 years ago

Here is a strategy forward which is not the simplest, or the easiest, but which is along the lines I've been expecting we would evolve. The initial question was about errors, missing or failing plugin errors I assume. These can be made less ugly by replacing the message with a small faint error icon that must be clicked to see what is wrong. This leaves the page readable. It might even be usefully readable for missing plugins if the text field were rendered in place without transformation. So, if you wrote in markdown, and I didn't have markdown, then I would see your markdown source. I would survive.
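The fallback described above might look like the following. It is a sketch, not the client's rendering code: `render_item`, the `RENDERERS` table and the emitted HTML are hypothetical, and items are assumed to carry `type` and `text` fields.

```python
import html
from typing import Callable


def render_paragraph(item: dict) -> str:
    return "<p>" + html.escape(item.get("text", "")) + "</p>"


# Hypothetical table of the plugins this site has installed.
RENDERERS: dict[str, Callable[[dict], str]] = {"paragraph": render_paragraph}


def render_item(item: dict) -> str:
    """Use the plugin if present; otherwise show the escaped source text untransformed."""
    item_type = str(item.get("type"))
    renderer = RENDERERS.get(item_type)
    if renderer is not None:
        return renderer(item)
    source = html.escape(item.get("text", ""))
    hint = html.escape("missing plugin: " + item_type)
    # The hint sits in a title attribute, standing in for the faint clickable icon.
    return '<div class="unrendered" title="' + hint + '"><pre>' + source + "</pre></div>"


markdown_item = {"type": "markdown", "text": "# Title with <b>"}
fallback_html = render_item(markdown_item)
```

The reader sees the author's markdown source, safely escaped, and the page stays readable even though no markdown plugin is installed.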

I further suggest that we adopt small translators for wiki-focused subsets of popular markups. This list would include wikipedia, latex, markdown and html. This is mostly for the convenience of authors importing content from other wikis. It gets their bold and italic across without tedious editing.

The Factory plugin will create items from a menu. This only works well for plugins that expect to be edited with the textEditor. The Factory now organizes this menu using categories that have always been present in the plugins. The choices are format, data, other. If we stick with that theme, here is what it might look like if we add a few more plugins.

creole     geocode      map
markdown   calendar     video
mathjax    reference    code
html       method

Creole is the one we have never discussed. Creole is like simplified Wikipedia format. Its presence would complete the federated wiki's acknowledgement of other popular markups.

We could cherry-pick the features that we choose to implement in these markups to favor their strengths and avoid their weaknesses. The arbitration of what is in and out would be conducted in the repos for those translators. They could all share the same css so that the output look would be consistent.

You notice that our plain 'paragraph' format is not on the list. That's because it is the default. It's what we expect to be used when exotic markup features aren't required. It's what we will use in captions and search for to make a synopsis. It means that a mathematician can't slip a mathjax equation into an image caption. I'll take the heat for this limitation.

Aside: Of course a math-oriented site could hack their version of the Image plugin to render the caption with mathjax. But when those pages are viewed from other servers, the latex formulas would render as source. One would have to "climb the trail" to find what the original author expected.

We should offer these recommended plugins and deprecate the use of html in paragraphs at the same time. By deprecate I mean sanitize and then somehow encourage the conversion to other formats.

WardCunningham commented 9 years ago

In issue https://github.com/fedwiki/wiki-client/issues/39 we discuss the multiple advantages of fast and incremental delivery of rendered pages from the server, especially when the content is expected to be read without javascript. It's fair to ask here, what obligation does the server have to render every markup in a readable way?

Having recently worked on some minimalist servers I would say, not much. Our current favorite, the node/express implementation, could easily build with the standard translators in the page rendering path. But I wouldn't demand that. Rather, I would say that rendering the various markups I've mentioned above in their (properly escaped) native form should be permissible. This aligns with my suggestion above that the text field of missing plugins should be exposed as the best effort to make content available.

interstar commented 9 years ago

@WardCunningham The problem with using paragraphs for bullet items is that we don't have anything else for larger scale aggregation.

Very often I'll have short lists of three or four items with a title that obviously need to be kept together as a unit, but within a larger document. If I make them n separate paragraphs then there's nothing that DOES keep them together when I'm trying to refactor either within or between pages.

Most of the time, I'll put the entire list in a single paragraph and use bullets to maintain the structure, because the pain of having to edit them through "normal" cut-and-paste is far less than the pain of trying to keep them together while dragging each item around separately.

In fact, my trend is towards making fewer and larger internally structured paragraphs precisely for this reason. Although dragging and dropping in SFW is cute, it sets a very hard unit-size. And often I find that I want to edit "across the borders" of this unit (e.g. take the end sentence from one para and the beginning two sentences from the next and inject them into a paragraph higher up the page).

Although when I originally converted my wiki to SFW, I made every paragraph an SFW paragraph, for new writing today I'm more likely to use SFW paragraphs as "sections" and edit their internal structure in the traditional way.

WardCunningham commented 9 years ago

@interstar Please describe some of the more exotic features of your personal wiki markup. I looked through your plugin and saw something about transclusion. How has that worked for you? Are you able to make it work as a plugin?

interstar commented 9 years ago

@WardCunningham

To be honest, I don't even remember why I was doing that Transcluder. I think it was to do with having some UseMods that hadn't been converted yet. But I don't think it's something I'm particularly committed to. My aim is to migrate all my public facing wikis to SFW.

I don't really have very exotic requirements from a text plugin. But I do want (nested) bullet lists. And I do want (simple) tables such as I use here : http://thoughtstorms.info/view/netocracy (Preferably using my double-comma-as-separator markup)

I use these markup idioms too heavily to want to abandon them. Particularly as there's no other way to represent different levels of nested structure in SFW.

WardCunningham commented 9 years ago

@almereyda started this issue with an offhand question:

I wonder how errors are to be handled in the future, i.e. with more understandable explanations and no debug output.

We've now addressed these issues, first with a help button, and now by a best-effort approach to rendering item text in the absence of a working plugin. I discuss this more fully in a comment associated with the new video plugin.

https://github.com/fedwiki/wiki-client/pull/48#issuecomment-48118966

almereyda commented 7 years ago

@paul90 wrote in the third comment above two-and-a-half years ago:

This is really an example of the lack of plugin discovery.

This is nowadays approachable with:

WardCunningham commented 7 years ago

The Plugins plugin renders the result from the server's /system/plugins.json endpoint that goes back to the ruby version. This turned out to be pretty useless since results were not otherwise annotated and didn't know of any that weren't already installed.

The Plugmatic plugin is aimed at managing lists of plugins that can be shared throughout the federation and installed and/or updated by npm through the web interface. This promises to be much more useful especially when new plugins are showing up regularly. I have an unsafe version of this working using the Shell plugin and am currently converting these to the new form.

Let's think about how this scales.

I imagine one or more sites devoted to curating lists of plugins that work well together. When there are many such sites we'll share rosters so that lists are easily browsed and searched together. Associated information would advise administrators of the resources required and the risks endured in supporting the members of any list.

Such a scheme has no central authority but does require a critical mass of participation to keep it working and relevant. I set up a similar scheme for cataloging Transporters but it never included any more than my own Transporters.

http://ward.asia.wiki.org/transporter-roster.html