benel opened 8 years ago
In order to give a possible solution, we will use CouchDB. CouchDB makes two-way links pretty easy and respects the philosophy of Xanadu: everything is a document.
{
  "_id": "A",
  "_rev": "123",
  "data": "document_data"
}
As T. Nelson envisioned it, a link has a native document, which is its home. A link can live in any document, depending on where the user created it. With that said, a link from A to B could live in a document X. A document can "store" one or more links as follows:
{
  "_id": "X",
  "_rev": "234",
  "data": "Hey, I'm document X",
  "links": [
    {
      "from": "A",
      "to": "B"
    },
    {
      "from": "N",
      "to": "X"
    }
  ]
}
function(doc) {
  if (doc.links) {
    doc.links.forEach(function(link) {
      // Emit one row per direction so the link can be followed both ways.
      emit([link.from, 1], { _id: link.to });
      emit([link.to, 1], { _id: link.from });
    });
  }
  // Emit the document itself (its "data" field) under its own id.
  emit([doc._id, 0], doc.data);
}
The two-way link is made thanks to the two emits inside the forEach loop. The last emit emits the document itself.
In the key we use an integer to force CouchDB to order its results according to the principle of view collation.
Putting an object { _id: id } as the value of the link rows allows CouchDB to resolve these documents and return their content instead of only their id (see the CouchDB documentation on linked documents for more details).
Now, to query the result for one document and its links, the URI for document A would be the following:
/?include_docs=true&start_key=["A"]&end_key=["A", 2]
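For instance, with a hypothetical database named docuverse, a design document links and a view two_way, the request could look like this (keys URL-encoded):
# encoded form of start_key=["A"]&end_key=["A", 2]
curl 'http://localhost:5984/docuverse/_design/links/_view/two_way?include_docs=true&start_key=%5B%22A%22%5D&end_key=%5B%22A%22,2%5D'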
Nice.
Note: You can simplify [link.from, 1] with just link.from (and of course [link.to, 1] with link.to).
Then, call the view with ?include_docs=true&start_key=["A"]&end_key=["A", {}]
Moreover, you don't need to emit([doc._id, 0], doc.data), since include_docs should bring the whole document.
Hmm. Sorry, you were right. Using _id links prevents linking to the original document.
Therefore your solution was right; don't consider my previous comment.
To finish documenting this first step, here is a richer example with typed links, and a view that returns the links from and to the focused document:
{
"_id": "A",
"_rev": "2-68f2e8a0ba3324d80a49b728982e3518",
"data": "Hi document A here!",
"links": [
{
"from": "C",
"to": "A",
"type": "correction"
}
]
},
{
"_id": "B",
"_rev": "1-601eb6aa9ca5bdacc2efe436220f9d15",
"data": "[Binary File] B"
},
{
"_id": "C",
"_rev": "3-0513f13a23fc954b6909e8f8f5a6f0d1",
"data": "This is document C",
"links": [
{
"from": "A",
"to": "B",
"type": "modal_jump"
},
{
"from": "C",
"to": "B",
"type": "translation",
"language": "frFR"
}
]
}
function(doc) {
  if (doc.links) {
    doc.links.forEach(function(link) {
      // Keep the link type and record on which side ("from" or "to")
      // the focused document stands.
      emit([link.from, 1], { _id: link.to, type: link.type, source: "from" });
      emit([link.to, 1], { _id: link.from, type: link.type, source: "to" });
    });
  }
  // Emit the document itself under its own id.
  emit([doc._id, 0], doc.data);
}
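With these three sample documents, querying the view for document A with ?include_docs=true&start_key=["A"]&end_key=["A", 2] should return roughly the following rows (documents abridged):
{
  "rows": [
    {
      "id": "A",
      "key": ["A", 0],
      "value": "Hi document A here!",
      "doc": { "_id": "A", "data": "Hi document A here!", "links": [ "..." ] }
    },
    {
      "id": "A",
      "key": ["A", 1],
      "value": { "_id": "C", "type": "correction", "source": "to" },
      "doc": { "_id": "C", "data": "This is document C", "links": [ "..." ] }
    },
    {
      "id": "C",
      "key": ["A", 1],
      "value": { "_id": "B", "type": "modal_jump", "source": "from" },
      "doc": { "_id": "B", "data": "[Binary File] B" }
    }
  ]
}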
Finding any document in the docuverse could be done with consistent hashing.
CouchDB's storage model uses unique IDs to save and retrieve documents.
Thanks to this model, it would be possible to create a hash system which locates any document knowing only its ID. Moreover, this approach would make it possible to implement tumbler spans according to Xanadu's model.
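A minimal sketch of such a hash ring, assuming three hypothetical CouchDB nodes and MD5 as the hash function (both are placeholders, not a committed design):
var crypto = require('crypto');

// Map a string to a position on the ring (first 8 hex chars of an MD5 digest).
function hash(value) {
  return parseInt(crypto.createHash('md5').update(value).digest('hex').slice(0, 8), 16);
}

function Ring(servers, replicas) {
  var points = [];
  servers.forEach(function(server) {
    // Several virtual points per server smooth the distribution.
    for (var i = 0; i < (replicas || 64); i++) {
      points.push({ position: hash(server + ':' + i), server: server });
    }
  });
  points.sort(function(a, b) { return a.position - b.position; });
  this.points = points;
}

// Locate the server responsible for a document, knowing only its ID.
Ring.prototype.locate = function(docId) {
  var position = hash(docId);
  for (var i = 0; i < this.points.length; i++) {
    if (this.points[i].position >= position) {
      return this.points[i].server;
    }
  }
  return this.points[0].server; // wrap around the ring
};

var ring = new Ring(['couch1:5984', 'couch2:5984', 'couch3:5984']);
console.log(ring.locate('A')); // the node where document "A" would live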
@Slaals
When I asked for "distribution", I did not mean "automatic distribution" (aka "cluster") but "natural distribution" related to organizations users belong to (similarly to the distribution of the Web into "sites").
The aim of my question is to match "distribution" as envisioned by Nelson with the distribution model of MapReduce.
@benel I'm not sure I understand. What do you mean by "natural distribution"?
In the chapter of Nelson's document where he talks about "distribution", he points out the technical aspect of distributing documents across multiple servers (or nodes), which makes me think of server clustering with MapReduce...
It is technically similar ("view merging" is exactly what I want you to investigate); however, the aim is totally different.
Let's take an example:
Resources about courses at UTT are distributed among "elearning.utt.fr" and "etu.utt.fr". Being on one server or the other does not depend on "consistent hashing": they are not nodes of the same cluster that you could use indiscriminately. They correspond to different communities with different access rights. Students wrote in the student forum because they could not write on the official university site (at least in the main description), or because they didn't want the faculty staff to read their comments.
Well, I'm still not sure I understand, despite reading the Xanadu documents I have over and over. So I'll try an approach I find relevant to your question and your example.
@benel The two quotes below are parts of the Xanadu model that I find related to your example:
It is desirable for documents to carry information on how to show and manipulate them -- that is, general information for the front-end designer. Instructions to front ends for display and manipulation may be in the form of text explanations or programs
The fundamental operation of the Xanadu system is the request, usually a request for links (or their attached contents) fulfilling certain criteria. These criteria can become remarkably complex. The system is designed so that you can ask for certain types of links, and those pointing to and from certain places, with total flexibility. [...] Consider a typical command, the one for finding the number of links of a certain type. The command requires four endsets:
- the home-set, those spans of the docuverse in which desired links are to be found;
- the from-set, those spans of the docuverse wanted at the first side of the links;
- the to-set, those spans of the docuverse wanted at the second side of the links;
- the three-set, spans covering the types of link that are wanted in the request.
To summarize, we have four parameters (the home-set, the from-set, the to-set and the three-set) which could interest us for our MapReduce solution.
Given those four parameters, each server could then enhance its Map function (the one I pushed) to return documents with a filtered three-set. Then, a Reduce function could merge each document, strip its filtered links, and add a field saying which filter(s) were used. Thus, each server would have its own way of presenting documents thanks to the free use of the three-set filter.
The spirit of Xanadu stipulates that general information could exist in documents to tell front-end designers how documents should be presented. In our context, with a Map/Reduce solution, the so-called "general information" would be the Map/Reduce function written and stored on the server. In fact, the goal of the function is to filter links and to create a "filters" field in order to give a hint in the response; in other words, to return the general information. Moreover, this exact same Map/Reduce function is also what Nelson called a "typical command", such as FINDLINKSFROMTOTHREE.
Following your example, let's say we have a document A which is about courses and has some comment links. We have two Xanadu servers called "elearning" and "etu". The location of document A is not important; each server could have a virtual copy of it. Now, each server aims to show this same document in a different way: the first one avoids showing the comment links and forbids any possibility to comment on the document, while the second one shows everything about document A. The solution would be to have, on each server, a slightly different Map/Reduce function: the first one filters out comment links, the second one does not. Then, it would be the front-end designer's task to present the document. What Nelson called "general information" is now typically the server response saying whether there is a filter, and which ones. Given this information, the designer is able to present the document as expected.
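As a minimal sketch, assuming a hypothetical "comment" link type and putting the "filters" field directly in the map output (rather than in a Reduce) for brevity, the "elearning" server's map function could look like this, while "etu" would simply use an empty filtered list:
function(doc) {
  // Three-set filter: link types this server chooses to hide.
  var filtered = ["comment"];
  if (doc.links) {
    doc.links.forEach(function(link) {
      if (filtered.indexOf(link.type) !== -1) return; // skip filtered link types
      emit([link.from, 1], { _id: link.to, type: link.type, source: "from" });
      emit([link.to, 1], { _id: link.from, type: link.type, source: "to" });
    });
  }
  // The document itself, plus the "filters" field that carries the
  // "general information" back to the front-end designer.
  emit([doc._id, 0], { data: doc.data, filters: filtered });
}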
Specifications
A Xanadu link is a connective unit, a package of connecting or marking information. It is owned by a user. It is put in by a user, and thereafter maintained by the back end through the back end's inner indexing mechanisms.
Every link has an address in at least one document. These are its home documents, where it either originated or has been included by virtual copy. The original home document of a link is called its native document, the place it was created.
The front end has no access to the link's internal mechanisms or raw data, but only to its behavior as defined by the FEBE protocol.