universal-meaning-map / mind-map

Global domain mind map tool built on top of IPFS and IPLD
https://github.com/interplanetarymindmap/mind-map

Atomic data structure #4

Open xavivives opened 5 years ago

xavivives commented 5 years ago

Atomic data structure

Semantic triple

In the original specs of the mindmap, we mentioned that our priority was to figure out the best data structure.

Without knowing about them, we ended up with the semantic triple construction, where: origin = subject, type = verb (predicate), target = object.

At this point, it seems clear that those three elements are essential in order to represent meaningful connections between content in a distributed system. I'm happy to discuss alternatives though.

What is not clear, and what this issue is about, is how to structure this data.

Original and Alternative approaches

The original structure (O) was designed as such:

{
    "origin":"Mars",
    "relations": [
        {
            "target": "red",
            "type": "is"
        }
    ]
}

And we have extensively discussed this alternative (A):

{
    "origin":"Mars",
    "target": "red",
    "type": "is"
}

The difference is that O allows attaching multiple relations to the same origin, which together form its definition.

I believe this is a very powerful construction. It is further developed in "Beyond semantics. Literal definition trees".

While we can achieve the same by listing a set of A structures with the same origin, nothing in the structure guarantees that they all share that origin.

Take the following examples to represent a more complex idea around "Mars"... O:

{
    "origin":"Mars",
    "relations": [
        {
            "target": "red",
            "type": "is"
        },
        {
            "target": "planet",
            "type": "is"
        }
    ]
}

A (here the array is wrapped inside an object; a bare top-level array would also be valid JSON):

{
    "definition":[
        {
            "origin":"Mars",
            "target": "red",
            "type": "is"
        },
        {
            "origin":"Mars",
            "target": "planet",
            "type": "is"
        }
    ]
}

They both express the same thing, but in the A example someone could corrupt the definition by using different origins:

{
    "definition":[
        {
            "origin":"Mars",
            "target": "red",
            "type": "is"
        },
        {
            "origin":"Earth",
            "target": "planet",
            "type": "is"
        }
    ]
}

With the O construction this is not possible, because the origin is stated only once.
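The difference can be sketched in a few lines of Python. This is only an illustration: the helper names `expand_to_triples` and `shares_single_origin` are hypothetical and not part of the project. It shows that flattening O always yields A triples with one origin, while a hand-written A list needs an explicit check.

```python
def expand_to_triples(node):
    """Flatten an O-style node into a list of A-style triples."""
    return [
        {"origin": node["origin"], "target": rel["target"], "type": rel["type"]}
        for rel in node["relations"]
    ]


def shares_single_origin(triples):
    """Return True when every A-style triple names the same origin."""
    return len({t["origin"] for t in triples}) <= 1


# O-style node from the example above.
mars = {
    "origin": "Mars",
    "relations": [
        {"target": "red", "type": "is"},
        {"target": "planet", "type": "is"},
    ],
}

# Triples derived from O share their origin by construction.
triples = expand_to_triples(mars)

# A hand-edited A list can silently mix origins.
corrupted = [dict(triples[0]), dict(triples[1], origin="Earth")]
```

With A, a check like `shares_single_origin` has to run wherever a definition is read; with O, the structure itself rules the problem out.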

xavivives commented 5 years ago

Do we need origin?

Here I argue how a defined set of relations around the same origin becomes its definition. The hash of this set is its identifier.

Within this context, what is origin? It is only used as a representation of this definition, or as an anchor point to find commonalities with other nodes. It seems quite obvious that those are rendering responsibilities, and the structure we're defining here should not have any built-in assumptions.

So we can now turn the origin into just another relation.

Taking the original example:

{
    "origin":"Mars",
    "relations": [
        {
            "target": "red",
            "type": "is"
        }
    ]
}

becomes...

{
    "relations": [
        {
            "target": "Mars",
            "type": "name"
        },
        {
            "target": "red",
            "type": "is"
        }
    ]
}
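As a rough sketch of that transformation in Python (the function `origin_to_relation` and its `name_type` parameter are hypothetical names chosen for this illustration):

```python
def origin_to_relation(node, name_type="name"):
    """Move the origin of an O-style node into an ordinary relation."""
    relations = [{"target": node["origin"], "type": name_type}]
    # Copy the existing relations so the input node is left untouched.
    relations.extend(dict(rel) for rel in node["relations"])
    return {"relations": relations}


original = {
    "origin": "Mars",
    "relations": [{"target": "red", "type": "is"}],
}

without_origin = origin_to_relation(original)
```

The result has no privileged field: "name" is a relation type like any other, and a renderer is free to pick it (or something else) as the label.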
xavivives commented 5 years ago

Isn't it then just a plain object with properties?

It probably makes sense, then, to remove the relations array altogether. We assume type is the key and target is its value.

{
    "name":"Mars",
    "is":"red"
}

This is stupidly simple and kind of obvious now... Just an object with properties...

If we have multiple targets with the same type (which should be a possible construction), we would go against the IPLD requirements:

IPLD paths MUST be unambiguous. A given path string MUST always deterministically traverse to the same object. (e.g. avoid duplicating link names)

We could easily work around it by putting all the targets that have the same type inside an array.

{
    "name":"Mars",
    "is":["red","a planet"]
}
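A minimal sketch of that collapse, assuming targets are plain strings as in the examples (the function name `relations_to_object` is hypothetical):

```python
def relations_to_object(relations):
    """Collapse relations into a plain object keyed by relation type.

    Targets that share a type are grouped into an array so that every
    key appears once and a path through the object stays unambiguous.
    """
    obj = {}
    for rel in relations:
        key, value = rel["type"], rel["target"]
        if key not in obj:
            obj[key] = value
        elif isinstance(obj[key], list):
            obj[key].append(value)
        else:
            # Second target for this type: promote the value to an array.
            obj[key] = [obj[key], value]
    return obj


mars = relations_to_object([
    {"target": "Mars", "type": "name"},
    {"target": "red", "type": "is"},
    {"target": "a planet", "type": "is"},
])
```

One design question this leaves open is whether a single target should also be stored as a one-element array, so that readers never have to handle both shapes.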
xavivives commented 5 years ago

Dealing with Merkle-paths

To simplify, in the previous examples I've been using plain strings where there should be Merkle-paths instead...

{
    "name":"Mars",
    "is":"red"
}

should be:

{
    "ipfs/QmbcyXjxFcPkXcdrVi8YxhrSz6fXHw1VCcuxiF57V3fqeP":"ipfs/QmZijpFzuUFF4LwBr9PxsSTdVvfF6E6Fueiz5wLTA6MTrM",
    "ipfs/QmSYHfhVHxKLDu6QjF6QgY27AFYbapJaTeCYKpvQCs7DJb":"ipfs/QmRFQZXghkboZDQEroHAYBbRmK8YcDaKv1Hnqt89kCdQCF"
}

To my current understanding, while this does not conflict with the IPLD specs themselves, it makes the value paths not traversable. Therefore they should be encapsulated inside the "/" link object:

{
    "ipfs/QmbcyXjxFcPkXcdrVi8YxhrSz6fXHw1VCcuxiF57V3fqeP":{"/":"ipfs/QmZijpFzuUFF4LwBr9PxsSTdVvfF6E6Fueiz5wLTA6MTrM"},
    "ipfs/QmSYHfhVHxKLDu6QjF6QgY27AFYbapJaTeCYKpvQCs7DJb":{"/":"ipfs/QmRFQZXghkboZDQEroHAYBbRmK8YcDaKv1Hnqt89kCdQCF"}
}
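The wrapping step could look roughly like this in Python. It is a sketch that follows the link shape used in this thread, `{"/": "<path>"}`; the helper name `wrap_links` is hypothetical, and it also handles the array case from the previous comment.

```python
def wrap_links(obj):
    """Wrap every value (or every array element) in a {"/": path} link object."""
    def link(path):
        return {"/": path}

    return {
        key: [link(v) for v in value] if isinstance(value, list) else link(value)
        for key, value in obj.items()
    }


plain = {
    "ipfs/QmbcyXjxFcPkXcdrVi8YxhrSz6fXHw1VCcuxiF57V3fqeP":
        "ipfs/QmZijpFzuUFF4LwBr9PxsSTdVvfF6E6Fueiz5wLTA6MTrM",
    "ipfs/QmSYHfhVHxKLDu6QjF6QgY27AFYbapJaTeCYKpvQCs7DJb":
        "ipfs/QmRFQZXghkboZDQEroHAYBbRmK8YcDaKv1Hnqt89kCdQCF",
}

linked = wrap_links(plain)
```

Keys are left as plain strings here, matching the observation that the link object can only be applied to values.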

This, of course, can't be applied to the keys and it probably doesn't make sense anyway.

It may create some trouble though, because the keys themselves contain slashes. Assuming zdpuAvaondtCC6pT3MoZja1MX75cF43ge7puSu5UA8AmmEJyX as the IPLD object CID, a Merkle-path pointing to "Mars" would be:

ipfs/zdpuAvaondtCC6pT3MoZja1MX75cF43ge7puSu5UA8AmmEJyX/ipfs/QmbcyXjxFcPkXcdrVi8YxhrSz6fXHw1VCcuxiF57V3fqeP

My naive guess is that in this case the slashes in the key would have to be replaced by another character, or we may have to do some sort of URL encoding:

ipfs/zdpuAvaondtCC6pT3MoZja1MX75cF43ge7puSu5UA8AmmEJyX/ipfs%2FQmbcyXjxFcPkXcdrVi8YxhrSz6fXHw1VCcuxiF57V3fqeP

It does seem relatively doable to solve.
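For illustration, the URL-encoding idea can be tried with Python's standard library. This is only a sketch of the naive guess above, not something the IPLD specs are known to prescribe; `encode_segment` and `merkle_path` are hypothetical helper names.

```python
from urllib.parse import quote, unquote


def encode_segment(key):
    """Percent-encode a key so it fits in one path segment (slashes included)."""
    return quote(key, safe="")


def merkle_path(root_cid, key):
    """Build a path from the root object CID to one of its keys."""
    return f"ipfs/{root_cid}/{encode_segment(key)}"


root = "zdpuAvaondtCC6pT3MoZja1MX75cF43ge7puSu5UA8AmmEJyX"
name_key = "ipfs/QmbcyXjxFcPkXcdrVi8YxhrSz6fXHw1VCcuxiF57V3fqeP"

path = merkle_path(root, name_key)

# Decoding the last segment recovers the original key.
last_segment = path.split("/")[-1]
decoded_key = unquote(last_segment)
```

Passing `safe=""` matters: by default `quote` leaves "/" unescaped, which would reproduce the ambiguous path.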