iPlumb3r / KeQuarks

About Modeling Paradigm(s), Concept(s) & (Meta-)Model(s) for defining Model(s).

How to identify Concepts (in the context of the distributed web) ? #2

Open iPlumb3r opened 4 years ago

iPlumb3r commented 4 years ago

Content-based addressing (based on cryptographic hashes) is a very elegant and powerful mechanism for identifying digital « things » on the web (e.g. text, photos, music, video, software, …). Note: in fact, anything that can be represented by 0s and 1s! More information about this approach here: https://infocentral.org/drafts/PrinciplesDraft.html
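To make the idea concrete, here is a minimal sketch of content-based addressing. It assumes SHA-256 as the hash function and a plain hex digest as the identifier; the function name `content_address` is only illustrative and is not taken from InfoCentral.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return the hex SHA-256 digest of the bytes, used here as their identifier."""
    return hashlib.sha256(data).hexdigest()


# Two copies of the same bytes always get the same identifier,
# wherever they are stored and whoever publishes them.
copy_1 = b"the same digital thing"
copy_2 = b"the same digital thing"
print(content_address(copy_1) == content_address(copy_2))
```

The identifier depends only on the bytes themselves, which is exactly why it cannot be applied directly to something that has no byte representation.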

But how do we identify a Concept? We cannot use the content-based addressing method, simply because a Concept is not "made of" 0s and 1s!

So, which method could we use instead ?

At this stage, I can imagine 3 methods:

  1. A pure random identifier
  2. An ID calculated from its several digital representations
  3. An IEML « word » and/or « sentence »

(All of these methods are alternatives to the current method based on URLs/URIs.)

A pure random identifier

About: Simple & efficient
Advantage(s):
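A minimal sketch of this first method, assuming a 256-bit random value rendered as hex (the size and the name `new_concept_id` are assumptions chosen for illustration):

```python
import secrets


def new_concept_id(n_bytes: int = 32) -> str:
    """Return a random identifier that carries no meaning of its own."""
    return secrets.token_hex(n_bytes)


concept_id = new_concept_id()
print(len(concept_id))  # two hex characters per random byte
```

Such an ID says nothing about the Concept itself; everyone who wants to refer to the same Concept has to learn the same ID from somewhere.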

An ID calculated from its several digital representations

About: This approach mimics the content-based approach, though it seems to take the problem in reverse. Why not explore it anyway?
Advantage(s):
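One possible reading of this second method, purely as an assumption: hash each digital representation, sort the digests so the order does not matter, then hash the concatenation. The name `representation_set_id` is hypothetical.

```python
import hashlib


def representation_set_id(representations: list[bytes]) -> str:
    """Derive one ID from a set of representations, independent of their order."""
    # Hash each representation, dropping exact duplicates via the set.
    digests = sorted({hashlib.sha256(r).digest() for r in representations})
    # Hash the sorted, fixed-length digests to get a single identifier.
    return hashlib.sha256(b"".join(digests)).hexdigest()


text_form = "a tree".encode("utf-8")
drawing_form = b"<svg>...</svg>"
print(representation_set_id([text_form, drawing_form])
      == representation_set_id([drawing_form, text_form]))
```

Under this construction the ID changes as soon as a representation is added or removed, which is one way of seeing why the approach takes the problem in reverse.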

An IEML « word » and/or « sentence »

About: IEML (https://twitter.com/IEML_) is a univocal language in which there is a one-to-one relation between the semantic and the syntactic structures (phoneme, morpheme, word, ...): https://www.topincs.com/EntangledBootstrap/2006
Advantage(s): ...

ChrisGebhardt commented 4 years ago

This is a very tricky design concern, one that I will be detailing at length in the upcoming but oft-delayed InfoCentral Design Proposal draft. :)

The more generalized need is the ability to anchor "root" nodes in the graph, whether they represent a concrete real-world thing / data or an abstract concept. My new solution is a combination approach of hash-based IDs (HIDs) and unique value IDs (UVIDs).

Concrete roots may contain various identifying information (initial properties, etc.) and are typically cryptographically signed by a trusted author. Such roots are then referenced using standard HIDs. They are more appropriate for anchoring unique real-world objects, creative works, etc.

Abstract roots are unowned, do not contain information, and only need to be unique. This is, of course, where concepts belong. Abstract roots don't need to exist as real data entities. They are self-existent in their uniqueness relative to an unowned namespace. UVID is my term for the various approaches mentioned above: random nonces, unique strings (hashtags, folksonomy), artificial language words/phrases, etc. In order to complete the design, we only need canonical mappings from UVID schemes to HIDs. This allows them to be integrated into content-based networks, used in place* of HIDs for references in data entities, used for networked reference collection, etc. Interestingly, this also effectively serves as a default indexing scheme. Natural language words and phrases found in ordinary text can be treated as implicit UVIDs and deterministically mapped to graph nodes.
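As a rough illustration of what a canonical UVID-to-HID mapping could look like (this is not the InfoCentral design; the scheme names, the JSON encoding, and the function names `uvid_to_hid` and `implicit_uvid` are all assumptions made for the sketch):

```python
import hashlib
import json
import unicodedata


def uvid_to_hid(scheme: str, value: str) -> str:
    """Deterministically map a (scheme, value) UVID to a hash-based ID."""
    canonical = unicodedata.normalize("NFC", value)
    # A JSON array keeps scheme and value unambiguously separated.
    payload = json.dumps([scheme, canonical], ensure_ascii=False,
                         separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def implicit_uvid(word: str) -> str:
    """Treat a natural-language word as an implicit UVID (hypothetical 'text' scheme)."""
    normalized = unicodedata.normalize("NFKC", word.strip()).casefold()
    return uvid_to_hid("text", normalized)


# The same word found in two different texts maps to the same graph node.
print(implicit_uvid("Concept") == implicit_uvid(" concept "))
```

Because the mapping is a pure function, nobody has to create or own the abstract root: anyone who applies the same scheme to the same value arrives at the same HID.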

Abstract roots gain meaning by becoming woven into contexts by reference. They are externally described, mapped, and utilized. They can be specialized and generalized by mapping against other roots. Ontologies evolve to make formalized use of them, possibly indirectly through perspectives (which are themselves immutable roots!). And networking drives the most popular meaning(s) and metadata for any given root / concept. Reference collection removes any need/desire for mutable references or owned namespaces, the faulty approach being explored by most other projects.

Sidenotes: