unwriter / Bitcom

Bitcoin Computer
Bitcoin Script Schema #3

Open unwriter opened 5 years ago

unwriter commented 5 years ago

Bitcoin Script Schema

A Schema for Bitcoin Script, Stored on Bitcoin.

Define, Publish, and Consume Bitcoin Script Schema, over Bitcoin.

Problem

In the past, when someone wanted to create an "overlay protocol" on top of Bitcoin script, they had little choice but to store the protocol specification the old way:

  1. In an unstructured document (As a "spec documentation" README.md)
  2. On a centralized repository (like Github)

But this is far from ideal because:

  1. Unstructured documentation can be interpreted subjectively by different parties, and there is no way to "enforce" which interpretation is the right one when the original spec was written in an ambiguous manner. It is no secret that the very whitepaper of Bitcoin has been interpreted by humans in thousands of subjective ways, just like blind men touching an elephant.
  2. Unstructured means NOT machine processable: We want these protocols to be processed by autonomous machines on Bitcoin in the future, and to do that we need a machine processable description language for defining protocols.
  3. Centralized repository means it is mutable and susceptible to human politics. All protocols must be set in stone from day 1 and if there ever is a change it should be recorded on-chain 100% transparently.

Solution

This proposal solves the above problems with a Bitcoin script schema that is:

  1. Human Friendly: Easily human readable thanks to the declarative syntax
  2. Machine Friendly: Designed to be programmatically processed by machines, with minimal ambiguity, using a portable JSON format
  3. Bitcoin Native: Based on the native push data structure of Bitcoin script itself.
  4. Transparent and Decentralized: The schema itself is published on Bitcoin.
  5. Easily queryable: It's based on Bitcoin script push data instead of another foreign embedded data structure.
  6. Works today: It's not some pseudo-science theory. It works today because all the tools are there, you can start using it right now.

How to Use

In this section we will walk through a workflow of how this schema can be used:

  1. Define Schema
  2. Publish Schema to Bitcoin
  3. Usage
  4. Advanced Usage

1. Define Schema

There are two things to note about Bitcoin scripts:

  1. Linear: Bitcoin scripts are 1-dimensional code.
  2. Various Encoding: The push data could come in various encoding, such as hex format or ASCII format, or binary.

For a schema to work, we should be able to express what each push data means while taking above factors into account.

And in this proposal, I suggest a schema scheme that's inspired by the BitDB transaction serialization format, because BitDB has already solved all these problems for its own use case. (However anyone can come up with their own schema scheme and replace with their scheme as well, because the idea itself is applicable to any type of schema).

Example - B://

Here's an example schema for the B:// protocol:

{
  "v": 1,
  "s": {
    "out.s2": "{{blob}}",
    "out.s3": "{{mediatype}}"
  }
}

Inside the s, we specify key/value pairs of Bitcoin script push data & its values. If a value is wrapped inside a {{ }} pair, that means it's a schema attribute.

In above case, the schema is saying:

  1. out.s2 is used to represent blob.
  2. out.s3 is used to represent mediatype.
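
As a rough illustration of how a machine might read this schema, here is a minimal Python sketch. The helper name `schema_attributes` and the rule that a value counts as an attribute only when it is entirely a `{{name}}` placeholder are assumptions for illustration, not part of the proposal.

```python
import re

# Assumption: a schema attribute is a value that is entirely "{{name}}".
PLACEHOLDER = re.compile(r"^\{\{(\w+)\}\}$")


def schema_attributes(schema):
    """Map each attribute name to the push data key that carries it."""
    attributes = {}
    for key, value in schema["s"].items():
        match = PLACEHOLDER.match(value)
        if match:
            attributes[match.group(1)] = key
    return attributes


# The B:// example schema from above.
b_schema = {
    "v": 1,
    "s": {"out.s2": "{{blob}}", "out.s3": "{{mediatype}}"},
}

print(schema_attributes(b_schema))
```

The result is a lookup from attribute name to push data position, which is the same reading as the two-item list above.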

2. Publish Schema to Bitcoin

So now that we've decided on the schema, how do we publish it to the blockchain itself?

You can do this using the Bitcom scheme. More specifically, you can use the $ echo command to write a file to the protocol's root folder. (Note that this syntax is based on the upcoming "Bitcom as OS" update, so the $ echo command is embedded into the 19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut application protocol instead of Bitcom being a standalone protocol.)

Here's an example of writing the schema to B:// protocol's root folder, using the Bitcom scheme:

OP_RETURN
  19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut
  $
  echo
  {"v":1,"s":{"out.s2":"{{blob}}","out.s3":"{{mediatype}}"}}
  to
  schema.json
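
The push data sequence above could be assembled programmatically along these lines. This is only a sketch: `publish_pushdata` is a hypothetical helper, and it stops at producing the list of push data items, leaving the actual transaction building and signing to whatever Bitcoin library you use.

```python
import json


def publish_pushdata(protocol_address, schema):
    """Return the push data items that follow OP_RETURN, in order."""
    # Compact separators reproduce the whitespace-free JSON shown above.
    schema_json = json.dumps(schema, separators=(",", ":"))
    return [protocol_address, "$", "echo", schema_json, "to", "schema.json"]


b_schema = {
    "v": 1,
    "s": {"out.s2": "{{blob}}", "out.s3": "{{mediatype}}"},
}

pushes = publish_pushdata("19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut", b_schema)
for item in pushes:
    print(item)
```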

3. Usage

Now that the schema has been published, you have opened up an "official" API endpoint for your protocol.

This means now others can look up the schema to understand what each push data means in your protocol transactions.

Here's a Bitquery to find the schema transaction:

{
  "v": 3,
  "q": {
    "find": {
      "out.s1": "19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut",
      "out.s2": "$",
      "out.s3": "echo",
      "out.s5": "to",
      "out.s6": "schema.json"
    }
  },
  "r": {
    "f": "[{schema: .out[0].s4}]"
  }
}
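
A consumer could build this query for any protocol address and then decode the schema from the matching transaction. The sketch below assumes the transaction comes back as a BitDB-style document with an `out` array; `schema_query`, `extract_schema`, and the sample document are hypothetical, and no network call is made.

```python
import json


def schema_query(protocol_address):
    """Build the Bitquery that finds a protocol's schema.json transaction."""
    return {
        "v": 3,
        "q": {
            "find": {
                "out.s1": protocol_address,
                "out.s2": "$",
                "out.s3": "echo",
                "out.s5": "to",
                "out.s6": "schema.json",
            }
        },
        "r": {"f": "[{schema: .out[0].s4}]"},
    }


def extract_schema(tx):
    """Mirror the `.out[0].s4` filter: the fourth push data holds the JSON."""
    return json.loads(tx["out"][0]["s4"])


# Hypothetical document shaped like a BitDB result (assumption).
sample_tx = {
    "out": [
        {
            "s1": "19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut",
            "s2": "$",
            "s3": "echo",
            "s4": '{"v":1,"s":{"out.s2":"{{blob}}","out.s3":"{{mediatype}}"}}',
            "s5": "to",
            "s6": "schema.json",
        }
    ]
}

query = schema_query("19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut")
schema = extract_schema(sample_tx)
print(schema["s"])
```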

4. Advanced Usage

The example above is a simple one because there's only a single pattern. But sometimes you may want to have a single protocol with multiple patterns. You may want this type of protocol management if you want a single Bitcoin address to control multiple "API points" for a single protocol via the Bitcom scheme.

Let's think of a hypothetical Memo.cash clone that uses this scheme. This app will have two APIs:

  1. Set username: OP_RETURN 17yyXL4raLZFU95ixkRESa2ZBPSSYxSsS5 0x01 Johndoe
  2. Post a message: OP_RETURN 17yyXL4raLZFU95ixkRESa2ZBPSSYxSsS5 0x02 Hello

Let's describe a schema. This time we have two APIs to describe instead of one, so we will put it inside an array:

{
  "v": 1,
  "s": [
    {"out.h2":"01", "out.s3":"{{username}}"},
    {"out.h2":"02", "out.s3":"{{message}}"}
  ]
}

The difference here is that h2 attributes are not variables wrapped in {{ }} pairs. They are static values, so they are used to pattern match in the following manner:

  1. if out.h2 is 01, then out.s3 matches to "{{username}}"
  2. if out.h2 is 02, then out.s3 matches to "{{message}}"
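
The matching rule can be sketched in Python as follows. The function name `match_patterns` and the representation of an output as a flat dict of BitDB-style fields (`h2`, `s3`, and so on) are assumptions for illustration.

```python
import re

# Assumption: a variable is a value that is entirely "{{name}}".
PLACEHOLDER = re.compile(r"^\{\{(\w+)\}\}$")


def match_patterns(patterns, out):
    """Return the attributes of the first pattern whose static values match."""
    for pattern in patterns:
        attributes = {}
        for key, expected in pattern.items():
            field = key.split(".", 1)[1]  # "out.h2" -> "h2"
            if field not in out:
                break
            placeholder = PLACEHOLDER.match(expected)
            if placeholder:
                # Variable: capture whatever the output holds at this position.
                attributes[placeholder.group(1)] = out[field]
            elif out[field] != expected:
                # Static value mismatch: this pattern does not apply.
                break
        else:
            return attributes
    return None


memo_patterns = [
    {"out.h2": "01", "out.s3": "{{username}}"},
    {"out.h2": "02", "out.s3": "{{message}}"},
]

print(match_patterns(memo_patterns, {"h2": "01", "s3": "Johndoe"}))
print(match_patterns(memo_patterns, {"h2": "02", "s3": "Hello"}))
```

An output that matches none of the patterns yields `None`, so a consumer can tell "not part of this protocol" apart from an empty match.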

We can publish this schema like this, using the bitcom scheme:

OP_RETURN
  17yyXL4raLZFU95ixkRESa2ZBPSSYxSsS5
  $
  echo
  {"v":1,"s":[{"out.h2":"01","out.s3":"{{username}}"},{"out.h2":"02","out.s3":"{{message}}"}]}
  to
  schema.json

Case Study: File Metadata API

Let's take a look at how this can be applied in a real world scenario. We will use a real world example protocol: B://.

Step 1. Define a schema

A. Define B:// Schema

First we define a B:// schema for uploading blobs to the blockchain.

Since we are discussing a new way of doing things, let's assume for a moment that we are using a different version of B:// spec where it ONLY has a single attribute: {{blob}}.

In this hypothetical version, B:// acts purely as a "blob upload scheme". It only has one attribute: blob.

{
  "v": 1,
  "s": {
    "out.s2": "{{blob}}"
  }
}

We publish this schema to Bitcoin with an OP_RETURN:

OP_RETURN 
  19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut
  $
  echo
  {"v":1,"s":{"out.s2":"{{blob}}"}}
  to
  schema.json

B. Define Media Header Protocol Schema

Next, we want to attach more header metadata to the blob by utilizing the Unix Pipeline concept, instead of packing all the extra attributes into the B:// protocol.

This way B:// can stay minimal, as a purely raw blob upload protocol, and developers can build extension protocols on top of the blob protocol simply by piping B:// into their header metadata protocols.

Let's define a simple metadata protocol schema as an example:

{
  "v": 1,
  "s": {
    "out.s2": "{{mediatype}}",
    "out.s3": "{{filename}}"
  }
}

We also publish this schema to Bitcoin using an OP_RETURN:

OP_RETURN
  18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW
  $
  echo
  {"v":1,"s":{"out.s2":"{{mediatype}}","out.s3":"{{filename}}"}}
  to
  schema.json

C. Usage

Now that we've defined two protocols:

  1. A blob upload protocol: 19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut
  2. A file metadata protocol: 18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW

let's pipe them.

Here's an actual usage of the two protocols being pipelined to:

  1. Upload a blob, pipe it to the next step.
  2. Attach metadata and return.

OP_RETURN
  19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut [File_Buffer_Data] |
  18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW image/png logo.png

D. Interpret

Now, blob viewer services can interpret above OP_RETURN transaction by:

  1. Reading the schema for each protocol
  2. Parsing the OP_RETURN according to the schema

To read the schema for 19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut, we can query BitDB using:

{
  "v": 3,
  "q": {
    "find": {
      "in.e.a": "19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut",
      "out.s1": "19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut",
      "out.s2": "$",
      "out.s3": "echo",
      "out.s5": "to",
      "out.s6": "schema.json"
    }
  }
}

Next, to read the schema for 18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW, we can query BitDB using:

{
  "v": 3,
  "q": {
    "find": {
      "in.e.a": "18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW",
      "out.s1": "18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW",
      "out.s2": "$",
      "out.s3": "echo",
      "out.s5": "to",
      "out.s6": "schema.json"
    }
  }
}

Above queries will return the schema files for both protocols:

{
  "v": 1,
  "s": {
    "out.s2": "{{blob}}"
  }
}

and

{
  "v": 1,
  "s": {
    "out.s2": "{{mediatype}}",
    "out.s3": "{{filename}}"
  }
}

Now we can finally go back to the original OP_RETURN and interpret what each push data means:

OP_RETURN
  19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut [...] |
  18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW image/png logo.png

And the interpretations are:

  1. According to the first schema, [...] is a {{blob}}!!
  2. According to the second schema, image/png is a {{mediatype}}, and logo.png is a {{filename}}!!
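
The whole interpretation step might look like this in Python. It is a sketch under two assumptions that the proposal does not spell out: the pipe appears as its own push data item `|`, and schema positions such as `out.s2` are counted relative to the start of each piped segment. The helper names and the `<blob>` stand-in value are hypothetical, and only `out.sN` keys are handled.

```python
import re

PLACEHOLDER = re.compile(r"^\{\{(\w+)\}\}$")


def split_pipeline(pushes):
    """Split a flat list of push data into segments separated by "|"."""
    segments, current = [], []
    for push in pushes:
        if push == "|":
            segments.append(current)
            current = []
        else:
            current.append(push)
    segments.append(current)
    return segments


def interpret(pushes, schemas):
    """Resolve every {{attribute}} using the schema of each segment's protocol."""
    result = {}
    for segment in split_pipeline(pushes):
        if not segment:
            continue
        address = segment[0]  # s1 of each segment is the protocol address
        for key, value in schemas[address]["s"].items():
            placeholder = PLACEHOLDER.match(value)
            index = int(key.split(".s", 1)[1])  # "out.s2" -> 2
            if placeholder and index <= len(segment):
                result[placeholder.group(1)] = segment[index - 1]
    return result


schemas = {
    "19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut": {"v": 1, "s": {"out.s2": "{{blob}}"}},
    "18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW": {
        "v": 1,
        "s": {"out.s2": "{{mediatype}}", "out.s3": "{{filename}}"},
    },
}

pushes = [
    "19HxigV4QyBv3tHpQVcUEQyq1pzZVdoAut", "<blob>", "|",
    "18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW", "image/png", "logo.png",
]

print(interpret(pushes, schemas))
```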

Extensibility

This schema scheme is based on how Bitcoin scripts natively work: It's based on Bitcoin push data sequencing and encoding.

This means this can be further extended in the future to describe:

  1. Non OP_RETURN output scripts
  2. An entire transaction (not just outputs)

Conclusion

The Bitcoin Script Schema is just one proposal to declaratively describe a Bitcoin script template. It makes it easy to:

  1. Filter Bitcoin scripts using a declarative and portable query language.
  2. Tie in seamlessly with how Bitcoin scripts work natively.

There can be other indexing and schema description schemes, but the best ones will probably be very closely integrated with how Bitcoin script natively works.

The current proposal is one such attempt, and one that can be used Today.

The great thing about this approach is that we can easily switch out the schema to a new scheme if the community comes up with a better scheme.

shruggr commented 5 years ago

I don't fully understand the purpose of the bitcom syntax. Wouldn't it be cleaner and easier to query to have a separate schema definition protocol so you are not querying for "echo", "to", and "schema.json"?

A protocol owner could publish the schema using OP_RETURN <schema protocol> <protocol address> <schema> and transmit the transaction signed by the protocol private key

shruggr commented 5 years ago

This way you could query any protocol schema by

{
  "v": 3,
  "q": {
    "find": {
      "in.e.a": "18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW",
      "out.s1": "<schema protocol address>",
      "out.s2": "18SuCAXiTgcq5Wj7J91JSkKhrqQ16qPQxW"
    }
  }
}