ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

Expose read-only API via Gateway #1322

Closed harlantwood closed 9 years ago

harlantwood commented 9 years ago

Context: We are creating DAG visualizations; (lots of) background info in ipfs/webui#56.

@jbenet said in that thread:

i want to expose the Read part of the API's object/get calls on the default gateway. that way js can grab raw objects :)

Perfect, me too. :smile:

Right now when we hit a given "directory" hash in the gateway

http://gateway.ipfs.io/ipfs/QmZAL3oHMQYqsV61tGvoAVtQLs1WzRe1zkkamv9qxqnDuK

we get an HTML directory listing, which includes:

<li><a href="/ipfs/QmZAL3oHMQYqsV61tGvoAVtQLs1WzRe1zkkamv9qxqnDuK/readme.md">readme.md</a> - 3891 bytes</li>

On the command line

ipfs object get QmZAL3oHMQYqsV61tGvoAVtQLs1WzRe1zkkamv9qxqnDuK

we get back JSON

{
  "Links": [
    {
      "Name": "readme.md",
      "Hash": "QmaCNTJuDUYueDDeXFwPW8PGQC93yCeFxqhXfVUHM8pPDi",
      "Size": 3891
    }
  ],
  "Data": "\u0008\u0001"
}

So what we want is a way to get the JSON from the gateway, in addition to the standard HTML.

The options I can imagine:

  1. Accept JSON via headers: We could add functionality to the IPFS gateway to serve directories either as HTML or as JSON, depending on the Accept content type request header (something like Accept: application/json; charset=utf-8). That is, keep serving the HTML index page by default, unless the HTTP request asks for JSON.
  2. .json "file extension": e.g. http://gateway.ipfs.io/ipfs/QmZAL3oHMQYqsV61tGvoAVtQLs1WzRe1zkkamv9qxqnDuK.json
  3. Path segments: Note that any path elements AFTER the hash may conflict with existing path resolution (subdirs/files), so the new path element would probably have to be BEFORE the hash. Perhaps http://gateway.ipfs.io/object/get/QmZAL3oHMQYqsV61tGvoAVtQLs1WzRe1zkkamv9qxqnDuK?
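For option 1, the server-side decision could look roughly like the sketch below. It is only an illustration of content negotiation on the Accept header, not how the gateway actually behaves; the function name `wants_json` is hypothetical, and the sketch assumes HTML stays the default unless JSON is explicitly ranked higher.

```python
def wants_json(accept_header: str) -> bool:
    """Return True if the Accept header ranks application/json above text/html.

    Hypothetical helper: wildcards such as */* are ignored on purpose so that
    browsers keep getting the HTML directory listing by default.
    """
    best = {"application/json": 0.0, "text/html": 0.0}
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # a media type without an explicit q parameter has weight 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media_type in best:
            best[media_type] = max(best[media_type], q)
    return best["application/json"] > best["text/html"]
```

With this, a request carrying `Accept: application/json; charset=utf-8` would get JSON, while a typical browser request (or one with no Accept header at all) would still get the HTML index.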

@jbenet you seemed to be thinking something other than the Accept header (option 1 above), which is my current top choice. What method do you prefer?

jbenet commented 9 years ago

I think exposing a subset of the api would work, at the same routes as the api currently works:

# api on privileged "api" server works today
http://localhost:5001/api/v0/object/get?arg=/ipfs/QmZAL3oHMQYqsV61tGvoAVtQLs1WzRe1zkkamv9qxqnDuK

# read only api should work on "read only" gateway tomorrow
http://localhost:8080/api/v0/object/get?arg=/ipfs/QmZAL3oHMQYqsV61tGvoAVtQLs1WzRe1zkkamv9qxqnDuK

(i dont like the arg= construction in this particular route too much but that's an artifact of the great generality of the cmds lib)

harlantwood commented 9 years ago

Makes sense.

harlantwood commented 9 years ago

I was going this route, and was going to load the tree one level at a time, but then I noticed that there are ways to get the entire DAG recursively for a given node, eg:

http://localhost:5001/api/v0/refs?arg=Qmcav25eTinMV632w9zdyXsFENDz5FCWjrMEVU7Nzy2v98&recursive&format=%3Csrc%3E%20%3Cdst%3E%20%3Clinkname%3E

Which returns an almost-JSON format which nonetheless has everything needed to construct the whole DAG:

{
  "Ref": "Qmcav25eTinMV632w9zdyXsFENDz5FCWjrMEVU7Nzy2v98 QmZs8mitpfSZM8TaFas9WaDVF77aQvb47UEPR1g1quoQq9 app.js\n",
  "Err": ""
}{
  "Ref": "Qmcav25eTinMV632w9zdyXsFENDz5FCWjrMEVU7Nzy2v98 QmSXq83RU9YFnxGS7N29gBqjMXTg3qHERzrfFZxKYCGknM lib\n",
  "Err": ""
}{
  "Ref": "QmSXq83RU9YFnxGS7N29gBqjMXTg3qHERzrfFZxKYCGknM Qmei6UeQ3LKeKUfzKLx8SRsmxVpvvWrLmZTkKapCoQnYgf d3\n",
  "Err": ""
}

I am now thinking of going this way in the calls from the client side dataviz ipfs/webui#56 -- but this would mean adding the /api/v0/refs route to the gateway, instead of (or in addition to) the /api/v0/object/get route.
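For reference, the "almost-JSON" above is a run of JSON objects written back to back, so it can be read with an incremental decoder. A minimal sketch, assuming the `<src> <dst> <linkname>` format from the URL above; `parse_refs_stream` is a made-up name, not part of any IPFS library:

```python
import json


def parse_refs_stream(body: str):
    """Split concatenated JSON objects into (src, dst, linkname) tuples."""
    decoder = json.JSONDecoder()
    edges = []
    pos = 0
    while pos < len(body):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(body) and body[pos].isspace():
            pos += 1
        if pos >= len(body):
            break
        record, pos = decoder.raw_decode(body, pos)
        if record.get("Err"):
            raise ValueError(record["Err"])
        # maxsplit=2 keeps link names that themselves contain spaces intact.
        src, dst, linkname = record["Ref"].rstrip("\n").split(" ", 2)
        edges.append((src, dst, linkname))
    return edges
```

Each tuple is one edge of the DAG, which is all the visualization needs to draw nodes and links.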

Digging around a bit more... a recursive version of object/links would be perfect for what we need here. We currently have:

{
  "Hash": "Qmcav25eTinMV632w9zdyXsFENDz5FCWjrMEVU7Nzy2v98",
  "Links": [
    {
      "Name": "app.js",
      "Hash": "QmZs8mitpfSZM8TaFas9WaDVF77aQvb47UEPR1g1quoQq9",
      "Size": 500
    },
    {
      "Name": "lib",
      "Hash": "QmSXq83RU9YFnxGS7N29gBqjMXTg3qHERzrfFZxKYCGknM",
      "Size": 520503
    }
  ]
}

with a --recursive option, this could become:

{
  "Hash": "Qmcav25eTinMV632w9zdyXsFENDz5FCWjrMEVU7Nzy2v98",
  "Links": [
    {
      "Name": "app.js",
      "Hash": "QmZs8mitpfSZM8TaFas9WaDVF77aQvb47UEPR1g1quoQq9",
      "Size": 500
    },
    {
      "Name": "lib",
      "Hash": "QmSXq83RU9YFnxGS7N29gBqjMXTg3qHERzrfFZxKYCGknM",
      "Size": 520503,
      "Links": [
        {
          "Name": "d3.js",
          "Hash": "QmbgWP6n7wmczy9YP79FpDRUjYhyjVKjdDHTm9SS9nadZR",
          "Size": 336528
        }
      ]
    }
  ]
}

This takes the best part of the refs api ( --recursive ), and adds it to the right command IMO -- if we add it to object/get we would be getting the content of everything recursively, but here we just get links and metadata recursively.

Note that this is similar to github API allowing you to get the metadata for an entire repo as a tree recursively, eg https://api.github.com/repos/ipfs/go-ipfs/git/trees/master?recursive=1

In summary, the current ideal to support client side dataviz is AFAICT: add a --recursive option to object/links, and expose that through the gateway.

Interested in your take on this @jbenet.

jbenet commented 9 years ago

thanks @harlantwood i think you're spot on.

the only part i worry about there is that the object api currently manipulates one object at a time, and this would be a departure. that can be fine and maybe the easiest thing to do. another option is to add the functionality you mention to refs (outputs the full json dag, not just the link hashes.)

harlantwood commented 9 years ago

Sure, either is fine. The refs API could definitely use a JSON output format, but it need not be a nested tree. The current source/destination/linkname triples, in a reasonable JSON format, would be fine. My intuition is that a flat list would scale better than the nested version for very large recursive ref lists. FWIW, notice that GitHub's API tree is flat, not nested...
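To illustrate why the flat form is enough: a client can rebuild the nested shape from source/destination/linkname triples on its own. A rough sketch, assuming the edges are already available as tuples; `build_tree` is a hypothetical helper, and sizes are left out because the refs output shown in this thread does not carry them:

```python
def build_tree(root, edges):
    """Nest flat (src, dst, linkname) edges under the given root hash."""
    children = {}
    for src, dst, name in edges:
        children.setdefault(src, []).append((name, dst))

    def node(hash_, name=None):
        out = {"Hash": hash_} if name is None else {"Name": name, "Hash": hash_}
        # Content-addressed links cannot form cycles, so this recursion ends.
        links = [node(dst, child_name) for child_name, dst in children.get(hash_, [])]
        if links:
            out["Links"] = links
        return out

    return node(root)
```

Note that a subtree linked from several parents gets copied once per parent in the nested output, which is one reason the flat list should stay smaller on large DAGs.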

harlantwood commented 9 years ago

Ah, also my viz code in progress is already parsing the current output of the recursive refs API. So the first two of your checkboxes above are actually all we need for the viz.

harlantwood commented 9 years ago

@travisperson I added this one under your name in the etherpad for the current sprint. Thanks! I am a Go noob, but happy to offer any help I can, especially on the planning and testing sides.

harlantwood commented 9 years ago

Reading through this thread I'm not sure how clear the actual need of the dataviz is -- it's just one thing: for the gateway to support this call, exactly as it is currently supported by the API:

http://localhost:5001/api/v0/refs?arg=Qmcav25eTinMV632w9zdyXsFENDz5FCWjrMEVU7Nzy2v98&recursive&format=%3Csrc%3E%20%3Cdst%3E%20%3Clinkname%3E
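For anyone building that URL from client code, here is a small sketch of how the query string above is put together. `refs_url` is a hypothetical helper; the port 8080 base in the usage note is the read-only gateway address proposed earlier in this thread, not something that is confirmed to serve this route.

```python
from urllib.parse import quote


def refs_url(base: str, root_hash: str, fmt: str = "<src> <dst> <linkname>") -> str:
    """Build a recursive refs request with a percent-encoded format string."""
    return (
        f"{base}/api/v0/refs"
        f"?arg={quote(root_hash, safe='')}"
        "&recursive"
        f"&format={quote(fmt, safe='')}"
    )
```

Calling `refs_url("http://localhost:5001", "Qmcav25eTinMV632w9zdyXsFENDz5FCWjrMEVU7Nzy2v98")` reproduces the API URL above; swapping the base for `http://localhost:8080` gives the gateway form this issue is asking for.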

harlantwood commented 9 years ago

@travisperson do you think you will continue work on this? If not I'll see if I can convince @jbenet or someone else to work on it... :wink:

harlantwood commented 9 years ago

Closed with #1581, thanks all for your efforts.