Open achingbrain opened 3 years ago
@Stebalien : to link to the relevant project proposals that would fix the right way.
The tricky part is mixing multibase-encoded arguments along with non-multibase-encoded arguments.
A possible solution is to reserve something like @...
to mean "the following is multibase encoded". That would be a breaking change, but might be less terrible than some of the other options.
but might be less terrible than some of the other options.
There are a bunch of options available here. One of these perhaps more terrible options, that's similar to the proposal above but trades off verbosity for transparency is having binary options get two flags --field-name
and one of {--field-name-binary
, --field-name-multibase
, or --field-name-uses-multibase
}. We'd need to do something similar (although perhaps not exactly the same) for arguments too.
Whatever we go with, it would be good to put “multibase” in the name of the option/feature. Implying multibase in options marked “binary” is going to be more confusing for users, if we say “multibase” they’ll know what to Google to figure out what that means.
The tricky part is mixing multibase-encoded arguments along with non-multibase-encoded arguments.
I don't think we should allow mixing. When it comes to /api/v0/pubsub/pub
both parameters should be in multibase because people use blobs in topic names as well (pretty sure I've seen it in the wild a while ago).
In the past we solved a variant of this problem in /api/v0/object/get
by adding support for &data-encoding=base64
, but it was scoped to a specific field in request body (object/get.js).
To remove surface for errors like this from the entire HTTP API we need to add support for multibase envelope to arg
params AND body in both request AND response. Not sure if we can do less, and avoid the problem coming back in the future.
I suggest a single flag &multibase-foobar=base64url
(multibase-rpc
? multibase-io
? multibase-envelope
?)
that informs the /api/v0/
endpoint that:
arg
parameters and body present in request need to be unwrapped using multibaseFlag would be off by default (to maintain backward-compatibility), but js-ipfs-http-client should use this flag by default on /api/v0/pubsub/pub
and similar endponts, just like it already does in object/get.js, quietly solving the problem for everyone who uses the lib. For everyone else experiencing issues when pushing binary instead of text, we would add a section at the top of https://docs.ipfs.io/reference/http/api/ (by updating ipfs/http-api-docs).
ps. Does not need to be base64url
, but it is space-efficient and the safest default, so we should use it in examples and docs.
There is a potential footgun with base64
when used in URL (and we would use it in arg
). – we should detect it and return an error suggesting base64url
instead.
How about:
Make it /api/v1/pubsub/pub
, /api/v1/dht/put
etc and demand multibase for all binary sent on the query string or as fields in JSON instead of adding additional arguments which add complexity and the potential for mis-use.
No need to rev the version number for all endpoints at once, just a few at a time as and when they need upgrading.
Versioning via /api/v1
would also work, but introduction of a new API endpoint root not only complicates our code, but also complicates implementation of http client libraries in other languages, and may even require people to update their reverse proxy configs (if they only exposed v0).
At the end of the day we want to provide an easy fix for people, and a new query param feels like less work for the end user of the API: one can add it to the map passed to the API call, in some languages without even modifying the underlying library.
Not feeling strongly about this, but I feel we should pick the least expensive option for the existing ecosystem.
introduction of a new API endpoint root not only complicates our code
Kind of the opposite - I think it simplifies our code as with fewer options you have fewer branches to test. The more options you allow, the more execution paths there are in the code, and the more tests you need due to the combinatorial explosion.
We also don't have to keep on top of making options uniform across different APIs - see --datafieldenc
, --data-encoding
, --inputenc
, --encoding
etc for examples of this which makes our lives easier and our APIs more pleasant.
but also complicates implementation of http client libraries in other languages, and may even require people to update their reverse proxy configs (if they only exposed v0).
This doesn't seem like a deal breaker to me, just switch to use the new endpoint and encode your input args, no need for the user of the http client to change anything. You might have to add an extra line or a wildcard to your nginx config in the second case.
People using the API without a client will need to change how they use the API anyway, so we can take the opportunity to reduce our maintenance overhead by having fewer code paths to maintain and test. v0
of a given endpoint could even be extracted into a module that's rarely if ever updated.
a new query param feels like less work for the end user of the API
It usually does, and it's a quick hack for the implementer of the API as well, but over time adding options results in increased maintenance overhead for the implementer due to the resulting spaghetti code and conceptual overhead for the user as they now have to understand more things about the API in order to use it.
This is blocked on side conversations that are being had to determine the way forward. See https://hackmd.io/cwWkQyXrRo-JZGUyAqXFvw
Decision summary after today's call:
pubsub
commands – that work is described and tracked in https://github.com/ipfs/go-ipfs/issues/7939#issuecomment-833812863Potential low priority tasks here:
arg
issue with dht/*
commands is a lower priority, because it does not seem to impact real world use.
If there is a bug filled for this, we will add &arg-multibase=true
flag to enable passing of binary keys as multibase string, but for now leave things as-is.--help
text clarification, but also a low priority (PR welcome).
7939 exposed two issues with the HTTP API. A solution for one was proposed, the other was discussed at triage and I wanted to capture the scope of the work here.
Some of our API endpoints accept query string arguments that contain strings which will be interpreted as binary data. This causes problems for browsers and other JS clients in that the built in URL encoding functions accept strings and do UTF8 multi-byte code point conversions for byte values above
127
/0x7F
, so0x80
becomes%EF%BF%BD
instead of%80
.Instead we should allow sending multibase encoded strings for these arguments removing any ambiguity over their contents. The proposal was to add an additional flag that indicates to the API that the received string should be interpreted as a multibase encoded string and not a byte array.
I've done a quick pass over the HTTP API docs and these endpoints currently interpret an argument as a byte array, the good news is there aren't many:
/api/v0/dht/findpeer
-arg
needs encoding/api/v0/dht/findprovs
-arg
needs encoding/api/v0/dht/provide
-arg
needs encoding/api/v0/dht/put
-arg
needs encoding/api/v0/dht/query
-arg
needs encoding/api/v0/pubsub/pub
- secondarg
needs encoding (also accepts payload in message body?)These endpoints accept arguments that by convention would be B58 encoded multibase strings containing CIDs, but it is not explicit in the docs so could use a note saying the value
'should be an IPFS path or multibase encoded string containing a CID'
or similar depending on context./api/v0/filestore/ls
-arg
may need encoding note?/api/v0/filestore/verify
-arg
may need encoding note?/api/v0/id
-arg
may need encoding note?/api/v0/name/resolve
-arg
may need encoding note?/api/v0/object/diff
-arg
may need encoding note?/api/v0/object/patch/add-link
-arg
may need encoding note?/api/v0/object/patch/append-data
-arg
may need encoding note?/api/v0/object/patch/rm-link
-arg
may need encoding note?/api/v0/object/patch/set-data
-arg
may need encoding note?/api/v0/ping
-arg
may need encoding note?/api/v0/stats/bw
-peer
may need encoding note?/api/v0/swarm/connect
-arg
may need encoding note?/api/v0/swarm/disconnect
-arg
may need encoding note?There may be other affected endpoints.