dlang / dub

Package and build management system for D
MIT License

using protobuf instead of json for dub.json + simple migration path #1359

Closed timotheecour closed 5 years ago

timotheecour commented 6 years ago

@s-ludwig @MartinNowak

Protobufs (ported to D: https://github.com/msoucy/dproto) have many advantages compared to json:

NOTE: there's a way to automatically convert JSON to protobuf, so this could be implemented quite easily on the dub side while still accepting existing dub.json files:

dub.d:

import std.file : exists, readText;
import std.stdio : writeln;

DubProto data; // the protobuf msg
if (`dub.prototxt`.exists)
  data = DubProto(`dub.prototxt`.readText);
else if (`dub.json`.exists)
  data = DubProto.fromJson(`dub.json`.readText); // fall back to the existing JSON recipe

writeln(data.name, " ", data.dependencies);

// dub.proto (schema)

message DubProto{
  optional string name=1;
  repeated string importPaths=2;
  repeated Dependency dependencies=3;
  // ...
}

message Dependency{
  optional string path=1;
  optional string version=2;
  optional string url=3;
  // ...
}

# dub.prototxt
name: "test1"
sourcePaths: [ "source1", "source2"]
dependencies: [
  { name: "foo", path: "some/path/foo"},
  { name: "bar"}
]
# misspelling: would be caught automatically by protobuf parser
importPath: [ "import1", "import2"]

instead of:

{
"name":"test1",
"sourcePaths": [ "source1", "source2"],
"dependencies": {
  "foo": {"path" : "some/path/foo"},
  "bar": {}
},
// misspelling: dub code currently has to do the schema conformance work
"importPath": [ "import1", "import2"] 
}

NOTE: the above dub.prototxt is a modification from standard prototxt format regarding treatment of repeated fields; standard would look like:

sourcePaths: "source1"
sourcePaths: "source2"

instead of sourcePaths:["source1", "source2"]

but that's not essential; what is essential is the automatic type safety (or schema conformance) we get from having a schema (dub.proto) and parsing against it.
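
To make the schema-conformance point concrete, here is a minimal sketch in D that uses only Phobos `std.json`, not protobuf or dproto. The field list and the names `isKnownField` and `unknownFields` are assumptions made up for illustration, not dub's real schema or API. The check reports top-level keys that the schema does not declare, which is roughly what a schema-driven parser would do without hand-written code.

```d
import std.algorithm : sort;
import std.json : JSONValue, parseJSON;
import std.stdio : writeln;

/// Hypothetical subset of top-level recipe fields (not dub's full list).
bool isKnownField(string key)
{
    switch (key)
    {
        case "name", "sourcePaths", "importPaths", "dependencies":
            return true;
        default:
            return false;
    }
}

/// Returns the top-level keys of `text` that the schema does not declare,
/// sorted so the result is deterministic.
string[] unknownFields(string text)
{
    JSONValue root = parseJSON(text);
    string[] result;
    foreach (key; root.object.keys)
        if (!isKnownField(key))
            result ~= key;
    sort(result);
    return result;
}

void main()
{
    // "importPath" is the misspelling from the example above.
    writeln(unknownFields(`{"name": "test1", "importPath": ["import1"]}`));
    // prints ["importPath"]
}
```

With a real schema, the list of known fields would come from dub.proto instead of being written out by hand.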

rjframe commented 6 years ago

What advantages would this have over SDLang?

  • comments! (a long standing issue with vanilla json for dub.json, although one could easily add a preprocessor in dub to remove comments before parsing json)

dub.sdl can do comments. It's also less noisy than the JSON configuration files.

  • protobuf text format that's less noisy than json

The modified prototxt example you provided doesn't look much different from JSON to me; could you show what some more complicated conversions could look like (maybe vibe.d's and dub's own dub.sdl files?)?

  • builtin support for forward and backward compatibility changes in proto schema

I'm not sure I see how that's an advantage over JSON, which has a stable spec; a non-changing spec and a backward-compatible spec aren't that different, are they? (Or are we talking about the domain-specific schema? I'm not familiar with protobuf; I guess I should read up on it before commenting on this...)

-- I personally don't care about the configuration format, but really don't want to see three different formats supported. Two formats says we care about preserving backward compatibility; three says nobody can agree on anything, however trivial the decision. If this is implemented, I'd rather see dub.json auto-converted to protobuf files rather than read from both formats in perpetuity.

timotheecour commented 6 years ago

What advantages would this have over SDLang?

The main one is having a schema, i.e. ensuring schema conformance.

The modified prototxt example you provided doesn't look much different from JSON

well, no comments feasible in json, so that's a big plus over JSON; but again main advantage is schema.

I'm not sure I see how that's an advantage over JSON, which has a stable spec; a non-changing spec and a backward-compatible spec aren't that different, are they? (Or are we talking about the domain-specific schema? I'm not familiar with protobuf; I guess I should read up on it before commenting on this...)

Indeed, not talking about JSON or protobuf having stable spec, I'm talking about domain specific spec (the schema).

The schema can be enforced at parse time (parsing itself can even be done at compile time, e.g. by using import("dub.prototxt"), but that's a side issue); that would allow removing all user input validation and JSON parsing from dub.

I personally don't care about the configuration format, but really don't want to see three different formats supported. Two formats says we care about preserving backward compatibility; three says nobody can agree on anything, however trivial the decision. If this is implemented, I'd rather see dub.json auto-converted to protobuf files rather than read from both formats in perpetuity.

As I suggested above (maybe poorly), we can also not introduce the prototxt format at all but do internal conversion in dub of user's json to a DubProto, so dub doesn't have to do any json validation and json parsing:

module dub.main;

import std.file : readText;

// the schema; defines `DubProto` at compile time
mixin ProtocolBufferFromString!(import("dub.proto"));

void main(){
  // 1 liner to get user data in DubProto struct, do all validation etc
  auto proto = "dub.json".readText.JsonStringToMessage!DubProto;

  string[] importPaths = proto.importPaths;
  // etc.
}
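
For readers without dproto at hand, the same idea (JSON in, validated typed struct out) can be sketched with Phobos `std.json` alone. This is an assumption-laden illustration, not the dproto mechanism above: `DubRecipe` and `fromJson` are hypothetical names, and only two fields are handled. Unknown keys and wrongly typed values are rejected while parsing, so the rest of the program only ever sees the typed struct.

```d
import std.algorithm : map;
import std.array : array;
import std.json : JSONValue, parseJSON;
import std.stdio : writeln;

/// Hypothetical typed view of two recipe fields (not dub's real types).
struct DubRecipe
{
    string name;
    string[] importPaths;
}

/// Parses `text` into a DubRecipe. Throws on an unknown key; std.json
/// itself throws a JSONException when a value has the wrong type.
DubRecipe fromJson(string text)
{
    JSONValue root = parseJSON(text);
    DubRecipe recipe;
    foreach (key, value; root.object)
    {
        switch (key)
        {
            case "name":
                recipe.name = value.str;
                break;
            case "importPaths":
                recipe.importPaths = value.array.map!(v => v.str).array;
                break;
            default:
                throw new Exception("unknown field: " ~ key);
        }
    }
    return recipe;
}

void main()
{
    auto recipe = fromJson(`{"name": "test1", "importPaths": ["import1", "import2"]}`);
    writeln(recipe.name, " ", recipe.importPaths);
}
```

The hand-written `switch` is exactly the per-field work a generated schema type would take over.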

andre2007 commented 5 years ago

Json is well established in the ecosystem (IDEs, build tools, language servers, ...). Switching to another format would break these tools. The benefit of switching to another format doesn't seem high enough. Please reopen this issue if you have another opinion.