Open Arwalk opened 6 months ago
Actually, it's in the spec! https://protobuf.dev/programming-guides/proto3/#json
Hi, i've been trying to implement this feature (started from deserializing) but i'm stuck on a few moments - maybe you will have a time and desire to help me)
First of all, std.json library (as of 0.12/0.13) is quite good and (almost) can parse a protobuffed JSON string into generated class - but the problem is ArrayList fields since JSON parser treats them as structs (it's waiting for JSON collections with fields items, capacity etc.).
My solution was to inherit an ArrayList with usingnamespace and add jsonParse function to it (see this and this), call it something like SerializableArrayList, place it somewhere at src/protobuf.zig (and also use this class in bootstrapped-generator/main.zig so all generated classes would use it). But when i did this i get lots of errors like this:
src/protobuf.zig:593:49: error: no field named 'items' in struct 'protobuf.SerializableArrayList(generated.google.protobuf.pb.Value)'
for (@field(result, field_name).items) |item| {
^~~~~
src/protobuf.zig:95:12: note: struct declared here
return struct {
^~~~~~
referenced by:
pb_deinit__anon_14611: src/protobuf.zig:572:21
deinit: src/protobuf.zig:1014:29
deinit_field__anon_14163: src/protobuf.zig:587:54
deinit_field__anon_13815: src/protobuf.zig:620:37
pb_deinit__anon_13489: src/protobuf.zig:572:21
deinit: src/protobuf.zig:1014:29
deinit_field__anon_9994: src/protobuf.zig:584:35
pb_deinit__anon_8240: src/protobuf.zig:572:21
deinit: src/protobuf.zig:1014:29
test.Required.Proto3.ProtobufInput.ValidDataRepeated.BOOL.PackedInput.ProtobufOutput: tests/alltypes.zig:80:12
Even simple example doesn't work as expected:
const std = @import("std");
const json = std.json;
const ArrayList = std.ArrayList;
const Allocator = std.mem.Allocator;
pub fn SerializableArrayList(comptime T: type) type {
return struct {
usingnamespace ArrayList(T);
pub fn jsonParse(
allocator: Allocator,
source: anytype,
options: json.ParseOptions,
) !@This() {
const slice = json.innerParse([]T, allocator, source, options);
return @This().fromOwnedSlice(allocator, slice);
}
};
}
const S = struct {
array: SerializableArrayList(u32),
};
// This one works as expected ...
pub fn main_() !void {
const allocator = std.heap.page_allocator;
const t = SerializableArrayList(u32);
var x = t.init(allocator);
defer x.deinit();
try x.append(1);
try x.append(3);
_ = try t.jsonParse(allocator, 1, .{});
const writer = std.io.getStdOut().writer();
try writer.print("{d}: {any}\n", .{ 1, x.items });
if (!std.meta.hasFn(SerializableArrayList(u32), "jsonParse")) {
@compileError("AAA");
}
}
// ... but not this one
pub fn main() !void {
const allocator = std.heap.page_allocator;
const t = SerializableArrayList(u32);
var x = t.init(allocator);
defer x.deinit();
try x.append(1);
try x.append(3);
const s_instance = S{ .array = x };
const writer = std.io.getStdOut().writer();
try writer.print(
"{d}: {any}\n",
.{ 1, @field(s_instance, "array").items },
);
}
Second main() fails at compile time with this messages:
t.zig:57:28: error: expected type 't.SerializableArrayList(u32)', found 'array_list.ArrayListAligned(u32,null)'
/home/libro/Downloads/zig/zig-0.13.0/lib/std/array_list.zig:31:12: note: struct declared here
t.zig:8:12: note: struct declared here
These are changes i made:
diff --git a/bootstrapped-generator/main.zig b/bootstrapped-generator/main.zig
index 8bb0297..018d8bd 100644
--- a/bootstrapped-generator/main.zig
+++ b/bootstrapped-generator/main.zig
@@ -98,13 +98,13 @@ const GenerationContext = struct {
try list.append(try std.fmt.allocPrint(allocator,
\\// Code generated by protoc-gen-zig
- \\ ///! package {s}
+ \\///! package {s}
\\const std = @import("std");
\\const Allocator = std.mem.Allocator;
- \\const ArrayList = std.ArrayList;
\\
\\const protobuf = @import("protobuf");
\\const ManagedString = protobuf.ManagedString;
+ \\const SerializableArrayList = protobuf.SerializableArrayList;
\\const fd = protobuf.fd;
\\
, .{name.buf}));
@@ -314,7 +314,7 @@ const GenerationContext = struct {
}
}
} else {
- prefix = "ArrayList(";
+ prefix = "SerializableArrayList(";
postfix = ")";
}
diff --git a/src/protobuf.zig b/src/protobuf.zig
index dd19967..5fa3178 100644
--- a/src/protobuf.zig
+++ b/src/protobuf.zig
@@ -3,6 +3,7 @@ const StructField = std.builtin.Type.StructField;
const isIntegral = std.meta.trait.isIntegral;
const Allocator = std.mem.Allocator;
const testing = std.testing;
+const json = std.json;
// common definitions
@@ -77,6 +78,21 @@ pub const ManagedString = union(ManagedStringTag) {
}
};
+pub fn SerializableArrayList(comptime T: type) type {
+ return struct {
+ usingnamespace ArrayList(T);
+
+ pub fn jsonParse(
+ allocator: Allocator,
+ source: anytype,
+ options: json.ParseOptions,
+ ) !@This() {
+ const slice = json.innerParse([]T, allocator, source, options);
+ return @This().fromOwnedSlice(allocator, slice);
+ }
+ };
+}
+
/// Enum describing the different field types available.
pub const FieldTypeTag = enum { Varint, FixedInt, SubMessage, String, List, PackedList, OneOf };
@@ -964,26 +980,12 @@ fn decode_json_data(comptime T: type, comptime field_desc: FieldDescriptor, comp
}
pub fn json_decode(comptime T: type, input: []const u8, allocator: Allocator) !T {
- const m : json.Parsed(json.Value.object) = try json.parseFromSlice(json.Value.object, allocator, input, .{});
- defer m.deinit();
-
- const obj : json.ObjectMap = m.value;
-
- var result = pb_init(T, allocator);
+ const parsed = try json.parseFromSlice(T, allocator, input, .{});
+ defer parsed.deinit();
- inline for (@typeInfo(T).Struct.fields) |field| {
- if(obj.get(field.name)) |f| {
- const v = @field(T._desc_table, field.name);
- break try decode_json_data(T, v, field, &result, f, allocator);
- }
- } else {
- std.debug.print("Unknown field received in {s} {any}\n", .{ @typeName(T), extracted_data });
- }
-
- return result;
+ return parsed.value;
}
-
/// public decoding function meant to be embedded in message structures
/// Iterates over the input and try to fill the resulting structure accordingly.
pub fn pb_decode(comptime T: type, input: []const u8, allocator: Allocator) !T {
Any help is appreciated.
Hey @librolibro , thanks for your time. I'll look into it ASAP. Any contribution is very much welcome.
UPD: I still have no idea how to solve problems i wrote about earlier, so I switched and almost finished the ManagedString serialization stuff + some tests. It kinda works, but now i have only more questions :)
I added two additional functions to ManagedString struct for std.json to call:
// This one calls when std.json.parseFromSlice will try to deserialize ManagedString struct
pub fn jsonParse(
allocator: Allocator,
source: anytype,
options: json.ParseOptions,
) !@This() {
const string = try json.innerParse([]const u8, allocator, source, options);
// return ManagedString.copy(string, allocator);
return ManagedString.managed(string);
}
// This one calls when std.json.stringify/stringifyAlloc will try to serialize ManagedString struct to string
pub fn jsonStringify(self: *const @This(), jw: anytype) !void {
try jw.write(&self.getSlice());
}
And decode/encode functions:
pub fn json_decode(
comptime T: type,
input: []const u8,
allocator: Allocator,
) !T {
const parsed = try json.parseFromSlice(T, allocator, input, .{});
defer parsed.deinit();
return parsed.value;
}
pub fn json_encode(
data: anytype,
options: json.StringifyOptions,
allocator: Allocator,
) ![]u8 {
return try json.stringifyAlloc(allocator, data, options);
}
Also i wrote some tests:
test "decode_json_with_strings" {
const managed_string = ManagedString{ .Const = "test_string" };
const test_pb_obj = WithStrings{ .name = managed_string };
defer test_pb_obj.deinit();
const test_json_obj = try protobuf.json_decode(
WithStrings,
\\{"name": "test_string"}
,
std.testing.allocator,
);
// NOTE(libro): std.meta.eql sucks here since
// it's comparing strings as pointers
try expect(std.mem.eql(
u8,
test_pb_obj.name.Const,
test_json_obj.name.Const,
));
}
test "encode_json_with_strings" {
const managed_string = ManagedString{ .Const = "test_string" };
const test_pb_obj = WithStrings{ .name = managed_string };
defer test_pb_obj.deinit();
const encoded_string = try protobuf.json_encode(
test_pb_obj,
.{ .whitespace = .indent_2 },
std.testing.allocator,
);
try expect(std.mem.eql(
u8,
encoded_string,
\\{
\\ "name": "test_string"
\\}
,
));
}
Things i wanted to ask about:
Looks like there is 3 ways to make it work for ArrayList:
const std = @import("std");
const ArrayList = std.ArrayList;
const Allocator = std.mem.Allocator;
const json = std.json;
pub fn SerializableArrayList(comptime T: type) type {
return struct {
arr_list: ArrayList(T),
pub fn fromArrayList(arr_list: ArrayList(T)) @This() {
return @This(){.arr_list = arr_list};
}
pub fn getItems(self: @This()) []T {
return self.arr_list.items;
}
pub fn deinit(self: @This()) void {
self.arr_list.deinit();
}
pub fn jsonParse(
self: @This(),
allocator: Allocator,
source: anytype,
options: json.ParseOptions,
) !@This() {
const slice = try json.innerParse([]T, allocator, source, options);
return self.arr_list.fromOwnedSlice(allocator, slice);
}
pub fn jsonStringify(self: *const @This(), jw: anytype) !void {
try jw.write(self.arr_list.items);
}
};
}
const S = struct {
f: SerializableArrayList(u32),
};
pub fn main() !void {
const allocator = std.heap.page_allocator;
var a = SerializableArrayList(u32).fromArrayList(
ArrayList(u32).init(allocator),
);
try a.arr_list.append(167);
try a.arr_list.append(168);
const b = S{ .f = a };
std.debug.print("{any}\n", .{b.f.arr_list.items});
std.debug.print("{s}\n", .{try json.stringifyAlloc(
allocator,
b,
.{.whitespace = .indent_2},
)});
}
Looks ugly and not user-friendly for me because now you always need to get extra field of this dummy wrapper (it only keeps ArrayList and 2 methods for std.json extension) - because of that you'll always need to type struct_var.arr_list.append() which is longer that simple arr_list.append(), or add ALL ArrayList methods to this wrapper (sounds terrible). But it works at least)
What's your opinion on the second option?
P.S. 4th option is to use the arrays ([]T) instead of lists (ArrayList(T)) P.P.S. jsonParse method in example above is bad written one - json.Parsed.deinit frees allocated slice, there should Allocator.dupe call or some sort of ownership interception (if it's possible in Zig standard library)
Hello again, sorry for not looking into it yet, I have limited braincells and time available lately.
This is very interesting, but I'd prefer if we could manage to do something that avoid users having to deal with specific types. ManagedString is already something that annoys me a bit, but it's a necessary evil.
Instead of relying on auto parsing, does the std.json library allows for some kind of token/stream based parsing?
Ideally I'd like that kind of process:
I think that's what I tried on a branch somewhere, but it was a bit of time ago already.
Nevertheless, I'm grateful for your efforts, interest, and time investment. I'll try to look, I might have some time Thursday.
Hi, thanks for feedback. It's not an urgent case for me, just wanted to use your lib in side project which i don't have enough will to start right now anyway)
Yes, std.json internally has a token parser, but it's for general purpose. I guess you want to add another auxiliary field in generated classes, something like _desc_table (or maybe even reuse this one, didn't know about details) - didn't think about it, might be a good idea.
I'm still thinking about 4th option - with ArrayList(T) -> []T replacement and some extra methods for ManagedString it will be done (i hope), but of course it limits user flexibility (if someone needs ArrayList's, there will be lots of toOwnedSlice/fromOwnedSlice calls).
The official python library for protobuf from google has a json format, and it is found in the conformance tests proto definition so this could tie with #27
Considering the zig library has a json parser, it shouldn't be too hard to do.