Open stensalweb opened 3 years ago
Does the order of fields matter or not?
{ 'build': 'b1', 'name': 'p1' }
{ 'name': 'p1', 'build':'b1' };
Can json_scanf(s, "%s[name]%s[build]", name, build); parse both of the strings above?
json_scanf is very clever. But if I have a nested JSON object, can I still use json_scanf?
For example, given the following string:
"{ 'project-configurations': [ { 'name': 'p1', 'build': 'b1' }, { 'name': 'p2', 'build': 'b2' } ] }"
Can I still use json_scanf to de-serialize it?
Hello, thank you very much for the feedback! Unfortunately jscon_scanf can't deserialize nested objects, but I plan on making that possible in the future. What you can do instead is deserialize 'project-configurations' with the %ji format, and then access its nests with 'getter' functions, such as:
jscon_item_t *project_configurations;
jscon_scanf(str, "%ji[project-configurations]", &project_configurations);
jscon_item_t *obj1 = jscon_get_byindex(project_configurations, 0);
jscon_item_t *obj2 = jscon_get_byindex(project_configurations, 1);
does the order of fields matter or not?
{ 'build': 'b1', 'name': 'p1' }
{ 'name': 'p1', 'build':'b1' };
can json_scanf (s, "%s[name]%s[build]", name, build); parse the above two strings?
Absolutely! The order of the fields in the JSON string is unimportant. All that really matters is that the arguments match the order of the field specifiers in the format string.
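To illustrate why field order need not matter, here is a minimal sketch of a lookup that searches for a key by name instead of by position. The helper find_string_field is hypothetical and is not part of jscon; it only handles flat objects with double-quoted string values and no escape sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of jscon: copy the string value of `key`
 * from a flat JSON object into `out`. Returns 1 if found, 0 otherwise.
 * The key is located by name, so the order of fields is irrelevant.
 * Limitations: no escape handling, no nesting, values must be strings. */
static int find_string_field(const char *json, const char *key,
                             char *out, size_t out_size)
{
    char pattern[64];
    const char *p, *end;
    size_t len;

    assert(json != NULL && key != NULL && out != NULL && out_size > 0);
    if (snprintf(pattern, sizeof pattern, "\"%s\"", key) >= (int)sizeof pattern)
        return 0;                              /* key too long for this sketch */

    p = strstr(json, pattern);
    if (p == NULL) return 0;                   /* key absent */
    p = strchr(p + strlen(pattern), ':');
    if (p == NULL) return 0;
    p = strchr(p, '"');                        /* opening quote of the value */
    if (p == NULL) return 0;
    p++;
    end = strchr(p, '"');                      /* closing quote of the value */
    if (end == NULL) return 0;

    len = (size_t)(end - p);
    if (len >= out_size) len = out_size - 1;   /* truncate to fit */
    memcpy(out, p, len);
    out[len] = '\0';
    return 1;
}
```

Calling it with "name" and then "build" gives the same values for both orderings of the two fields, and a return value of 0 tells the caller that a key was not found.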
It would be really great if json_scanf could handle nested JSON objects. Another question: how is the absence of a field represented in your de-serialized structures?
jsmn (https://zserge.com/jsmn) can be used to preprocess the input string buffer to find the start and end of each nested object, but json_scanf would need to be extended to accept buffer_start and buffer_end.
It would be really great if json_scanf could handle nested JSON objects. Another question: how is the absence of a field represented in your de-serialized structures?
Not sure if that's the answer to your question, but if a field is absent from the string being parsed, it is simply ignored and nothing else happens.
Maybe it would be better if jscon_scanf returned an integer with the number of absent fields, but the user would still have to check each argument to figure out where the absence occurred. Any suggestions are appreciated!
jsmn (https://zserge.com/jsmn) can be used to preprocess the input string buffer to find the start and end of each nested object, but json_scanf would need to be extended to accept buffer_start and buffer_end.
This is a really interesting approach, I'll look further into it!
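As a rough illustration of the buffer_start / buffer_end idea, the sketch below locates the span of a nested object or array by matching brackets. It deliberately does not use jsmn; skip_nested and find_nested_span are hypothetical names, and the code assumes well-formed input without escaped quotes inside keys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given a pointer to an opening '{' or '[', return a
 * pointer one past the matching close, or NULL if the input is unbalanced.
 * String literals are skipped so brackets inside strings are not counted.
 * Bracket kinds are not cross-checked against each other. */
static const char *skip_nested(const char *open)
{
    int depth = 0, in_string = 0;
    const char *p;

    assert(open != NULL && (*open == '{' || *open == '['));
    for (p = open; *p != '\0'; p++) {
        if (in_string) {
            if (*p == '\\' && p[1] != '\0') p++;      /* skip escaped char */
            else if (*p == '"') in_string = 0;
        } else if (*p == '"') {
            in_string = 1;
        } else if (*p == '{' || *p == '[') {
            depth++;
        } else if (*p == '}' || *p == ']') {
            if (--depth == 0) return p + 1;
        }
    }
    return NULL;
}

/* Find `key` and, if its value is an object or array, store the half-open
 * span [*start, *end) of that value. Returns 1 on success, 0 otherwise. */
static int find_nested_span(const char *json, const char *key,
                            const char **start, const char **end)
{
    const char *p;
    size_t klen;

    assert(json != NULL && key != NULL && start != NULL && end != NULL);
    klen = strlen(key);
    for (p = strchr(json, '"'); p != NULL; p = strchr(p + 1, '"')) {
        const char *v;
        if (strncmp(p + 1, key, klen) != 0 || p[1 + klen] != '"')
            continue;
        v = p + klen + 2;
        v += strspn(v, " \t\r\n");
        if (*v != ':') continue;           /* matched a value, not a key */
        v++;
        v += strspn(v, " \t\r\n");
        if (*v != '{' && *v != '[') return 0;
        *end = skip_nested(v);
        if (*end == NULL) return 0;
        *start = v;
        return 1;
    }
    return 0;
}
```

A scanf-style function extended as suggested could then be run on the returned span alone, which is what accepting buffer_start and buffer_end would make possible.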
My question is: "if a field is absent in a JSON string, how do I know whether the field is absent or not in the corresponding de-serialized instance?" For example, take the two JSON strings:
{ "field_a": false, "field_b": "this is a string" }
{ "field_b": "this is another string" }
Both are de-serialized as instances: a and b. When I access a->field_a and b->field_a, both might be false, but b actually does not have field_a. The absence of a field is, in general, different from the field's default value. In JavaScript, typeof a.field_a is boolean and typeof b.field_a is undefined, so I can detect that field_a is absent in b. How do I detect that in an instance de-serialized by json_scanf?
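One common way to keep "absent" apart from "false" in a C struct is to store a presence flag next to the value. The sketch below is only an illustration of that idea under my own assumptions; struct opt_bool and read_opt_bool are hypothetical and not something jscon provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: an optional boolean carries a presence flag, so a
 * missing key and an explicit false remain distinguishable. */
struct opt_bool {
    bool present;   /* true only if the key appeared in the JSON text */
    bool value;     /* meaningful only when present is true */
};

/* Toy reader: `quoted_key` must include its double quotes, for example
 * "\"field_a\"". Only the literals true and false are recognised. */
static struct opt_bool read_opt_bool(const char *json, const char *quoted_key)
{
    struct opt_bool r = { false, false };
    const char *p;

    assert(json != NULL && quoted_key != NULL);
    p = strstr(json, quoted_key);
    if (p == NULL) return r;                    /* key absent */
    p = strchr(p + strlen(quoted_key), ':');
    if (p == NULL) return r;
    p++;
    p += strspn(p, " \t\r\n");
    if (strncmp(p, "true", 4) == 0) {
        r.present = true;
        r.value = true;
    } else if (strncmp(p, "false", 5) == 0) {
        r.present = true;
    }
    return r;
}
```

With the two strings from the question, the first yields present == true and value == false, while the second yields present == false, which plays the role of JavaScript's undefined.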
Oh, I see. In that example of yours, jscon_scanf() wouldn't be able to turn the JSON strings into instances a and b; for that purpose use jscon_parse(). The library equivalent is as follows:
jscon_item_t *a, *b;
a = jscon_parse(str1);
b = jscon_parse(str2);
jscon_item_t *item;
item = jscon_get_branch(a, "field_a"); //analogous to a.field_a
jscon_get_type(item); //will return JSCON_BOOLEAN
item = jscon_get_branch(b, "field_a"); //analogous to b.field_a
jscon_get_type(item); //will return JSCON_UNDEFINED
In actuality b.field_a is a NULL pointer since jscon_get_branch couldn't fetch it, and jscon_get_type interprets a NULL pointer as a JSCON_UNDEFINED (from the latest commits).
Now, because your question was regarding jscon_scanf(), I'll assume two possible scenarios.
First scenario: the JSON strings are members of a root object:
{ "obj": { "field_a": false, "field_b": "this is a string" }}
{ "obj": { "field_b": "this is another string" }}
Then you can fetch obj using jscon_scanf() with the %ji specifier, and we would get the same results as in the example shown above.
Second scenario: you want to fetch field_a directly from the string, in which case we will stick with your original strings. In that case we use the %b specifier and parse the value of field_a directly into a bool variable.
bool boolean;
jscon_scanf(str1, "%b[field_a]", &boolean);
//boolean is the same as the json string
jscon_scanf(str2, "%b[field_a]", &boolean);
//boolean is not updated
/* the above is not optimal if you
wish to check if the operation was a
success, but there is a second
alternative using %ji as specifier,
though it can be sometimes overkill */
jscon_item_t *item;
jscon_scanf(str1, "%ji[field_a]", &item);
jscon_get_type(item); //returns JSCON_BOOLEAN
jscon_get_boolean(item); //returns same value as the json string
jscon_scanf(str2, "%ji[field_a]", &item);
jscon_get_type(item); //returns JSCON_UNDEFINED
jscon_get_boolean(item); //returns false
The appeal of using jscon_scanf to de-serialize JSON strings into C struct instances is that JSON fields become C struct fields, which are first-class citizens in C. IDEs can auto-complete C struct fields. Using jscon_parse cannot achieve the same goal.
For the two-string example:
str1
{ "field_a": false, "field_b": "this is a string" }
Can I do this: jscon_scanf(str1, "%b[field_a]%s[field_b]", &obj1->field_a, obj1->field_b)?
str2
{ "field_b": "this is a string" }
Can I do this: jscon_scanf(str2, "%b[field_a]%s[field_b]", &obj2->field_a, obj2->field_b)?
If both are correct, what will the value of obj2->field_a be?
How can jscon_scanf be improved to encode in obj2 the information that field_a is absent from str2?
Ah, I get it now, my apologies. That appeal was exactly my motivation for creating that function in the first place! :)
Both examples will work, and you are right that it currently lacks a way to give the user useful diagnostics about absent fields. Simply ignoring the field isn't a solution and can lead to confusion at best. Perhaps a new, optional parameter for jscon_scanf() that receives a char** and fills it with the missing keys might do the trick? Though I'm not a fan of that solution.
Anyhow, I'm definitely adding this as a top priority, so I really appreciate you bringing this up.
Two missing features can unlock the full potential of jscon_scanf:
I will poke around for the first one.
A proposal to encode missing keys/properties/fields
struct a {
int key_i;
float key_f;
char * key_str;
void * missing[3]; // 3 is the number of fields.
};
struct a a = {0};
a.key_str = malloc(10);
char str[] = "{ \"key_f\": 1.0, \"key_str\": \"hello\" }"; /* key_i is deliberately absent */
jscon_scanf(str, a.missing, "%d[key_i]%f[key_f]%s[key_str]", &a.key_i, &a.key_f, a.key_str);
Inside jscon_scanf, the following is done to record the missing key_i:
a.missing[0] = &a.key_i;
This proposed solution does not introduce extra dynamic memory allocation. The assumption is that the number of missing keys is at most the number of fields in the struct, which is why the missing array has one slot per field. It's similar to what you proposed, but instead of encoding the missing keys, we capture the address of each field that jscon_scanf did not initialize.
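The mechanism can be tried out in isolation with a small variadic function. This is a sketch of the proposal, not jscon's implementation: scan_ints and is_missing are hypothetical names, only integer fields are handled, and keys are found with a plain substring search.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the proposed extension. The arguments after
 * `missing` are (const char *key, int *dest) pairs ended by a NULL key.
 * A dest whose key is not found is left untouched and its address is
 * appended to `missing`; the return value is the number of missing keys.
 * `missing` must have room for one entry per pair. */
static int scan_ints(const char *json, void **missing, ...)
{
    va_list ap;
    const char *key;
    int n_missing = 0;

    assert(json != NULL && missing != NULL);
    va_start(ap, missing);
    while ((key = va_arg(ap, const char *)) != NULL) {
        int *dest = va_arg(ap, int *);
        char pattern[64];
        const char *p = NULL;

        if (snprintf(pattern, sizeof pattern, "\"%s\"", key) < (int)sizeof pattern)
            p = strstr(json, pattern);
        if (p != NULL)
            p = strchr(p, ':');
        if (p != NULL)
            *dest = (int)strtol(p + 1, NULL, 10);   /* key present */
        else
            missing[n_missing++] = dest;            /* record the address */
    }
    va_end(ap);
    return n_missing;
}

/* Returns 1 if `addr` was recorded as missing, 0 otherwise. */
static int is_missing(void **missing, int n, const void *addr)
{
    int i;

    assert(missing != NULL);
    for (i = 0; i < n; i++)
        if (missing[i] == addr) return 1;
    return 0;
}
```

After a call such as scan_ints(str, a.missing, "key_i", &a.key_i, (const char *)NULL), the caller can ask is_missing(a.missing, n, &a.key_i) to tell an absent key apart from a key whose value happens to equal the default.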
jscon_item_t *item;
char *string;
bool boolean;
int nested_number;
char buffer[] = "{\"alpha\":[1,2,3,4], \"beta\":\"This is a string.\", \"gamma\":true, \"omega\":{\"nest\":1}}";
/* order of arguments doesn't have to be the same as the json string */
jscon_scanf(buffer, "%s[beta] %b[gamma] %ji[alpha] %d[omega][nest]", string, &boolean, &item, &nested_number);
Given the above example, it would be better if the nested object were de-serialized as well. The following example would be more useful:
jscon_item_t *item;
char *string;
bool boolean;
int nested_number;
struct omega { int nest; } o; /* assumed target struct for the nested object */
struct {
char *fmt_str;
int number_of_args;
void *addresses[1]; /* one destination address per specifier in fmt_str */
} of;
of.fmt_str = "%d[nest]"; /* nest holds an integer in the buffer below */
of.number_of_args = 1;
of.addresses[0] = &o.nest;
char buffer[] = "{\"alpha\":[1,2,3,4], \"beta\":\"This is a string.\", \"gamma\":true, \"omega\":{\"nest\":1}}";
/* order of arguments doesn't have to be the same as the json string */
jscon_scanf(buffer, "%s[beta] %b[gamma] %ji[alpha] %o[omega]", string, &boolean, &item, &of); /* %o is the proposed specifier */
Inside jscon_scanf, %o[omega] will invoke
jscon_scanf(start_buffer_of_omega, of->fmt_str, of->addresses[0], of->addresses[1], ...); // this has to be done with va_list
This made me realize another issue. Sometimes the useful information is inside arrays, such as:
[{"num": 1},{"num": 2},"string"]
Your %o suggestion works for objects, but how would we parse array members without parsing the array into a jscon_item_t?
When dealing with deeper nesting, would we need N nested %o invocations, one per level, until the desired element is found? How would we fetch 'a' and 'b' from the following example?
{
  "alpha": {
    "beta": {
      "a": 1,
      "gamma": {
        "b": 2
      }
    }
  }
}
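One possible alternative to chaining %o calls is to walk a key path, in the spirit of the %d[omega][nest] specifier used earlier in the thread. The sketch below is only an illustration under that assumption; member_value and get_int_by_path are hypothetical helpers, and the code expects well-formed JSON with integer leaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: inside the object starting at `obj` (which must point
 * at '{'), look for `key` among the object's direct members only. Returns a
 * pointer to the first character of its value, or NULL if it is absent. */
static const char *member_value(const char *obj, const char *key)
{
    size_t klen;
    int depth = 0;
    const char *p;

    assert(obj != NULL && key != NULL && *obj == '{');
    klen = strlen(key);
    for (p = obj; *p != '\0'; p++) {
        if (*p == '{' || *p == '[') {
            depth++;
        } else if (*p == '}' || *p == ']') {
            if (--depth == 0) return NULL;          /* end of this object */
        } else if (*p == '"') {
            const char *close = p + 1;
            while (*close != '\0' && *close != '"') {
                if (*close == '\\' && close[1] != '\0') close++;
                close++;
            }
            if (*close == '\0') return NULL;        /* unterminated string */
            if (depth == 1 && (size_t)(close - p - 1) == klen &&
                strncmp(p + 1, key, klen) == 0) {
                const char *v = close + 1;
                v += strspn(v, " \t\r\n");
                if (*v == ':') {                    /* it is a key, not a value */
                    v++;
                    return v + strspn(v, " \t\r\n");
                }
            }
            p = close;                              /* skip the whole string */
        }
    }
    return NULL;
}

/* Follow `path` (an array of `n` keys) from the root object and read the
 * final value as an int. Returns 1 on success, 0 if a key is absent or the
 * final value is not a number. */
static int get_int_by_path(const char *json, const char *const *path,
                           size_t n, int *out)
{
    const char *p;
    char *endp;
    long v;
    size_t i;

    assert(json != NULL && path != NULL && out != NULL);
    p = json + strspn(json, " \t\r\n");
    for (i = 0; i < n; i++) {
        if (*p != '{') return 0;
        p = member_value(p, path[i]);
        if (p == NULL) return 0;
    }
    v = strtol(p, &endp, 10);
    if (endp == p) return 0;
    *out = (int)v;
    return 1;
}
```

For the document above, the path alpha, beta, a leads to the value of 'a', and alpha, beta, gamma, b leads to the value of 'b'. Only direct members are matched at each level, so a key with the same name deeper in the tree is not picked up by accident.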