nncarlson / yajl-fort

YAJL-Fort: A modern Fortran interface to the YAJL library
MIT License

An experimental JSON data container #1

Open nncarlson opened 8 years ago

nncarlson commented 8 years ago

I am regularly approached by people looking for a Fortran solution for reading JSON data from a file, presumably into some container where the data can be randomly accessed, who are baffled by how to use YAJL-Fort for that purpose. The confusion is understandable because YAJL-Fort isn't that solution; it is only intended to be a piece of that solution. As described in the README, it is an event-driven (SAX style) parser that invokes client-provided callback procedures when specific tokens, like '{', ']', or a string, are parsed. It is up to the client to do something useful with those bits of data. (Of course it is really the YAJL C library that does all the heavy lifting; YAJL-Fort is merely an interface to the library.)

What is needed are example use cases of YAJL-Fort. The included example that simply echoes the JSON input is really too simple to provide much guidance. The Petaca library provides a serious example in the code that populates a "parameter list" container from JSON data, but it is perhaps too complex because the container is only compatible with a subset of the JSON format, and a significant part of the callbacks is concerned with ensuring that the JSON data belongs to that subset.
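To make the event-driven model concrete, here is a minimal sketch, written in Python rather than Fortran and not using YAJL or YAJL-Fort at all. The `Builder` class, its method names, and the `emit_events` driver are hypothetical and only loosely modeled on the kinds of callbacks a SAX-style parser invokes. The driver fakes the event stream by walking a value already parsed by Python's `json` module, whereas a real event-driven parser fires the callbacks while it reads the input.

```python
import json


class Builder:
    """Hypothetical callbacks that assemble a nested container from events."""

    def __init__(self):
        self.root = None
        self._stack = []  # containers that are currently open
        self._keys = []   # pending key per open container (None for arrays)

    def _add(self, value):
        # Attach a finished value to the innermost open container.
        if not self._stack:
            self.root = value
        elif isinstance(self._stack[-1], dict):
            self._stack[-1][self._keys[-1]] = value
        else:
            self._stack[-1].append(value)

    def start_map(self):
        self._stack.append({})
        self._keys.append(None)

    def map_key(self, key):
        self._keys[-1] = key

    def end_map(self):
        self._keys.pop()
        self._add(self._stack.pop())

    def start_array(self):
        self._stack.append([])
        self._keys.append(None)

    def end_array(self):
        self._keys.pop()
        self._add(self._stack.pop())

    def scalar(self, value):
        self._add(value)


def emit_events(value, cb):
    """Stand-in for the parser: fire one callback per token."""
    if isinstance(value, dict):
        cb.start_map()
        for key, item in value.items():
            cb.map_key(key)
            emit_events(item, cb)
        cb.end_map()
    elif isinstance(value, list):
        cb.start_array()
        for item in value:
            emit_events(item, cb)
        cb.end_array()
    else:
        cb.scalar(value)


def build(text):
    """Drive the callbacks over the input and return the finished container."""
    builder = Builder()
    emit_events(json.loads(text), builder)
    return builder.root
```

The point of the sketch is that the parser itself holds no data: all of the state (the stack of open containers and the pending keys) lives in the client's callbacks, and that is the part a user of an event-driven parser has to write.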

With this in mind, I have written an experimental module, json.F90, that defines a container for arbitrary JSON data and provides procedures for populating the container with JSON data read from a file or a string. The procedures provide a good example of how to use YAJL-Fort. But beyond that, this module is the beginnings of a full solution for reading, writing, and working with arbitrary JSON data [1]. In this regard it is very similar to JSON-Fortran, though with much more limited functionality at this point.

By default the json.F90 module and its associated tests are not compiled. Use the CMake option -DENABLE_JSON=ON to build them. The tests/examples are located in test/json.

Your comments/suggestions are solicited.


[1] General JSON values do not actually mesh all that well with natural Fortran data structures like arrays (rectangular and of homogeneous type), and this makes working with general JSON data rather awkward. If the real need is for a JSON-like hierarchical data container with values that match Fortran data structures, then Petaca's "parameter list" is what you want; it is expressible as JSON, but it is much easier to work with.

zbeekman commented 8 years ago

Very interesting. I'll have to take a better/closer look at some point. I'm very intrigued by your Petaca library and it looks like it could replace most of my usage of JSON-Fortran, potentially. (Although, IIRC it lacks array support? Maybe I'm mistaken here...)

nncarlson commented 8 years ago

The design of the data types mirrors the JSON grammar described at http://www.json.org (with refinements to adapt to the YAJL parser, which differentiates a number as either an integer or a float). The module went through several iterations, but in the end I was taken aback at how few executable statements there were; it's mostly type and procedure specification.

It does support JSON arrays, but those aren't the same thing as Fortran arrays, which I know is what you meant. Adding a method to extract data into a Fortran array when the JSON array is conformable is certainly possible. But I'd step back and consider (for a particular application) what should be in the driver's seat: is it the JSON format or is it the ultimate data container? Or asked another way, is the application processing arbitrary JSON data, or is JSON being used to express application-defined input? If it is the latter, then I think something different, like parameter_list from Petaca, is what is really wanted; something that defines the data container the application really wants to work with and then uses (a subset of) JSON to populate it. Using parameter_list as an example, '[[1,2],[3,4]]' is a valid array value but '[[1,2],"foo"]', which is a valid JSON array, is not and gets rejected at parse time. With the alternative that supports general JSON, one doesn't know there is a problem until one accesses the data, which is later than ideal.
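As an illustration of what "conformable" means here, the following is a minimal sketch in Python (not Fortran) of a check that a JSON array is rectangular and of homogeneous scalar type. It is not the algorithm used by parameter_list or json.F90; the function names `array_shape` and `_inspect` are hypothetical, and the choice to reject empty arrays and to treat integers and floats as different types is an assumption made for the sketch.

```python
import json


def _inspect(value):
    """Return (shape, scalar type); a scalar has shape ()."""
    if not isinstance(value, list):
        return (), type(value)
    if not value:
        raise ValueError("empty array has no element type")
    first = _inspect(value[0])
    for item in value[1:]:
        # Every element must have the same shape and scalar type.
        if _inspect(item) != first:
            raise ValueError("array is ragged or of mixed type")
    shape, kind = first
    return (len(value),) + shape, kind


def array_shape(value):
    """Return the shape of a rectangular, homogeneous JSON array.

    Raises ValueError if the value could not be stored in a single
    rectangular array of one scalar type.
    """
    if not isinstance(value, list):
        raise ValueError("not an array")
    shape, _ = _inspect(value)
    return shape


print(array_shape(json.loads('[[1,2],[3,4]]')))
try:
    array_shape(json.loads('[[1,2],"foo"]'))
except ValueError as err:
    print("rejected:", err)
```

Running a check like this while the container is being populated is what moves the error to parse time; with a general JSON container the same check can only happen later, when the application asks for the data as an array.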