JuliaGraphs / GraphIO.jl

Graph IO functionality for various formats.
Other
61 stars 28 forks source link

WIP: GraphML MetaGraph support #49

Closed filchristou closed 1 year ago

filchristou commented 2 years ago

as mentioned in #47. This PR enables support for MetaGraphs in GraphML format. It follows the paradigm of http://graphml.graphdrawing.org/ to define attributes in the graph. The following features are supported:

  1. the key/data extension mechanism
  2. default values for keys
  3. loading/saving graphs or Dict of graphs

Although it's functional, I marked it as WIP because I haven't really thought too much about performance and type-stability and more importantly I think some discussion is needed in advance.

Caveats

  1. It introduces a dependency to MetaGraphs
  2. The implementation is done using XPath and DOM and not the EzXML.StreamReader for the reason that there needs to be some memory on the parser in order to remember whose are the properties read
  3. The current API is accesible only from GraphIO.GraphML and not from Graphs. The reason is that now the developer needs also to mention the desired graph type s/he wishes to parse (in this case ::MGFormat).

Discussion I know that currently there is some confusion with the architecture choices of GraphIO.jl. For example there is the problem with require as in #42. I think it's important to find a solution, in order to really take advantage of future contributions.

Regarding graph IO parsing, I see 2 cases:

  1. what file format is used
  2. what graph type must be retrieved

Up until now the 2. was not supported but it is necessary for e.g. parsing MetaGraphs. To support 2. I just added one argument to the loadgraph (which makes it inaccessible just from Graphs.jl) So, we probably also need to add this definition in Graphs.jl. (offtopic)- Wouldn't it make sense to move all the graph IO code from Graphs.jl to GraphIO.jl ?

Personally, I find it unsustainable to put everything in one package. We can't possibly support all graph types, since this will unnecessary grow the dependencies. Starting with MetaGraphs, I find it better to have it in a separate package.

For example I could put the code into something like MetaGraphsIO.jl. And probably we would need to mimic GraphIO.jl and inside the hypothetical MetaGraphsIO support all type formats (GraphML, GML, DOT, etc.). (ofc I am not sure if all type formats support additional attributes like GraphML does, so probably it will be a subset)

Plans In future work I want to include also nested graphs support for GraphML with a custom nested graph type that is still not public. Also, since GraphML attributes are type-secure, this makes it a good fit for MetaGraphsNext which I will also try to support in the future. This means that I could end up creating MyWeirdNestedGraphIO.jl and MetaGraphsNextIO.jl Before I move to it, I would really appreciate some discussion and advices. That said, I think we need to build the infrastructure to welcome future work in this direction.

Sorry for the long text :)

simonschoelly commented 2 years ago

I really have not looked much into it, but instead of creating a separate package called MetaGraphsIO.jl we could also move the functionality here to MetaGraphs.jl. Or maybe this adds to many heavy dependencies into MetaGraphs.jl

Another way would be to use Requires.jl, either here or on MetaGraphs.jl to only load this code, when both packages are used - this has some other downsides of course, it probably makes versioning more complicated, and leads to less precompiled code.

filchristou commented 2 years ago

Another way would be to use Requires.jl,

I though that currently we were searching for ways to get rid of @require. Personally I find it counter-intuitive to do using EzXML in order to have GraphML support. Ofc it would be much more intuitive to do using MetaGraphs to get support for MetaGraphsIO..

Putting MetaGraphsIO functionality inside GraphIO.jl means that maybe future contributions for supporting niche graph types will come. Are we willing to support inside GraphIO.jl all possible graph types that may appear?

we could also move the functionality here to MetaGraphs.jl

I think this scales better but then the question is why does GraphIO.jl even exist in the first place ? With the same mindset it could then just be part of Graphs.jl or SimpleGraphs.jl etc.

I think it will be the most modular solution to have MetaGraphsIO in a separate package. MetaGraphsIO.jl would implement the interfaces defined in GraphIO.jl and would possibly use some tooling from there also. In case we want to leverage MetaGraphsIO.jl from GraphIO.jl we could do in GraphIO.jl something like:

function __init__()
  @require MetaGraphs="...." @reexport MetaGraphsIO
end

MetaGraphsIO.jl would then be a well defined dependency and if I think this correctly even precompiled (not sure ?).

filchristou commented 1 year ago

this functionality is transferred in NestedGraphsIO.jl. NestedGraphsIO.jl is still an unregistred package because #51 is pending.