gephi / gexf

GEXF Format Specifications
https://gexf.net/
Creative Commons Attribution 4.0 International

Modify array strings #8

Open mbastian opened 8 years ago

mbastian commented 8 years ago

Something I forgot to mention: we've modified the array support. The 1.2 format supports a "liststring" type, which is basically an array of strings, and can parse something like foo|bar where | is the separator. This wasn't very well done, so @eduramiba improved the underlying array parsing and printing code, and we can now parse things like [foo, bar], foo,bar or even ["foo", "bar"]. We've also added new types: listboolean, listinteger, listlong, listdouble.
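To make the three accepted spellings concrete, here is a minimal Python sketch of a lenient list parser. It is not the Gephi implementation: the function names `parse_list_literal` and `parse_typed_list` and the `CONVERTERS` table are hypothetical, it assumes double quotes only and no escape sequences, and it assumes booleans are spelled `true`/`false`.

```python
def parse_list_literal(text):
    """Split a list literal such as '[foo, bar]', 'foo,bar' or '["foo", "bar"]'.

    Assumptions (not taken from the GEXF specification): only double
    quotes delimit strings, and no escape sequences are recognised.
    """
    s = text.strip()
    # Surrounding brackets are optional.
    if s.startswith("[") and s.endswith("]"):
        s = s[1:-1]
    if not s.strip():
        return []

    items = []
    plain, quoted = [], []          # characters seen outside / inside quotes
    in_quote = was_quoted = False

    def flush():
        # A quoted item keeps its content verbatim; a bare item is trimmed.
        return "".join(quoted) if was_quoted else "".join(plain).strip()

    for ch in s:
        if in_quote:
            if ch == '"':
                in_quote = False
            else:
                quoted.append(ch)
        elif ch == '"':
            in_quote = was_quoted = True
        elif ch == ",":
            items.append(flush())
            plain, quoted, was_quoted = [], [], False
        else:
            plain.append(ch)
    items.append(flush())
    return items


# Hypothetical mapping from the list types named above to Python converters.
CONVERTERS = {
    "liststring": str,
    "listboolean": lambda v: v.lower() == "true",
    "listinteger": int,
    "listlong": int,
    "listdouble": float,
}


def parse_typed_list(text, list_type):
    """Parse a list literal and convert each item according to list_type."""
    convert = CONVERTERS[list_type]
    return [convert(item) for item in parse_list_literal(text)]
```

For example, `parse_list_literal('["a, b", c]')` keeps the comma inside the quoted item, and `parse_typed_list("[1.0, 1.5]", "listdouble")` yields floats. Mixing quoted and bare items is accepted here purely as a design choice of the sketch.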

duncdrum commented 8 years ago

Ah, I was already scratching my head about this one. [foo, bar] seems most XML-esque to me. See here. Just to be clear: can these appear as XML elements, <attvalue>[foo, bar]</attvalue>, as XML attributes, <edge weight="[1.0, 1.5]"/>, or both? Should we maintain the 1.2 use of the union operator? I would be worried that in some parsers foo | bar | foo would return ["foo", "bar"].

mbastian commented 8 years ago

Yes, <attvalue>[foo, bar]</attvalue> works, but not <edge weight="[1.0, 1.5]"/>, as the weight has to be a number. Basically, lists are only allowed in an attvalue or in the default value of an attribute. We've actually already dropped support for the union separator (|), as it was a pain to make that configurable, so we should just stick to commas.
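The rule above (lists only in an attvalue or an attribute default, never in an edge weight) can be sketched with Python's standard `xml.etree.ElementTree`. The fragment is made up for illustration and omits the GEXF namespace. It carries the list in the attvalue's `value` attribute, which is assumed here to be the schema's form; adjust if your files carry it as element text. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; not taken from a real GEXF file.
GEXF_FRAGMENT = """
<gexf>
  <graph>
    <attributes class="node">
      <attribute id="0" title="tags" type="liststring">
        <default>[untagged]</default>
      </attribute>
    </attributes>
    <nodes>
      <node id="n0">
        <attvalues><attvalue for="0" value="[foo, bar]"/></attvalues>
      </node>
      <node id="n1"/>
    </nodes>
    <edges>
      <edge source="n0" target="n1" weight="1.5"/>
    </edges>
  </graph>
</gexf>
"""


def raw_list_values(xml_text):
    """Return {node id: {attribute id: raw list literal}} for list-typed attributes."""
    root = ET.fromstring(xml_text)
    # Collect list-typed attribute declarations together with their defaults.
    defaults = {}
    for attr in root.iter("attribute"):
        if attr.get("type", "").startswith("list"):
            defaults[attr.get("id")] = attr.findtext("default")
    result = {}
    for node in root.iter("node"):
        values = dict(defaults)  # start from the declared defaults
        for av in node.iter("attvalue"):
            if av.get("for") in defaults:
                values[av.get("for")] = av.get("value")
        result[node.get("id")] = values
    return result


def edge_weights(xml_text):
    """Edge weights must be plain numbers, so float() is enough here."""
    root = ET.fromstring(xml_text)
    return [float(edge.get("weight")) for edge in root.iter("edge")]
```

Node `n1` has no attvalue, so it falls back to the declared default; a weight such as `[1.0, 1.5]` would make `float()` raise `ValueError`, matching the restriction described above.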

Yomguithereal commented 9 months ago

Sorry to barge in on a 2016 issue, but I am only now noticing this new format for list attribute values because some people need it in graphology, and I must say I am a little surprised that it is so fuzzy and not just a well-known format such as a JSON array. You basically need to implement a peculiar string literal parser (the possible absence of quotes being the most damning part) to support all the possible cases here, including legacy files. What's more, the specification does not really say whether you can mix and match forms, nor what to support regarding escaping within the string literals (hex or octal notation, etc.).

Should we maybe restrict this a bit more?
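As one illustration of what a stricter rule could look like, the sketch below accepts only JSON arrays, using Python's standard `json` module. This is a possible restriction for discussion, not something the specification or Gephi mandates; `parse_strict_list` is a hypothetical name.

```python
import json


def parse_strict_list(text):
    """Accept only a JSON array; reject bare or unquoted forms.

    json.loads raises json.JSONDecodeError (a ValueError subclass) on
    input such as '[foo, bar]' or 'foo,bar'.
    """
    value = json.loads(text)
    if not isinstance(value, list):
        raise ValueError("expected a JSON array, got %s" % type(value).__name__)
    return value
```

With this rule, `["foo", "bar"]` parses and escaping follows the JSON string grammar, while `[foo, bar]` and `foo,bar` are rejected, so legacy files would need a separate fallback path.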