graph storage format - Githubissues

jovo commented 8 years ago

i believe we decided the new graph storage format would be a pair of files:

1) an edge list
2) a json file, with some spec that disa/will would decide upon and then sign off on?

@icoming @disa-mhembere @willgray @gkiar

wrgr commented 8 years ago

Thank you for the reminder. @disa-mhembere - do you want to have a quick call? Or chat on Monday? @gkiar - arguably this solves our current ndmg issue?

Do we have a spec for an attributed edge list somewhere? Roughly speaking, I like an (n+1)x(m+2) csv file, where n is the number of edges and m is the number of (optional) attributes. The header row and the two vertex columns account for the final (n+1)x(m+2) size.

I think Disa has a format already implemented that converts nicely to/from graphml and other formats.

For the JSON spec, it would be nice to have node ids, node names, edge ids, edge names, and metadata about the graph, to start.
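A minimal sketch of what such a JSON sidecar could look like, using only Python's standard `json` module. No spec had been agreed on at this point in the thread, so every key name here (`node_ids`, `node_names`, `edge_ids`, `edge_names`, `metadata`) and the helper `build_graph_metadata` are hypothetical, chosen only to mirror the fields listed above.

```python
import json


def build_graph_metadata(node_names, edges, edge_names=None, **graph_metadata):
    """Assemble a JSON-serializable description of a graph.

    Hypothetical layout: ids are simply positions in the input lists, and
    edge names default to "<source name>-<target name>".
    """
    node_ids = list(range(len(node_names)))
    edge_ids = list(range(len(edges)))
    if edge_names is None:
        edge_names = [f"{node_names[u]}-{node_names[v]}" for u, v in edges]
    return {
        "node_ids": node_ids,
        "node_names": list(node_names),
        "edge_ids": edge_ids,
        "edge_names": list(edge_names),
        "metadata": graph_metadata,  # free-form information about the graph
    }


meta = build_graph_metadata(
    ["region_a", "region_b", "region_c"],
    [(0, 1), (1, 2)],
    directed=False,
    source="example",
)

# Round-trip through text to confirm the structure survives serialization.
text = json.dumps(meta, indent=2, sort_keys=True)
restored = json.loads(text)
```

Keeping the ids and names in the JSON file rather than in the edge list would let the edge list stay a plain numeric table, which is one possible reason for splitting the format into a pair of files.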

Row 1 -> header
Row 2:(n+1) -> vertex1, vertex2, attribute1, attribute2, attribute3, ...
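The row layout above can be sketched with Python's standard `csv` module. This is only an illustration of the proposed (n+1)x(m+2) shape, not the format Disa already has implemented. The function names `write_edge_list` and `read_edge_list` and the example attributes `weight` and `kind` are assumptions made for the example.

```python
import csv
import io


def write_edge_list(fh, edges, attribute_names=()):
    """Write one header row, then one row per edge: vertex1, vertex2, attributes."""
    writer = csv.writer(fh)
    writer.writerow(["vertex1", "vertex2", *attribute_names])
    for v1, v2, *attrs in edges:
        if len(attrs) != len(attribute_names):
            raise ValueError("each edge needs exactly one value per attribute")
        writer.writerow([v1, v2, *attrs])


def read_edge_list(fh):
    """Return (attribute_names, edges); all values come back as strings."""
    reader = csv.reader(fh)
    header = next(reader)
    attribute_names = header[2:]
    edges = [
        (row[0], row[1], dict(zip(attribute_names, row[2:])))
        for row in reader
    ]
    return attribute_names, edges


# n = 3 edges and m = 2 attributes, so the file should be 4 rows by 4 columns.
edges = [(0, 1, 0.5, "fiber"), (1, 2, 0.25, "fiber"), (0, 2, 1.0, "fiber")]
buf = io.StringIO()
write_edge_list(buf, edges, attribute_names=["weight", "kind"])

rows = buf.getvalue().splitlines()
buf.seek(0)
names, parsed = read_edge_list(buf)
```

With m = 0 the same code produces a plain two-column edge list, which matches the attributes being optional.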

disa-mhembere commented 8 years ago

This is almost what I had in mind. I think we should chat on Monday after the meeting. Sound ok?

jovo commented 8 years ago

yup

gkiar commented 8 years ago

cross referenced in ndmg

jovo commented 8 years ago

i believe this is finalized. are we doing it now? can i see an example?

for the DARPA talk on 4/4, would i be able to see some benchmarks for this? eg, speed reading/writing, compression reading/writing? or is that not interesting?
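One way the read/write and compression comparison asked about here could be measured is sketched below, using only the Python standard library. It is not the project's benchmark: the function name `benchmark_edge_list`, the synthetic random graph, and the choice of gzip as the compression scheme are all assumptions, and the timings it reports depend entirely on the machine it runs on.

```python
import csv
import gzip
import os
import random
import tempfile
import time


def benchmark_edge_list(n_edges=10000, seed=0):
    """Time writing and reading a synthetic edge list, plain and gzip-compressed."""
    rng = random.Random(seed)
    rows = [
        (rng.randrange(1000), rng.randrange(1000), rng.random())
        for _ in range(n_edges)
    ]
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        variants = (
            ("plain", open, "edges.csv"),
            ("gzip", gzip.open, "edges.csv.gz"),
        )
        for label, opener, name in variants:
            path = os.path.join(tmp, name)

            start = time.perf_counter()
            with opener(path, "wt", newline="") as fh:
                writer = csv.writer(fh)
                writer.writerow(["vertex1", "vertex2", "weight"])
                writer.writerows(rows)
            write_seconds = time.perf_counter() - start

            start = time.perf_counter()
            with opener(path, "rt", newline="") as fh:
                edges_read = sum(1 for _ in csv.reader(fh)) - 1  # skip header
            read_seconds = time.perf_counter() - start

            results[label] = {
                "write_seconds": write_seconds,
                "read_seconds": read_seconds,
                "bytes": os.path.getsize(path),
                "edges_read": edges_read,
            }
    return results


results = benchmark_edge_list(n_edges=2000)
for label, stats in results.items():
    print(label, stats)
```

A real comparison would presumably use actual connectome graphs and the parallel ingest path rather than a single-process synthetic run.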

disa-mhembere commented 8 years ago

No, unfortunately. I have not had time to complete all the interfaces to make this happen. I have parallel ingests working, but nothing is tested on the live services. This will be completed only after supercomputing.

neurodata / ndgrutedb

graph storage format #215