Packed Compressed Sparse Row (PCSR)

marvin-hansen commented 1 year ago

Any aspirations to add PCSR, maybe as an optional graph representation?

From the article: "...PCSR was orders of magnitude faster for inserts and updates than CSR and adjacency list while maintaining similar graph traversal times."

http://supertech.csail.mit.edu/papers/WheatmanXu18.pdf

PCSR Code: https://github.com/wheatman/Packed-Compressed-Sparse-Row

IMHO, space (memory) is easier to solve than the time it takes to generate a graph from a large dataset.

I'm generating currently a number of hyper graphs that usually reach between half a million and 2 million vertexes & edges. My current worst case runtime is in the ballpark of 15 minutes. These graphs are based on 10% data sample size, the non-sampled data would be more in the ballpark of 30 to 50 million vertices and edges per graph hence the insert time becomes a bit more critical.

Thoughts?

yamafaktory commented 1 year ago

Hey @marvin-hansen ,

Thanks for the feedback!

First of all I'm glad to hear that the project is actually used.

Currently this is how things are implemented internally https://github.com/yamafaktory/hypergraph/blob/38bc981a5d95776d2e46b5c4e723a2374a7820e1/src/core/mod.rs#L65 (IndexMap and IndexSet).

I'll have a look at the paper and the code you've shared. It's very likely a complete different approach.

marvin-hansen commented 1 year ago

Thank you @yamafaktory

also thank you for updating the hasher to aHash, it is indeed a bit faster.

I actually build an entire prototype atop of hypergraph to test out some complex spacetime modeling.

That said, the IndexMap & IndexSet implementation has become quite a bottleneck for graphs > 1 million nodes. However, my vertex objects are relatively large so I am working currently on keeping the graph to track relations and move all vertex objects into a separate in-memory object store to see if that performs better.

However, I acknowledge that I am working on a relative corner case as my models tend to grow large.

And yes, the packed CSR implementation is very different as it relies on a modified sparse matrix representation internally.

Not sure if its feasible to implement as an alternative to the existing IndexMap, but would certainly expand the applicability of your hyper-graph crate to use cases with billions of nodes without bumping into performance or memory issues.

yamafaktory commented 1 year ago

@marvin-hansen thanks for the feedback!

Indeed performance is something we should care about. I'm definitely interested in your experiments.

I'm keeping this issue open and will try to figure out ways for improvement.

marvin-hansen commented 1 year ago

@yamafaktory Thank you for your consideration.

I've created some basic benchmark to illustrate the issue. It is not scientifically sound, but confirms & isolates what I see in my application. With more nodes & adding edges that are about 5 x the number of nodes, runtime gets exponentially worse. I've excluded this scenario because it's not fun to measure that runtime.

https://github.com/marvin-hansen/rust_graph_bench

When adding 5 million nodes, Hypergraph takes about 2 seconds whereas a CSR graph implementation does the same in under 30 milliseconds. If the claims in the publication are true, the PCSR implementation should add 5 million nodes in about 5 - 7 ms.

It is worth mentioning that memory usage goes up quite heavily when adding 50 million nodes. Beyond 50 million, Hypergraph cannot handle it on a normal machine and runs out of memory with >20 GB allocated.

That gives a rough approximation of the scope & magnitude of the problem.

It might be fair to add a note to the Readme that this crate is really good for graphs up to a million nodes, but for very large graphs, people better benchmark their use-case to determine if its the right solution.

yamafaktory commented 1 year ago

Hi @marvin-hansen

Thanks for the feedback!

There are some benches in the repo but none regarding insertion.

The main difficulty is that I need a way to refactor the project to optimize performance while keeping in mind that this is about hypergraph not a normal graph implementation - i.e. PCSR is about graph only. I'll try to figure that out soon.

marvin-hansen commented 1 year ago

@yamafaktory

AFAIK, a hypergraph can be mapped to a bi-parite graph and that means there already is an adjacency matrix representation. The graphics below shows how to map N vertices to M edges. The big question though, is how to adapt (P)CSR in such a way that it fits that model.

https://www.semanticscholar.org/paper/Evaluation-and-Improving-Hypergraph-Based-Learning-Augusty-Izudheen/5fb74e5a43a5d6d2dcb58bf0ef0065b009bebb0c/figure/0

yamafaktory commented 1 year ago

@marvin-hansen thanks for the link.

Indeed, but this is no longer a directed hypergraph. I'll dig into that topic.

yamafaktory commented 1 year ago

@marvin-hansen one important point: you should use with_capacity instead of new for your benches. This should make a huge difference by avoiding reallocation.

marvin-hansen commented 1 year ago

@yamafaktory

Thank you, I updated the benchmark. A few observations. 1) Hypergraph runs about 15% faster (50 to 60 ms) with pre-allocation. 2) MatrixGraph runs a full order of magnitude slower with pre-allocation indicating a serious bug in the implementation. 3) The CSR implementation doesn't have a with_capacity constructor.

The main take away really is that the initial assessment largely remains the same, but I have to investigate what went wrong with MatrixGraph & probably fill a bug report.

yamafaktory / hypergraph

Packed Compressed Sparse Row (PCSR) #41