rapidsai / cuhornet

BSD 3-Clause "New" or "Revised" License
25 stars 25 forks source link

Unable to remove duplicated edges in batchUpdate #64

Open ax7e opened 1 year ago

ax7e commented 1 year ago

Thanks for your great work! I read throw the batchUpdate functions. I'm confused about the following facts:

  1. Does hornet keeps the nodes inside each chunk sorted? If so, why batch update will finish with out furthur sorting?

https://github.com/rapidsai/cuhornet/blob/ab70d14a562bdaa950e820b412fe827c570c0ca3/hornet/include/Core/BatchUpdate/BatchUpdate.i.cuh#L873-L902

In the above code, the writer seems to assume that the edges are not sorted.

https://github.com/rapidsai/cuhornet/blob/ab70d14a562bdaa950e820b412fe827c570c0ca3/hornet/include/Core/HornetOperations/HornetInsert.i.cuh#LL12C1-L27C2

The code do don't sort the edges after batch update.

  1. If the edges are don't sorted, actually the remove_graph_duplicates function is wrong. https://github.com/rapidsai/cuhornet/blob/ab70d14a562bdaa950e820b412fe827c570c0ca3/hornet/include/Core/BatchUpdate/BatchUpdateKernels.cuh#L56-L59

It uses xlib::lower_bound_left, which requires the array to be sorted.

  1. More Importantly, the delete function also assumes the chunks are sorted, but they are not.

https://github.com/rapidsai/cuhornet/blob/ab70d14a562bdaa950e820b412fe827c570c0ca3/hornet/include/Core/BatchUpdate/BatchUpdateKernels.cuh#L293-L296C6

ax7e commented 1 year ago

@BradReesWork @bdice

bdice commented 1 year ago

Hi, unfortunately I won't be much help here. I have done some packaging work for this library but am unfamiliar with its internals. I believe this library is also scheduled for deprecation? cc: @rlratzel @ChuckHastings

ChuckHastings commented 1 year ago

@ax7e - thanks for your interest.

This repository is a copy of the original hornet repository from Georgia Tech (https://github.com/hornet-gt/hornet). We copied the original repository so that we could use a handful of features from this library and integrate them into the NVIDIA RAPIDS cugraph library (https://github.com/rapidsai/cugraph). At the moment, the only remaining use of the hornet code base within cugraph is the implementation of k-truss).

Our support for this copy over the last 4 years has entirely been limited to updating things to compile for the subset of algorithms that we are using... which is now only k-truss.

The cugraph library will be phasing out use of cuhornet later this year and will drop support for this. I don't believe that Georgia Tech is actively working on this library either (at least I see no updates to the repository since 2019). @ogreen might have some insight into that.

If you are interested in using cugraph and/or collaborating with us, I certainly encourage that. We have an active team of developers working on the cugraph library. The big thing that hornet offers that cugraph does not is support for dynamic graphs. We do intend to add support for dynamic graphs (it is on our long-term road map), but our work at the moment focuses on static graph analysis and Graph Neural Networks.

If dynamic graphs are important to you, you can reach out to us and we'd be happy to offer suggestions on how to make use of cugraph, and perhaps you can help inform our plans for adding dynamic graph support in the future.

ax7e commented 1 year ago

@ChuckHastings Thanks for your comment!. I'm truely very interested in dynamic graphs. It poses a unique challenge on load balancing and memory coaleasing. With the rapid development of LLM, the GPU memory is becoming sufficiently big that it could fits amost all graph streaming workloads. I'm truely happy to discuss about what I can do in adding dynamic graph support in the future.