Closed hhromic closed 7 years ago
Hi Hugo, there is a development version of SNAP that uses 64 ints for node and other ids and can deal with graphs with more than 1.5 billion nodes: https://github.com/snap-stanford/snap-dev-64
Hi @blazf ,
Thanks for your answer. I didn't know about that effort and went to check it. Indeed it seems to use TUInt64
instead of TInt
, which is good news. However I noticed that it is still using signed integers. I understand that this is mostly to support using negative ids for marking nodes as invalid or "not initialised". While this is a valid design choice, I think the cost of sacrificing half of the integer range for ids is too high for this purpose, as there are other alternatives. For example using a "valid" flag.
In my opinion this is an unnecessary limitation over the ids, but I understand it is not something easy to change or challenge. I think working out this limitation would be important for the scalability of the library. I would gladly help investigating this but I have zero knowledge of the inner workings of SNAP, hence I don't fully understand the design choice of signed integers for ids.
Thanks anyway! Hugo.
Hi ! I've been using SNAP for a while until recently that I "discovered" that the library uses signed 32-bits integers for Node Ids (
TInt
). This introduces an important limitation for my research because I'm using 64-bits unsigned integers for user ids (i.e. user ids from Twitter). For now I resorted to using a re-indexing + mapping strategy which costs memory and a bit of performance for bi-directional lookups.Are there any plans in near future to move to unsigned 64-bits integers for node ids? I checked the source code and SNAP seems to rely on negative integers and in some parts id arguments are hardcoded to
const int&
(instead ofTInt
or something better likeTNodeId
). So I see this change might not be as straight-forward and I understand that.I think further abstracting the node id type (i.e. a
typedef
) and not depending on negative numbers would open the full range of numbers assignable and thus make the library more powerful for big graphs. Any comments are highly appreciated!.Thanks!