Closed rohitvarkey closed 7 years ago
Yes I think the counts are also useful. Although that is going to be length nv when only a small fraction of the values are nonzero so the documentation should mention that the counts array is overallocated and on the values up to the first 0 are valid. You could could extract the actual counts with
for k in 1:nv(g)
if counts[k] == 0
break
end
end
counts[1:k]
Actually this is a good use of the two level api. a kcore!(graph, labels, counts)
function which does in place manipulation and a higher level which allocates labels and counts and returns them.
The higher level interface should return counts[1:k]
but the low level interface should return labels and counts without dropping the 0s so that counts can be reused.
I was trying to follow the same semantics we offered in bfs
. The plan was to make the application keep track of the number of vertices that have been inserted so far. Providing that number would allocate the optimal amount of memory.
function kcore(s::Stinger, nv::Int64)
labels = zeros(Int64, nv)
counts = zeros(Int64, nv)
....
In these cases, we don't need to run a loop to extract the counts
.
counts array is overallocated and on the values up to the first 0 are valid. You could could extract the actual counts with
This can lead to issues if the user makes the mistake of starting vertex IDs with 1 instead of 0, as the first element in count will be a 0 and an empty Array will be returned. I think that is quite confusing from the perspective of a user.
I do agree that a two level API will be useful. But the levels I'd propose are:
Higher level - function kcore(s::Stinger, nv::Int64)
- This allocates the labels
and counts
as the specified number of vertices.
Lower level - function kcore!(s::Stinger, labels::Array{Int64}, counts::Array{Int64})
- This performs the ccall
and populates labels
and counts
.
We could also have another function kcore(s::Stinger)
which overallocates with the maximum number of vertices possible, calls the lower-level API and then performs the extraction of counts and labels as proposed.
I agree your design is correct.
I decided to make a call to get_nv
and find the maximum active vertex first to allocate only the required memory, rather than allocating the memory for the maximum vertices for labels and counts and then iterating through that to find out the max and returning only that part.
K-core implementation which basically just calls the STINGER
kcore_find
function and returns thelabels
array.@jpfairbanks Should I return the
counts
array also here?