snap-stanford / snap

Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library.

Seed node2vec random walks from clock instead of current time. #236

Open phoeinx opened 1 year ago

phoeinx commented 1 year ago

Not 100% sure how the contributing guidelines for this repository work, so please forgive me if this is not the exact right way.

This PR would be superseded by #191 from @maxaalexeeva, which allows passing a seed to node2vec as a parameter. In the long run, I think @maxaalexeeva's PR would benefit both people who need node2vec to be deterministic and those who want it to be consistently non-deterministic. This PR introduces a minor change that makes node2vec consistently non-deterministic by switching to a different way of generating the seed, one that is already used in other places throughout the project.

Calls to time(NULL) return the current calendar time in whole seconds since 1 January 1970. Node2vec runs that are started within the same second (e.g. by an automated script) therefore get the same seed and produce the same embedding. So that embeddings from automated runs can be compared, this commit seeds from TSysTm::GetPerfTimerTicks() instead, as is already done in randwalk.cpp, for example.
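To illustrate the collision, here is a minimal standalone sketch. It does not use SNAP: the two launch timestamps are made up, SeedFromSeconds and SeedFromTicks are hypothetical helpers that only mimic the resolution of time(NULL) and of a finer-grained tick counter, and std::mt19937_64 stands in for whatever generator node2vec actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

constexpr std::uint64_t kNsPerSecond = 1000000000ULL;

// Hypothetical helper: mimics time(NULL) by truncating a nanosecond
// timestamp to whole seconds.
constexpr std::uint64_t SeedFromSeconds(std::uint64_t ns) {
  return ns / kNsPerSecond;
}

// Hypothetical helper: mimics a tick-based seed by keeping the full
// resolution of the timestamp.
constexpr std::uint64_t SeedFromTicks(std::uint64_t ns) {
  return ns;
}

// First value produced by a generator seeded with `seed`.
// std::mt19937_64 is a stand-in here, not the generator SNAP uses.
inline std::uint64_t FirstDraw(std::uint64_t seed) {
  std::mt19937_64 rng(seed);
  return rng();
}

// Two made-up launch instants, 1 ms apart, inside the same second.
inline void CheckSameSecondLaunches() {
  const std::uint64_t run_a = 1700000000250000000ULL;
  const std::uint64_t run_b = run_a + 1000000ULL;

  // Second resolution: both runs get the same seed, so the same stream.
  assert(SeedFromSeconds(run_a) == SeedFromSeconds(run_b));
  assert(FirstDraw(SeedFromSeconds(run_a)) == FirstDraw(SeedFromSeconds(run_b)));

  // Tick resolution: the seeds differ.
  assert(SeedFromTicks(run_a) != SeedFromTicks(run_b));
}
```

Calling CheckSameSecondLaunches() from a test or from main exercises all three checks. The sketch only shows that a second-resolution seed cannot tell the two runs apart; it makes no claim about the actual resolution of TSysTm::GetPerfTimerTicks().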