phanein / deepwalk

DeepWalk - Deep Learning for Graphs
http://www.perozzi.net/projects/deepwalk/

Has the data set BlogCatalog been processed? #34

Closed Junshuai-Song closed 7 years ago

Junshuai-Song commented 7 years ago

Has the data set BlogCatalog been processed?

The BlogCatalog data set provided here differs from the one at http://socialcomputing.asu.edu/datasets/BlogCatalog3, which is the version used by node2vec.

Junshuai-Song commented 7 years ago

The label attributes in the two data sets are different (the network itself seems to be the same).

GTmac commented 7 years ago

We use the original dataset -- the only difference is that the indexing of nodes and groups starts from 0, instead of 1.

For example, the original nodeId-groupId pair (1, 21) corresponds to (0, 20) in the mat file we provide.
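The index shift described above can be sketched in a few lines of Python. This is only an illustration, not code from the repository: the helper name `to_zero_based` and every sample pair other than (1, 21) are made up for the example.

```python
def to_zero_based(pairs):
    """Shift 1-based (node_id, group_id) pairs to 0-based indices."""
    shifted = []
    for node_id, group_id in pairs:
        # The original files number nodes and groups from 1,
        # so anything below 1 indicates unexpected input.
        if node_id < 1 or group_id < 1:
            raise ValueError(f"expected 1-based ids, got ({node_id}, {group_id})")
        shifted.append((node_id - 1, group_id - 1))
    return shifted


# Hypothetical sample rows; only (1, 21) comes from the comment above.
original_pairs = [(1, 21), (2, 5), (3, 21)]
zero_based = to_zero_based(original_pairs)
print(zero_based[0])  # the pair (1, 21) becomes (0, 20)
```

When comparing labels between the two copies of the data set, applying this shift to one side first should make the pairs line up.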

phanein commented 7 years ago

I haven't looked at the node2vec file, but we used the dataset that Lei Tang used for his social dimension work, in order to facilitate comparisons.

[ available at http://leitang.net/social_dimension.html ]


Junshuai-Song commented 7 years ago

Thank you for your reply! I have found the reason. I had downloaded the files from http://socialcomputing.asu.edu/datasets/BlogCatalog3 and simply renamed them from .csv to .txt, and that caused the mistakes.
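The thread does not say exactly which mistakes the rename caused. One plausible cause, stated here as an assumption, is that renaming leaves the comma delimiters in place, so a reader that expects whitespace-separated pairs misparses each line. A minimal sketch of converting the content rather than the suffix follows; the function name `csv_edges_to_edgelist` and the sample rows are hypothetical.

```python
import csv
import io


def csv_edges_to_edgelist(csv_text):
    """Convert comma-separated 'src,dst' rows to space-separated lines."""
    reader = csv.reader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) < 2:
            raise ValueError(f"expected two fields per row, got {row!r}")
        src, dst = row[0].strip(), row[1].strip()
        lines.append(f"{src} {dst}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data standing in for the downloaded CSV file.
sample_csv = "1,2\n1,3\n2,3\n"
edgelist_text = csv_edges_to_edgelist(sample_csv)
print(edgelist_text, end="")
```

The same function could be applied to the text of a real file read with `open(...).read()`, with the result written to the `.txt` path.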