Author
Shiyu Chang, Wei Han, Jiliang Tang, Guo-Jun Qi, Charu C. Aggarwal, and Thomas S. Huang
Published
KDD 2015
Why
This paper introduces how to transform input data in a highly non-linear manner and applies the method to heterogeneous networks, which contain a wide variety of node and edge types. Non-linearity is crucial for embedding the data effectively into a lower-dimensional vector space.
What
This paper uses deep learning to properly capture features on a heterogeneous network.
How
Not using deep learning
The goal is to project a heterogeneous network into a common latent space.
For clarity, suppose a heterogeneous network whose nodes are images and texts.
First, we consider how to extract meaningful features from each node type (images and texts). We apply a non-linear transformation as follows, where x denotes an image feature vector and z denotes a text feature vector.
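The transformation formulas did not survive the write-up. As a minimal sketch, assuming a single tanh layer per modality (the weight shapes and values here are made up for illustration, not taken from the paper):

```python
import math

def embed(v, W, b):
    """Map a raw feature vector v into the common latent space via one
    non-linear layer: tanh(W v + b)."""
    return [math.tanh(sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i)
            for row, b_i in zip(W, b)]

# Toy example: an image vector x (3-dim) and a text vector z (2-dim)
# are both mapped into a shared 2-dimensional latent space.
W_img = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.1]]
W_txt = [[0.2, 0.7], [-0.3, 0.5]]
b = [0.0, 0.0]

x_tilde = embed([1.0, 0.5, -1.0], W_img, b)  # image embedding
z_tilde = embed([0.8, 0.2], W_txt, b)        # text embedding
print(len(x_tilde), len(z_tilde))            # both live in the same space
```

The key point is that both modalities end up in one latent space, so they become directly comparable.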
Similarity measures are defined as follows.
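The similarity formulas are likewise missing here. A natural choice consistent with a common latent space, written as an assumption rather than copied from the paper, is the inner product of the embedded vectors:

```latex
s(x_i, x_j) = \tilde{x}_i^{\top} \tilde{x}_j, \qquad
s(z_i, z_j) = \tilde{z}_i^{\top} \tilde{z}_j, \qquad
s(x_i, z_j) = \tilde{x}_i^{\top} \tilde{z}_j
```

where $\tilde{x}$ and $\tilde{z}$ are the non-linearly embedded image and text vectors, so the same measure works within and across modalities.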
The next step is to define a loss function and minimize it.
Here A is the adjacency matrix, and d is a decision function built from the similarity measures above.
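As a hedged sketch of the missing formula, one standard pairwise loss matching this description is the logistic loss, where the label A_ij ∈ {−1, +1} records whether nodes i and j are linked:

```python
import math

def pairwise_loss(a_ij, d_ij):
    """Logistic pairwise loss: small when the sign of the decision value
    d_ij agrees with the link label a_ij in {-1, +1}."""
    return math.log(1.0 + math.exp(-a_ij * d_ij))

# A linked pair predicted similar gives a low loss; an unlinked pair
# predicted similar gives a high loss.
print(round(pairwise_loss(+1, 2.0), 3))  # ≈ 0.127
print(round(pairwise_loss(-1, 2.0), 3))  # ≈ 2.127
```

Minimizing this loss pushes linked nodes together and unlinked nodes apart in the latent space, which is exactly the behavior the summary describes.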
Once the loss function is set, we define the overall cost function by aggregating the loss over all pairs. N is the number of data points, and λ is a regularization parameter that balances the terms of the objective.
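Putting the pieces together, the overall objective might look like the following sketch: the pairwise loss averaged over all pairs plus a λ-weighted L2 penalty on the projection weights (the exact form in the paper may differ):

```python
import math

def pairwise_loss(a_ij, d_ij):
    # Logistic pairwise loss, as sketched above.
    return math.log(1.0 + math.exp(-a_ij * d_ij))

def total_cost(A, D, lam, weights):
    """Average pairwise loss over all N*N pairs plus an L2 penalty on
    the projection weights, balanced by lambda."""
    n = len(A)
    loss = sum(pairwise_loss(A[i][j], D[i][j])
               for i in range(n) for j in range(n)) / (n * n)
    reg = sum(w * w for row in weights for w in row)
    return loss + lam * reg

A = [[+1, -1], [-1, +1]]       # link labels
D = [[+1, -1], [-1, +1]]       # decision values (perfectly aligned)
print(round(total_cost(A, D, lam=0.1, weights=[[0.5, -0.2]]), 3))  # ≈ 0.342
```

Even with perfectly aligned predictions, the cost stays above zero because of the averaged logistic loss and the regularization term.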
See the paper for more details.
Using deep learning
We use deep learning to map the heterogeneous data into a common space. Deep learning allows us to extract more complex features and achieve higher performance. For images we use a CNN, while for texts we use tf-idf features.
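The tf-idf side is easy to illustrate. The snippet below computes one standard variant in pure Python; smoothing conventions differ across libraries and this may differ from the paper's exact preprocessing:

```python
import math

# Toy corpus of tokenized documents.
docs = [["deep", "network", "embedding"],
        ["deep", "learning"],
        ["text", "network"]]

def tf_idf(term, doc, docs):
    """tf-idf with raw term frequency and idf = log(N / df)."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in docs if term in d)
    idf = math.log(len(docs) / df)
    return tf * idf

print(round(tf_idf("deep", docs[0], docs), 4))       # ≈ 0.1352 (in 2 of 3 docs)
print(round(tf_idf("embedding", docs[0], docs), 4))  # ≈ 0.3662 (in 1 of 3 docs)
```

Rarer terms score higher, which is why tf-idf vectors make a reasonable raw text representation to feed into the embedding network.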