alanyuchenhou / elephant


representation #2

Closed alanyuchenhou closed 8 years ago

alanyuchenhou commented 8 years ago

A neural net typically accepts a vector (an n-dimensional array of numbers) as input. Therefore, a process needs to convert the graph into vectors before the neural net can learn and predict: every node, every edge, and every property needs to be a vector of fixed dimension.

An adjacency matrix (a 2-D array) can represent the topology and edge weights, but the matrix is not invariant under different node orderings and might confuse the neural net.
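
A minimal numpy sketch of that non-invariance (the graph and weights here are made up for illustration): relabeling the nodes of the same graph yields a different flattened adjacency matrix, so a plain feed-forward net would see two different inputs for one and the same graph.

```python
import numpy as np

# A 3-node weighted graph: edges (0, 1) with weight 0.5 and (1, 2) with weight 2.
adjacency = np.array([
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 2.0],
    [0.0, 2.0, 0.0],
])

# Relabel the nodes: new node 0 is old node 2, new node 1 is old node 0, etc.
permutation = np.array([2, 0, 1])
P = np.eye(3)[permutation]
relabeled = P @ adjacency @ P.T

# Same graph, different input vector for a naive neural net.
print(adjacency.flatten())
print(relabeled.flatten())
print(np.array_equal(adjacency.flatten(), relabeled.flatten()))  # False
```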

alanyuchenhou commented 8 years ago

Distributed Representations of Sentences and Documents: problem: defined in its reference papers... approach: convert every word, sentence, and paragraph into a vector using a neural net, then predict something with some classifier.

alanyuchenhou commented 8 years ago

The representation seems to be the key to every door - as soon as we have a good representation method for all the things we care about, we can train a neural net to predict many things. The representation is essentially the same concept as the "feature vector" (which is usually engineered out of the object's raw intrinsic attributes).

For a long time, I thought the most powerful capability of a neural net was that it can use raw intrinsic attributes (e.g., pixels in image recognition #6, spectrograms in speech recognition #5) so that no feature engineering is needed, but now I think its capability of representing things is far more powerful: it can represent concepts, i.e., abstract things without intrinsic attributes (like good and evil). These concepts have no intrinsic attributes but have relations with each other (Jedi are good; Sith are evil; good and evil are opposites; so are Jedi and Sith). In a sense, these concepts are defined by their relations.

Among all the deep learning applications I've seen so far, the most inspiring one is natural language processing, because it uses a very simple and effective approach to represent concepts: every concept is a vector of numbers, a point in a concept space. Most representations of the same concept are meaningless, providing no information about the concept (like 狗, الكلب, dog, [0, 0, 1, 0, 0...0], etc. - different representations of the same concept in Chinese, Arabic, English, and a one-hot encoding). However, a neural net can represent concepts in its concept space so that their representations become meaningful: relations between concepts become geometric relations between their vectors (like vector(king) - vector(man) + vector(woman) = vector(queen)).
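
A quick illustration of that geometric relation, sketched with gensim's pretrained GloVe vectors (the model name "glove-wiki-gigaword-50" and the exact neighbours returned are assumptions; any pretrained word-embedding model would do):

```python
import gensim.downloader as api

# Small pretrained word vectors, downloaded on first use.
vectors = api.load("glove-wiki-gigaword-50")

# vector(king) - vector(man) + vector(woman) should land near vector(queen).
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```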

The most interesting part is that the neural net does not get these vector representations from anywhere external - it can learn them in an unsupervised way. The only thing we need to do is show it the relations between these concepts through a large corpus of text (e.g., given: cheetahs move fast; cheetahs have high speed; cheetahs are predators..., the neural net can represent cheetahs, speed, fast, and predators with meaningful vectors).
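
A toy sketch of that unsupervised learning with gensim's Word2Vec (gensim 4 API assumed; the corpus is far too small to produce good vectors and only shows the workflow):

```python
from gensim.models import Word2Vec

# A tiny corpus stating relations between concepts; real corpora have millions of sentences.
corpus = [
    ["cheetahs", "move", "fast"],
    ["cheetahs", "have", "high", "speed"],
    ["cheetahs", "are", "predators"],
    ["lions", "are", "predators"],
    ["lions", "move", "fast"],
]

# Learn one vector per word purely from co-occurrence; no labels involved.
model = Word2Vec(corpus, vector_size=16, window=2, min_count=1, epochs=200, seed=0)

print(model.wv["cheetahs"])                        # the learned 16-dimensional representation
print(model.wv.most_similar("cheetahs", topn=3))   # nearest concepts in the learned space
```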

This capability of concept representation provides a new approach to graph mining for abstract networks where entities cannot be represented well by their intrinsic attributes. For example, symptoms, diseases, and patients can be viewed as concepts in a diagnosis network: they are not defined by their intrinsic attributes, but by their relations. Can a neural net learn good representations for them given sufficient training data? After that, can it do something like: given a patient and a sequence of symptoms, predict the disease? For another example, users, activities, hobbies, and advertisements can be viewed as concepts in a social network. Can a neural net represent them and predict the advertisement a user will like given their activities?

ghost commented 8 years ago

Good observations. As we discussed in our meeting, with graphs as the input there is still a challenge in how to represent graphs at a low level (e.g., a vector, or a sequence of vectors) for input to a neural net. But as long as the representation does not lose too much information from the original graph, the network should be able to learn the complex interactions among features, and features of features, etc., that capture the salient relationships necessary for the classification/prediction task.

alanyuchenhou commented 8 years ago

I guess the current deep learning techniques are ready for that challenge - if they work for natural languages, they should work for all abstract things more or less the same way:

  1. we decompose the graph into abstract things: users, activities, advertisements, movies, etc.;
  2. a neural net learns to map things (high-dimensional one-hot vectors) to concept points (low-dimensional feature vectors) in concept space - unsupervised learning;
  3. another neural net learns to do the prediction - supervised learning; in case things do have intrinsic attributes, we can consider appending those attributes to the feature vector at this stage (see the sketch after this list).
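
A minimal PyTorch sketch of steps 2 and 3, assuming a made-up user/item "like" prediction task: the embedding tables stand in for the learned concept points, and any intrinsic attributes are concatenated before the supervised classifier. (In the two-stage plan above the embeddings would be pretrained unsupervised and then reused; here they sit in one module for brevity.)

```python
import torch
import torch.nn as nn

NUM_USERS, NUM_ITEMS, EMBED_DIM, ATTR_DIM = 1000, 500, 32, 8  # made-up sizes

class LikePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        # Step 2: map one-hot entity ids to low-dimensional concept vectors.
        self.user_embedding = nn.Embedding(NUM_USERS, EMBED_DIM)
        self.item_embedding = nn.Embedding(NUM_ITEMS, EMBED_DIM)
        # Step 3: a small supervised predictor on top of the concept vectors,
        # plus whatever intrinsic attributes the item happens to have.
        self.classifier = nn.Sequential(
            nn.Linear(2 * EMBED_DIM + ATTR_DIM, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, user_ids, item_ids, item_attributes):
        features = torch.cat(
            [self.user_embedding(user_ids), self.item_embedding(item_ids), item_attributes],
            dim=-1,
        )
        return self.classifier(features).squeeze(-1)  # logit for "user likes item"

model = LikePredictor()
logits = model(torch.tensor([3, 7]), torch.tensor([42, 9]), torch.randn(2, ATTR_DIM))
print(logits.shape)  # torch.Size([2])
```
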
alanyuchenhou commented 8 years ago

I should keep a list of typical entities and their representations (both attributes and relations):

| entity | attributes (intrinsic, physical) | relations (external, abstract) |
| --- | --- | --- |
| image | 512 x 512 pixel array | NA |
| speech | spectrogram sequence | NA |
| word/phrase | NA | other words and phrases |
| user | NA | artist, genre, music |
| music | NA | artist, genre, user |
| protein | primary sequence, spatial structure... | function, diseases |

However, the distinction between attributes and relations can be blurred in some cases, as shown in #8 and #12. Specifically, relations with entities of a different type can be viewed as either relations or attributes. For example, "user U played music m1, m2, m3... n1, n2, n3... times respectively" can be read as relations between U and each piece of music, or as an attribute vector [n1, n2, n3...] attached to U.

ghost commented 8 years ago

Ah, I see now what you mean by "relation". You can disregard the comment on issue #8. I generally use relation to describe a relationship between the main entities in the domain, e.g., one song is "similar" to another, or one user is the "sibling" of another user. So, for example with your customer entity in the list above, I would also include hobbies as an attribute of a customer. But sometimes it's not clear whether something is a relation or an attribute, especially when there are multiple entity types in the domain (e.g., users, songs, artists).

ghost commented 8 years ago

BTW, good idea to keep this list. Hopefully we'll eventually discern some best practices for how to represent different domains.

alanyuchenhou commented 8 years ago

Thanks! I like the table too. Regarding terminology, I also prefer a domain-agnostic system - there are too many domain-specific dialects in these papers.

alanyuchenhou commented 8 years ago

Dynamic graphs seem to make the distinction between attributes and relations clear.

Observation

For example, "user U rated music m1, m2, m3... with ratings r1, r2, r3..." has 2 views:

  1. relation view (abstract user): U has relations with m1, m2, m3..., but no attributes;
  2. attribute view (physical user): U has attributes [r1, r2, r3...];

However, the 1st one is more stable in dynamic graphs: e.g., when a few new songs show up, the 2nd one requires updating the neural net's architecture, while the 1st one preserves the user representation and does not require updating the architecture.
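
A small sketch of why the relation view is more stable (all names and sizes here are made up): the attribute view bakes the current number of songs into the input width, while the relation view only stores (user, song, rating) edges plus embedding tables that can grow without touching the user vectors or the predictor.

```python
import torch
import torch.nn as nn

# Attribute view: a user is a fixed-width vector of ratings, one slot per song.
num_songs = 3
user_as_attributes = torch.tensor([5.0, 3.0, 4.0])   # [r1, r2, r3]
attribute_model = nn.Linear(num_songs, 1)             # input width is baked in

# A new song appears: the vector and the first layer both change shape,
# so the trained weights no longer fit.
user_as_attributes = torch.cat([user_as_attributes, torch.tensor([0.0])])
# attribute_model(user_as_attributes)  # would fail: the layer expects 3 features, got 4

# Relation view: a user is just an id; ratings are (user, song, rating) edges.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (0, 2, 4.0)]
user_embedding = nn.Embedding(10, 8)                  # user representations
song_embedding = nn.Embedding(num_songs, 8)           # song representations
predictor = nn.Linear(2 * 8, 1)                       # width independent of catalog size

# A new song only adds one edge and one embedding row; users and predictor are untouched.
ratings.append((0, 3, 2.0))
song_embedding = nn.Embedding(num_songs + 1, 8)       # in practice, grow/copy the table
```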

Conclusion

Don't represent an entity directly by its relations to other entities whose total number can change over time.

alanyuchenhou commented 8 years ago

@Xiaomi2008, here is our earlier study on the representation of nodes and edges in a graph. The idea is mostly inspired by NLP with neural nets, e.g., word2vec and Distributed Representations of Sentences and Documents.

ghost commented 8 years ago

Found this paper on arXiv relevant to this discussion: "Deep Convolutional Networks on Graph-Structured Data" by Mikael Henaff, Joan Bruna, Yann LeCun. http://arxiv.org/abs/1506.05163