ehab-abdelhamid / GraMi

GraMi is a novel framework for frequent subgraph mining in a single large graph, GraMi outperforms existing techniques by 2 orders of magnitudes. GraMi supports finding frequent subgraphs as well as frequent patterns, Compared to subgraphs, patterns offer a more powerful version of matching that captures transitive interactions between graph nodes (like friend of a friend) which are very common in modern applications. Also, GraMi supports user-defined structural and semantic constraints over the results, as well as approximate results. For more details, check our paper: Mohammed Elseidy, Ehab Abdelhamid, Spiros Skiadopoulos, and Panos Kalnis. GRAMI: Frequent Subgraph and Pattern Mining in a Single Large Graph. PVLDB, 7(7):517-528, 2014.
Other
107 stars 39 forks source link

what does the paper mean by 'a single large graph'? #5

Closed mountain3th closed 9 years ago

mountain3th commented 9 years ago

hi, Ehab.You do a excellent job in this work.I appreciate that.Now i just have a confusion when i don't know what kind of subgraphs will be mined.Here i give some examples. ie.1

t 1

v 0 1 v 1 2 v 2 1 v 3 2 e 0 1 1 e 2 3 1

(This got a right result) 0.122 1 0: v 0 1 v 1 2 e 0 1 1

ie.2

t 1

v 0 1 v 1 1 v 2 2 e 0 2 1 e 1 2 1 (And this got nothing)

I expect the result looks like v 0 1 v 1 2 e 0 1 1.

Could you help me?

ehab-abdelhamid commented 9 years ago

Hi,

Thanks for your interest in our work. I assume you are using support threshold =2. If so, then yes first example shall give you 1 frequent subgraph. And yes, for the second example there is no frequent subgraphs. The reason is that we use the MNI support computation metric, which ignores some overlaps. In the second example, the two occurrences overlap in v2 so they are counted as 1 occurrence.

You can find more discussion about this in our VLDB paper:

http://www.*vldb*.org/p*vldb*/vol7/p517-elseidy.pdf

Regards, Ehab On Mar 8, 2015 12:55 PM, "Sam Xie" notifications@github.com wrote:

hi, Ehab.You do a excellent job in this work.I appreciate that.Now i just have a confusion when i don't know what kind of subgraphs will be mined.Here i give some examples. ie.1 t 1

v 0 1 v 1 2 v 2 1 v 3 2 e 0 1 1 e 2 3 1

(This got a right result) 0.122 1 0: v 0 1 v 1 2 e 0 1 1

ie.2 t 1

v 0 1 v 1 1 v 2 2 e 0 2 1 e 1 2 1 (And this got nothing)

I expect the result looks like v 0 1 v 1 2 e 0 1 1.

Could you help me?

— Reply to this email directly or view it on GitHub https://github.com/ehab-abdelhamid/GraMi/issues/5.

mountain3th commented 9 years ago

Hi Ehab, Thanks for answering me so fast.But i still keep a question that what if i want to find the friends that both A and B have.How to overcome the overlap in a condition when we use in social network.

Best Regards, Sam

ehab-abdelhamid commented 9 years ago

Hi,

In frequent subgraph mining (FSM), it is not about finding specific occurrences of a pattern, it is more about knowing which patterns are frequent. So, we do not care about the occurrences itself.

The reason why people use metrics like MNI, is that in FSM we need to use a metric that is anti-monotonic, and this can be assured by ignoring overlapping occurrences.

If you are looking for specific matches of a given query, I advise you to use a subgraph matching algorithm rather than an FSM algorithm.

Regards, Ehab

On Sun, Mar 8, 2015 at 1:39 PM, Sam Xie notifications@github.com wrote:

Hi Ehab, Thanks for answering me so fast.But i still keep a question that what if i want to find the friends that both A and B have.How to overcome the overlap in a condition when we use in social network.

Best Regards, Sam

— Reply to this email directly or view it on GitHub https://github.com/ehab-abdelhamid/GraMi/issues/5#issuecomment-77742821.

mountain3th commented 9 years ago

Hello, Maybe i didn't ask clearly.Well, if a graph is composed of subgraphs like A->C,B->C,...which means A was born in C,and B was born in C.How can i mine this subgraphs cause it overlaps. Is there any solutions to that?

Best wishes, Sam

ehab-abdelhamid commented 9 years ago

You can mine that graph. Let me give an example to illustrate what I mean. Assume the following is your graph: v 0 1 v 1 2 v 2 1 v 3 3 v 4 3 v 5 2 e 0 1 1 e 2 1 1 e 3 1 1 e 4 3 1 e 4 5 1

and the support threshold = 2 then, the following subgraph S1 is frequent: v 0 3 v 1 2 e 0 1 1

but the following subgraph S2 is not frequent: v 0 1 v 1 2 e 0 1 1

This is because when we search for the occurrences for S1, we find the following two: v 3 3 v 1 2 e 3 1 1

and

v 4 3 v 5 2 e 4 5 1

and they are not overlapping (no common nodes between them), so each occurrence adds to the support, so support = 2 and S1 is considered as frequent

As for S2, we find the following occurrences:

v 0 1 v 1 2 e 0 1 1

v 2 1 v 1 1 e 2 1 1

Although we find two occurrences, but they overlap at node with id=1, so they are considered as one occurrence. Hence, support = 1 and S2 is considered as infrequent.

I hope this clearly describe it.

Regards,

On Sun, Mar 8, 2015 at 2:21 PM, Sam Xie notifications@github.com wrote:

Hello, Maybe i didn't ask clearly.Well, if a graph is composed of subgraphs like A->C,B->C,...which means A was born in C,and B was born in C.How can i mine this subgraphs cause it overlaps. Is there any solutions to that?

Best wishes, Sam

— Reply to this email directly or view it on GitHub https://github.com/ehab-abdelhamid/GraMi/issues/5#issuecomment-77744093.

mountain3th commented 9 years ago

Hello Ehab, Though i finally find out that this method can not mine the sub-graph that overlaps, i appreciate for your help.BTW, have you ever studied in knowledge graph which is asked to mine the frequent pattern from a big graph like Dbpedia or freebase.Once we finish it,we can know the information such like the whole of people who was born in Athens.And that i think would use the FSM or FPM. Any ideas? Hope for your reply.

Best wishes, Sam

ehab-abdelhamid commented 9 years ago

Hi Sam,

If you can convert it to the same formal (.lg) that is used in GraMi, then it is ok. Once you convert those graphs, then just apply GraMi and you will get the frequent patterns.

Best, Ehab

On Sun, Mar 8, 2015 at 5:25 PM, Sam Xie notifications@github.com wrote:

Hello Ehab, Though i finally find out that this method can not mine the sub-graph that overlaps, i appreciate for your help.BTW, have you ever studied in knowledge graph which is asked to mine the frequent pattern from a big graph like Dbpedia or freebase.Once we finish it,we can know the information such like the whole of people who was born in Athens.And that i think would use the FSM or FPM. Any ideas? Hope for your reply.

Best wishes, Sam

— Reply to this email directly or view it on GitHub https://github.com/ehab-abdelhamid/GraMi/issues/5#issuecomment-77751622.