dbpedia / GSoC

Google Summer of Code organization

Explainable Knowledge Discovery on DBpedia #33

Closed: mommi84 closed this issue 4 years ago

mommi84 commented 5 years ago

Description

The latest DBpedia release comprises a knowledge graph with 326,035,765 edges. As the data originates from a human-generated wiki, the graph is far from complete. This project idea aims to implement an algorithm that performs knowledge base completion (or link prediction) on DBpedia. We want to tackle this problem with a rule-based approach in order to meet the requirements for explainable AI.

Goals

The candidate's goal would be to adapt and employ the code of HornConcerto, an algorithm that discovers Horn rules and new relationships in large graphs. The method has already been shown to outperform existing approaches in terms of runtime and memory consumption and to mine high-quality rules for the completion task.
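As a rough illustration of rule-based link prediction, the sketch below applies a single Horn rule of the form head(x, y) ⇐ body(x, y) to a toy graph and proposes the edges it implies. This is not HornConcerto's code: the triples, the predicate names and the `predict_links` helper are all hypothetical and exist only for this example.

```python
# Toy graph of (subject, predicate, object) triples.
# All entities and predicates here are made up for illustration.
TRIPLES = {
    ("Berlin", "capitalOf", "Germany"),
    ("Berlin", "locatedIn", "Germany"),
    ("Paris", "capitalOf", "France"),
}


def predict_links(triples, head, body):
    """Propose head(x, y) for every body(x, y) whose head edge is missing.

    Hypothetical helper: it applies the single rule head(x, y) <= body(x, y).
    """
    return sorted(
        (s, head, o)
        for s, p, o in triples
        if p == body and (s, head, o) not in triples
    )


# Apply the rule locatedIn(x, y) <= capitalOf(x, y).
predictions = predict_links(TRIPLES, head="locatedIn", body="capitalOf")
print(predictions)
```

Each proposed edge can be traced back to the rule and the body edge that produced it, which is what makes this kind of prediction explainable.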

Ultimate goals:

Impact

The project will enhance the quality and completeness of DBpedia data.

Warm-up tasks

Mentors

TBD (possible names: Tommaso Soru, Aman Mehta, Amandeep Srivastava).

Keywords

knowledge discovery, knowledge base completion, link prediction, explainable artificial intelligence, association rules

g-laz77 commented 5 years ago

Hi, I would like to work on this project. From what I understand, it looks like an exploratory project where I have to analyse sets of data to recognize patterns (rules). Will this also involve working on an ML algorithm to learn the patterns?

mommi84 commented 5 years ago

Hi @g-laz77 and thanks for your interest.

Yes, the proposed algorithm HornConcerto just performs a basic mining of rules and their confidence values. It can definitely be improved with ML and I recommend including that in the project proposal. Did you have any ML algorithm in particular in mind?

g-laz77 commented 5 years ago

@mommi84 I have gone through the paper Beyond Markov Logic: Efficient Mining of Prediction Rules in Large Graphs (2018). I have understood how the basic mining of rules is done by calculating the confidence score of a given rule. This is done by checking the co-occurrence of that rule with the top P properties (those with the highest occurrence counts) to get the rule support, and dividing it by the body support (of the P properties).
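To make the ratio concrete, here is a minimal sketch in Python, assuming the simplest rule shape head(x, y) ⇐ body(x, y). It is not the HornConcerto implementation: the toy triples and the helpers `top_properties`, `pairs_by_predicate` and `rule_confidence` are hypothetical names introduced only for this example.

```python
from collections import Counter, defaultdict

# Toy (subject, predicate, object) triples; all names are made up.
TRIPLES = [
    ("alice", "spouse", "bob"),
    ("alice", "livesWith", "bob"),
    ("carol", "spouse", "dave"),
    ("carol", "livesWith", "dave"),
    ("erin", "spouse", "frank"),
    ("gina", "livesWith", "henry"),
]


def top_properties(triples, p):
    """Return the p most frequent predicates (the 'top P properties')."""
    counts = Counter(pred for _, pred, _ in triples)
    return [pred for pred, _ in counts.most_common(p)]


def pairs_by_predicate(triples):
    """Index the graph as predicate -> set of (subject, object) pairs."""
    index = defaultdict(set)
    for s, pred, o in triples:
        index[pred].add((s, o))
    return index


def rule_confidence(triples, head, body):
    """Confidence of head(x, y) <= body(x, y): rule support / body support."""
    index = pairs_by_predicate(triples)
    body_pairs = index[body]
    if not body_pairs:
        return 0.0  # no body instances, so the rule has no evidence
    rule_support = len(body_pairs & index[head])  # pairs where both hold
    return rule_support / len(body_pairs)


candidates = top_properties(TRIPLES, 2)
confidence = rule_confidence(TRIPLES, head="livesWith", body="spouse")
print(candidates, confidence)
```

In this toy graph, two of the three `spouse` pairs also appear as `livesWith` pairs, so the rule livesWith(x, y) ⇐ spouse(x, y) gets a confidence of 2/3.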

It occurred to me that a pattern-detection ML algorithm could be used to mine more rules. Would something like LinkNBed: Multi-Graph Representation Learning with Entity Linkage be suitable?

mommi84 commented 5 years ago

@g-laz77 The LinkNBed paper does indeed also target link prediction using representation learning. However, it is well known that deep learning models often lack the ability to explain their own predictions. How do you expect to address this problem?

g-laz77 commented 5 years ago

@mommi84 I agree. Deep learning models have the problem of model interpretability. However, I recently came across a Python library called Skater, which is a unified framework to enable model interpretation. Since this is an exploratory project, would it not serve the purpose to use it alongside HornConcerto to mine more rules?

mommi84 commented 5 years ago

@g-laz77 It sounds worth exploring to me. Please elaborate on how you would proceed in a Google doc and share it with my GitHub handle at gmail dot com.

KaiyuanZh commented 5 years ago

Hi, I have gone through the paper Beyond Markov Logic: Efficient Mining of Prediction Rules in Large Graphs. I have previously worked on two graph-related papers, using C++ and Python, and I am now working on a research project related to adversarial learning on graphs, so it would be a great opportunity for me to work on this project. What should I do next if I want to start contributing to this project?

Saichethan commented 5 years ago

Is this still open? I want to be a mentor. How can I know if I am eligible?

mommi84 commented 5 years ago

@ky-zhang Hi and thanks for your interest! Can you please post the links to the two papers? If you have completed the warm-up tasks, the next step would be to outline a possible integration of adversarial learning or any other ML technique with the project. Prepare a draft and invite my GitHub handle at gmail dot com.

@Saichethan Hi! You can mentor a project in one of the following cases: 1. you have already been a DBpedia GSoC mentor or student; 2. you have already been a GSoC mentor for another organisation; 3. you already have published research experience in the field.

KaiyuanZh commented 5 years ago

@mommi84 Thank you for your reply. The papers haven't been published yet, so I will send you an email with the draft papers and a pre-proposal for the project.

g-laz77 commented 5 years ago

@mommi84 I sent you an email with a starter plan for the proposal around a week ago. Have you received it?

mommi84 commented 5 years ago

Yes, @g-laz77. I've forwarded it to the other mentors and we will get back to you tomorrow.

g-laz77 commented 5 years ago

@mommi84 I have completed the proposal. Kindly review it.