dbpedia / GSoC

Google Summer of Code organization

Pay-As-You-Go Quality Evaluation of the DBpedia Resources #35

Closed: beyzayaman closed this issue 4 years ago

beyzayaman commented 5 years ago

Description

Extracting triples from unstructured sources causes several quality problems, and these problems lead to wrong results in information retrieval systems. Quality, however, has both subjective and objective aspects: some quality dimensions can be assessed objectively with generic tools, while others need a crowd-sourced, subjective evaluation of the resource. This project aims at a pay-as-you-go quality evaluation of DBpedia resources in an information retrieval setting, taking both subjective and objective quality dimensions into account. The user will submit a query together with a quality threshold for a specific dimension (trust, freshness, etc.). The results for that query will be shown to the user (black box), and the user's feedback will be collected. Based on that feedback, the quality graph of the resource will be updated with respect to the given quality dimension.
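
To make the query, feedback, and update loop concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the project's design: the class names (`QualityScore`, `QualityStore`), the method names, and the scoring rule (a Laplace-smoothed share of positive feedback) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class QualityScore:
    """Running estimate of one quality dimension for one resource (hypothetical model)."""
    positive: int = 0
    negative: int = 0

    def update(self, is_positive: bool) -> None:
        # Count one more piece of user feedback.
        if is_positive:
            self.positive += 1
        else:
            self.negative += 1

    @property
    def value(self) -> float:
        # Assumed scoring rule: Laplace-smoothed share of positive feedback.
        # A resource without any feedback starts at 0.5.
        return (self.positive + 1) / (self.positive + self.negative + 2)


@dataclass
class QualityStore:
    """Maps (resource, dimension) pairs to their running scores."""
    scores: dict = field(default_factory=dict)

    def record_feedback(self, resource: str, dimension: str, is_positive: bool) -> float:
        # Update the score of the resource for the given dimension and return it.
        score = self.scores.setdefault((resource, dimension), QualityScore())
        score.update(is_positive)
        return score.value

    def filter_results(self, resources, dimension, threshold):
        # Keep only the results whose current score meets the user's threshold.
        return [
            r for r in resources
            if self.scores.get((r, dimension), QualityScore()).value >= threshold
        ]
```

For example, after two positive "trust" judgments on one resource, `filter_results` with a threshold of 0.6 would keep that resource and drop results that have no feedback yet. The smoothing is one possible way to avoid extreme scores from a single judgment; other update rules would fit the same interface.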

Goals

The goals of the candidate are as follows:

Impact

The project will provide a system that computes data quality in a pay-as-you-go manner and provides structured data graphs for the resources.
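
As a rough illustration of what a structured quality graph for a resource could look like, the sketch below serializes one quality measurement as N-Triples. The issue does not name a vocabulary; using the W3C Data Quality Vocabulary (DQV) here is an assumption, and the function name and the `example.org` IRIs in the usage note are hypothetical.

```python
DQV = "http://www.w3.org/ns/dqv#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def quality_triples(resource_iri, metric_iri, measurement_iri, value):
    """Return N-Triples lines describing one quality measurement of a resource.

    Assumption: DQV terms are used to model the measurement; the project
    itself may choose a different vocabulary.
    """
    return [
        f"<{measurement_iri}> <{RDF_TYPE}> <{DQV}QualityMeasurement> .",
        f"<{measurement_iri}> <{DQV}computedOn> <{resource_iri}> .",
        f"<{measurement_iri}> <{DQV}isMeasurementOf> <{metric_iri}> .",
        f'<{measurement_iri}> <{DQV}value> "{value}"^^<{XSD_DOUBLE}> .',
    ]
```

Calling `quality_triples("http://dbpedia.org/resource/Berlin", "http://example.org/metric/trust", "http://example.org/measurement/1", 0.75)` yields four triples that link the measurement to the resource, the metric, and the score.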

Warm up tasks

Mentors

TBD (possible names: Beyza Yaman).

Keywords

Quality, Feedback, Crowd-source, Information retrieval

ayush-anand-13 commented 5 years ago

Sir, I've started reading Quality Assessment Methodologies for Linked Open Data and working through the code base. I've always wanted to work on a project with real-world applications, and I would like to proceed with this one. Can I ask my questions and doubts here as I progress?

beyzayaman commented 5 years ago

Yes, please do whenever you have questions!

mrinal1209 commented 5 years ago

Hello, I am a postgraduate student from India and this is my first GSoC. I am a qualified web developer and contribute to PHP- and Java-based projects, and I would love to be part of your organization for GSoC 2019. I can see you have listed some warm-up tasks, and I would like you to know that I am working on them.

beyzayaman commented 5 years ago

Hi @mrinal1209 and @maykillmore. I suggest contacting me whenever you have questions so that you can make progress faster. If you try some of the warm-up tasks, it would be good to share what you find with us (for example, errors you have found).

mrinal1209 commented 5 years ago

@beyzayaman What sort of technology can we use for this project?

beyzayaman commented 5 years ago

@mrinal1209 It is up to you, but TripleCheckMate is written in Java, so it may be useful to proceed with that. What was your idea?

mrinal1209 commented 5 years ago

@beyzayaman Luckily, I was thinking of Java too. I am currently reading the papers provided above.

beyzayaman commented 5 years ago

Do you have any background in OWL and RDF technologies? Also, please keep in mind that you need to write a proposal by the 9th of April, so it would be a good idea to keep the reading period as short as possible and start writing down some ideas so that we can improve them together.

mrinal1209 commented 5 years ago

@beyzayaman I have found a small bug on the DBpedia website itself. Is there a bug portal (like JIRA) where we can start contributing? POC link for the bug: https://drive.google.com/open?id=1eQJazwOWgdtWQ7RUmmmV4Xx1tO_GdHBX

mrinal1209 commented 5 years ago

@beyzayaman I don't have any background in OWL and RDF technologies, but I assure you that I have no other commitments this summer and can learn as quickly as possible while working.

mrinal1209 commented 5 years ago

@beyzayaman @maykillmore I am using this tutorial to understand OWL and RDF:

http://w3schools.sinsixx.com/rdf/rdf_owl.asp.htm

mrinal1209 commented 5 years ago

@beyzayaman Hi, I am thinking of starting on the proposal for this project. So far I have read the paper Quality Assessment Methodologies for Linked Open Data, gone through the code base of the TripleCheckMate tool, and done some hands-on tutorials on RDF with Java. Can you suggest anything else I can do to make a good GSoC proposal?

beyzayaman commented 5 years ago

Hi! @mommi84 shared a successful project proposal that you can check: http://tommaso-soru.it/files/misc/Akshay-DBpedia-GSoC-2017-proposal.pdf In the meantime, please share your ideas. I guess the quality issue should be clear by now, so you can also share what you have found in the data. You can create documents and share them with me at the email address given on the Mentors page so that I can review them.