dbpedia / GSoC

Google Summer of Code organization

Data Quality Dashboard for DBpedia #2

Closed by mgns 4 years ago

mgns commented 6 years ago

Description

DBpedia offers large quantities of structured data. However, parts of that data are of insufficient quality. The problems originate from different sources, e.g. incorrect extractions and value transformations in the extraction framework, inconsistent mappings, incorrect data in Wikipedia articles, and general incompleteness.

Goals

Visualize a set of metrics in an easy-to-read, interactive UI that helps contributors decide what should be fixed next in DBpedia.
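As a rough illustration of the kind of metric such a dashboard could surface, the sketch below counts quality violations per property and ranks the properties by how often they fail. The `Violation` record, the `rank_by_violations` helper, and the sample data are all hypothetical and invented for this example. They do not reflect the output format of any existing DBpedia tool.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Violation(NamedTuple):
    """One failed quality check (hypothetical record shape)."""
    source: str  # assumed origin of the error, e.g. "mapping" or "extraction"
    prop: str    # the affected property, e.g. "dbo:birthDate"


def rank_by_violations(violations: Iterable[Violation]) -> list[tuple[str, int]]:
    """Count violations per property and return the most affected first."""
    counts = Counter(v.prop for v in violations)
    # Sort by count (descending), then by property name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample data, for illustration only.
sample = [
    Violation("extraction", "dbo:birthDate"),
    Violation("mapping", "dbo:height"),
    Violation("extraction", "dbo:birthDate"),
    Violation("wikipedia", "dbo:populationTotal"),
]

for prop, count in rank_by_violations(sample):
    print(f"{prop}: {count}")
```

A ranked list like this is one possible input for a "what to fix next" view. A real dashboard would likely also weight violations by severity or by how many resources use the property.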

Impact

The interface will help DBpedia contributors adopt a “data quality first” attitude and enable data-driven prioritization of development tasks.

Warm up tasks

Keywords

data quality, front-end, full-stack, javascript, react js, meteor js, user interface, UI

vedularaghu commented 6 years ago

Hi, I'm an undergraduate student at CET, Bhubaneswar. I have prior experience in full-stack web development (Node.js, React, JavaScript). I would like to work on this project. Please help me pick issues to get started. Thank you.

mgns commented 6 years ago

Hi, there is a warm-up task (#11) referenced in the description. You could start with that.

PseudoNerd commented 5 years ago

I am Tanmay, I'm an undergraduate from India.

I'd like to work on this project for the coming Summer of Code. I have raised a question about the warm-up task associated with it and would like to take it up.

LakshanSS commented 5 years ago

Hi! I am Lakshan. I am interested in doing this project. I am familiar with React and JavaScript. I would like to work on this in GSoC 2019.

beyzayaman commented 5 years ago

Hi @PseudoNerd and @LakshanSS. It would be good if you tried some of the warm-up tasks and shared the results with us (for example, the errors you found).

ghost commented 5 years ago

Hi, I am akata. I would like to work on this project for GSoC 2019. I will start with the error-finding warm-up task mentioned by @beyzayaman.

PseudoNerd commented 5 years ago

I am currently pursuing a warm-up task and also taking a little more time with DBpedia. I'll be making a spreadsheet soon. Sorry for the delay.

bharat-suri commented 5 years ago

Hello everyone, it is good that you have started working on the warm-up tasks. However, I would like to encourage you all to write your proposals for your chosen projects. This will allow us to review and suggest changes well before the submission deadline. Thanks!

PseudoNerd commented 5 years ago

I've already made a list of certain bugs in DBpedia, but I feel the list doesn't fully cover DBpedia. I've also been working on the proposal for this project. Will it be okay to submit the mock proposal and the spreadsheet on the 22nd of this month? I have exams going on, hence the delay.

bharat-suri commented 5 years ago

That's okay, we look forward to going over the list you curate. You can share the spreadsheet with the appropriate links through mail, or you can simply share the document and we will leave our comments on it. I myself created a GitHub repository and shared the link with my mentors last year.

Yes, as the submission period begins on March 25th, it would be good to review your proposal before then so that you can make a successful submission as soon as possible, hopefully well before the deadline, i.e. April 9th.

ghost commented 5 years ago

Hey @tramplingWillow, @beyzayaman, I am working on the proposal for this project. I would like to clarify something about the data quality assessment tool for this dashboard. Three data quality assessment tools are mentioned in Quality assessment methodologies for Linked open data: 1) Flemming's data quality assessment tool, 2) Sieve, 3) LODRefine.

For this dashboard, are we supposed to use one or all of the above-mentioned tools and visualize their outputs, or will we develop a new data quality assessment tool under this project?

If we are supposed to use one of the existing tools: I could not find Flemming's data quality assessment tool through a Google search. It would be really helpful if you could share a link to the tool.

beyzayaman commented 5 years ago

Hi @akata01. You can just use one framework. The tools listed there are not very up to date, so I would suggest you look at RDFUnit, which is mainly used in DBpedia, and forget about the others.

bharat-suri commented 5 years ago

Hi everyone, please ensure that you submit a draft of your proposal so that we can review it and give you feedback. The application period has already begun, so please plan accordingly.

ghost commented 5 years ago

Hi @mgns, @tramplingWillow, @beyzayaman! I have made the proposal for this project. Should I submit it directly through the GSoC proposal submission page or do I need to send it through a mail or any other way you suggest so that you can review it and provide me with valuable feedback?

beyzayaman commented 5 years ago

Hi @akata01. Maybe @tramplingWillow and I can do it. Can you please first check whether your proposal includes all of the sections from this successful proposal? Secondly, please share your proposal with us in Google Docs or any other editable format so that we can comment on it (private mail is fine as well).

ghost commented 5 years ago

@beyzayaman I have followed the template given here -> https://wiki.dbpedia.org/gsoc-2019. Yes, it includes all the sections from the mentioned proposal. I can add you to the proposal's Google sheet. Could you please share your email?

beyzayaman commented 5 years ago

yaman@infai.org

ghost commented 5 years ago

@beyzayaman I have sent you an email with the GSoC proposal google sheet link. (I have deleted the previous comment with the link from here.) Please let me know if you didn't receive the email.

ghost commented 5 years ago

@mgns @beyzayaman @tramplingWillow I saw @PseudoNerd maliciously copying my GSoC proposal document. I saw him editing it, and when I asked him about it in the comments chat, he quickly copied the entire thing. I have screenshots and am uploading them here.

[Two screenshots attached: "gsoc)copy" and "copying"]

PseudoNerd commented 5 years ago

@akata01, first of all, I shouldn't have had this link in the first place, but it was public. Second, if you think your proposal was maliciously copied by me or used in any way, then the mentors can have a look at it themselves.

Also, I've been working on this proposal for a very long time and this wouldn't help my cause.

@mgns @tramplingWillow @beyzayaman, I'll finish my proposal by tonight, and you can check it for any of the malicious copying I'm accused of.

PseudoNerd commented 5 years ago

My proposal will make anything and everything clear.