DevopediaOrg / question-generator

Generate a list of questions that beginners might ask on a technical topic
MIT License
hacktoberfest2019

Overview

Authors write articles on tech topics on Devopedia. In the Discussion section, they manually identify questions that beginners are most likely to ask to understand the topic. The goal of this project is to have an algorithm to automatically identify such questions.

The algorithm could consider the following sources:

Questions suggested by Google Search

The above figure shows what Google suggests when we search for the topic "Data Preparation". The first three questions are extremely relevant. The fourth question has the same intent as the first one, so such duplicates must be ignored. Another relevant question that Google suggests is "How is data cleaning done?" It asks for more detail on an earlier question about the data preparation process. The question "What are data visualization tools?" is related to data preparation but not relevant to it, so such suggestions must also be ignored.
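The duplicate filtering described above could be approximated as follows. This is a minimal sketch, not the project's implementation: it assumes that word overlap (Jaccard similarity) is an acceptable stand-in for "same intent", and the names `STOPWORDS`, `content_words`, `drop_near_duplicates` and the 0.6 threshold are all hypothetical. It does not attempt the harder task of rejecting related but irrelevant questions.

```python
import re

# Hypothetical stopword list; a real project would likely use a fuller one.
STOPWORDS = {
    "what", "is", "are", "the", "a", "an", "how", "do", "does",
    "in", "of", "for", "to", "and", "why", "we", "you",
}


def content_words(question):
    """Return the lowercase, non-stopword tokens of a question as a set."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return frozenset(w for w in words if w not in STOPWORDS)


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(questions, threshold=0.6):
    """Keep a question only if it is not too similar to one already kept.

    Assumption: the threshold is a guess and would need tuning on real data.
    """
    kept = []
    for question in questions:
        words = content_words(question)
        if all(jaccard(words, content_words(k)) < threshold for k in kept):
            kept.append(question)
    return kept


if __name__ == "__main__":
    suggestions = [
        "What is data preparation?",
        "How is data cleaning done?",
        "What does data preparation mean?",
    ]
    for question in drop_near_duplicates(suggestions):
        print(question)
```

Earlier questions win over later ones, which matches the idea that a later suggestion with the same intent as an earlier one is dropped. Word overlap is a crude proxy for intent; paraphrases with different vocabulary would slip through.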

Research tips shared on Devopedia's Author Guidelines page might help.

When trying to get information from various sources, prefer to use APIs instead of web scraping.
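As one illustration of using an API rather than scraping, the sketch below builds a query URL for the public Stack Exchange API's `/search/advanced` endpoint. The choice of Stack Overflow as a source, the API version in `API_ROOT`, and the helper name `build_search_url` are assumptions made for this example. The code only constructs the URL; fetching and decoding the (compressed JSON) response is left out.

```python
from urllib.parse import urlencode

# Assumption: version 2.3 of the Stack Exchange API is used.
API_ROOT = "https://api.stackexchange.com/2.3"


def build_search_url(topic, site="stackoverflow", page_size=20):
    """Build a search URL for questions matching a topic.

    Hypothetical helper: the parameter choices (sort by votes,
    descending order) are illustrative, not prescribed by the project.
    """
    params = {
        "q": topic,
        "site": site,
        "order": "desc",
        "sort": "votes",
        "pagesize": page_size,
    }
    return f"{API_ROOT}/search/advanced?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("data preparation"))
```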

Deliverables

The project must be implemented in Python 3 with a modular design. Provide basic documentation and examples. No user interface is expected. Selected questions can simply be displayed on the console.

Code should support the following:

Approach

The present algorithm attributes a user's reputation on a source (say, Stack Overflow) to the questions that user asks, which gives each question a reputation score. We then use other signals derived from the same source (or a combination of sources) to categorize the questions into Beginner, Intermediate and Expert levels and to rank the questions within each category.
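The approach can be sketched roughly as below. The source does not specify how reputation maps to a level or which signals drive the ranking, so everything concrete here is an assumption: the `Question` fields, the reputation cut-offs in `level`, and the use of votes plus log-scaled views in `rank_score` are hypothetical placeholders.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Question:
    title: str
    asker_reputation: int  # reputation of the user who asked, from the source
    votes: int             # assumed extra signal
    views: int             # assumed extra signal


def level(question, beginner_max=500, intermediate_max=5000):
    """Map the asker's reputation to a level (cut-offs are guesses)."""
    if question.asker_reputation <= beginner_max:
        return "Beginner"
    if question.asker_reputation <= intermediate_max:
        return "Intermediate"
    return "Expert"


def rank_score(question):
    """Assumed ranking signal: votes plus log-scaled view count."""
    return question.votes + math.log10(1 + question.views)


def categorize_and_rank(questions):
    """Group questions by level, then sort each group by rank_score."""
    groups = defaultdict(list)
    for question in questions:
        groups[level(question)].append(question)
    return {
        name: sorted(items, key=rank_score, reverse=True)
        for name, items in groups.items()
    }


if __name__ == "__main__":
    sample = [
        Question("What is data preparation?", 50, 3, 1000),
        Question("How is data cleaning done?", 120, 10, 100),
        Question("How to tune a distributed ETL scheduler?", 9000, 1, 10),
    ]
    for name, items in categorize_and_rank(sample).items():
        print(name, [q.title for q in items])
```

Treating a low-reputation asker as a sign of a beginner question is itself an assumption; experienced users also ask basic questions, which is why additional signals are needed for the categorization.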