Technocolabs100 / Stack-Overflow-Tag-Predictions

Tag Prediction from Stack Overflow Questions

Analysis of Tags #14

Open Technocolabs100 opened 3 years ago

Technocolabs100 commented 3 years ago

Tags are our class labels. Since we are trying to predict them, we should dig into them and understand them well. After removing all duplicate data, we are left with 4.2 million data points and 42k unique tags. How many times each tag appears is worth understanding, so I counted the occurrences and stored them in a dictionary. In the table below, the “.a” tag appears in 18 questions, the “.app” tag appears in 37 questions, and so on. Note that a tag never appears twice in the same question.
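
A minimal sketch of the counting step described above, assuming each question's tags are stored as a single space-separated string. The function name `count_tags` and the sample data are illustrative only and are not taken from the project's notebook.

```python
from collections import Counter


def count_tags(tag_strings):
    """Return a dict mapping each tag to the number of questions it appears in.

    Assumption: every element of tag_strings is one question's tags,
    separated by whitespace (e.g. "python pandas").
    """
    counts = Counter()
    for tags in tag_strings:
        # set() makes sure a tag is counted at most once per question,
        # matching the note that a tag never repeats within a question.
        counts.update(set(tags.split()))
    return dict(counts)


# Hypothetical sample, not real data from the 4.2 million questions.
sample = ["python pandas", "python numpy", "c++ pointers", "pandas"]
tag_counts = count_tags(sample)

# Show the tags ordered from most to least frequent.
for tag, n in sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(tag, n)
```

On the full dataset the same dictionary could be sorted by count to see which tags dominate and how long the tail of rare tags is.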

Abhisheka394 commented 3 years ago

Can you please assign this issue to me? I'm a GSSoC participant.

Technocolabs100 commented 3 years ago

Hello Abhishek, I will assign this issue to you. Thanks!

Technocolabs100 commented 3 years ago

Hello Abhishek, please send me the nbviewer link for the first task, the analysis of tags. It will help me review the code before merging.

Thanks

Abhisheka394 commented 3 years ago

nbviewer link: https://nbviewer.jupyter.org/github/Abhisheka394/Stack-Overflow-Tag-Predictions/blob/analysis_of_tags/analysis_of_tags.ipynb

Abhisheka394 commented 3 years ago

@Technocolabs100 Any update on the status of my PR?

Abhisheka394 commented 3 years ago

@Technocolabs100 It has been more than a week and there is still no response.