Notes - Githubissues

Amitpatil215 commented 2 years ago

1. GitHub repo
2. Project synopsis

Graphical Analysis/mining
Presentation (10 min slides)

Mid - eval

Data Cleaning
1.1 Remove or replace null values in the following columns:
Age, salary, Married/Unmarried, Gender, skills => Build a credibility score out of 10
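A minimal sketch of the null-replacement step, using only the standard library. The fill strategy (median for numeric columns, most frequent value for categorical ones) is an assumption, not something decided in these notes, and `fill_missing` and the lowercase column names are hypothetical.

```python
from collections import Counter
from statistics import median


def fill_missing(rows, numeric_cols, categorical_cols):
    """Return new rows where None is replaced by a per-column fill value.

    Assumption: numeric columns use the median, categorical columns the mode.
    """
    fills = {}
    for col in numeric_cols:
        values = [r[col] for r in rows if r.get(col) is not None]
        fills[col] = median(values) if values else None
    for col in categorical_cols:
        values = [r[col] for r in rows if r.get(col) is not None]
        fills[col] = Counter(values).most_common(1)[0][0] if values else None
    # Build copies so the raw rows stay untouched.
    return [
        {**r, **{c: fills[c] for c in fills if r.get(c) is None}}
        for r in rows
    ]


rows = [
    {"age": 25, "salary": 50000, "gender": "F"},
    {"age": None, "salary": 70000, "gender": "M"},
    {"age": 35, "salary": None, "gender": None},
]
cleaned = fill_missing(rows, ["age", "salary"], ["gender"])
```

Rows with too many missing fields could be dropped instead of filled; which option suits each column is still open.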

++ How gender, skills, dependency, dev_type, and salary relate to each other -> graphical representation
2.1 Highest-paid skills and highest salary grouped by skill
2.2 Highest-paid dev_type
2.3 Age vs. salary
2.4 Salary-wise job_satisfaction grouped by skill
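The "highest-paid skill" analysis can be sketched as a group-by over the skills column. This assumes each row stores its skills as one semicolon-separated string and that `salary` is already numeric; `mean_salary_by_skill` and the sample rows are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_salary_by_skill(rows, sep=";"):
    """Average salary per skill; a row counts once for each skill it lists."""
    salaries = defaultdict(list)
    for row in rows:
        for skill in row["skills"].split(sep):
            skill = skill.strip()
            if skill:
                salaries[skill].append(row["salary"])
    return {skill: mean(values) for skill, values in salaries.items()}


# Hypothetical sample rows, not taken from the real dataset.
rows = [
    {"skills": "Python;SQL", "salary": 80000},
    {"skills": "Python", "salary": 100000},
    {"skills": "SQL;Java", "salary": 60000},
]
by_skill = mean_salary_by_skill(rows)
top_skill = max(by_skill, key=by_skill.get)
```

The same grouping works for dev_type or an age bucket by changing the key that is split and grouped on.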

For Final Eval

Build homogeneous graphs such as resume-skills, resume-location, and resume-dev_type (backend/frontend); after that, take the most popular nodes and build a heterogeneous knowledge graph.

Amitpatil215 commented 2 years ago

Possible Questions

What is the problem statement?
What have you done so far, and what is the plan after mid evaluation?
What did you do in Minor I?
Why a knowledge graph?
Why Neo4j?
Where did you get your dataset?
How are you going to recommend jobs and job descriptions?
Which research papers have you read, and what are their conclusions?

sanjoli63 commented 2 years ago

We are going to build a knowledge-graph-based recommendation system that helps candidates find jobs matching their skill sets.
We analyse various aspects that help recommend jobs and job descriptions, based on location, age group, etc. Future: build homogeneous graphs such as resume-skills, resume-location, and resume-dev_type (backend/frontend), then take the most popular nodes and build a heterogeneous knowledge graph.
We researched and learned the basics of knowledge graphs. We built a knowledge graph using NetworkX, taking resumes, JDs, and skills as nodes and experience as the edge weight.
A knowledge graph is self-descriptive, as it provides a single place to find the data and understand what it is all about. Knowledge graphs are used in a wide range of applications, from space, journalism, and biomedicine to entertainment, network security, and pharmaceuticals.
Neo4j delivers fast read and write performance while still protecting data integrity. Neo4j graph algorithms are scalable and production-ready; they are written in Java and performance tested. NetworkX is a single-node graph implementation written in Python. Response time is much faster in Neo4j than in NetworkX.
Resume dataset: Stack Overflow Developer Survey Analysis from Kaggle. JD dataset: US jobs on Dice.com from data.world.
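A rough sketch of the resume/JD/skill graph described in these answers, with years of experience as the edge weight. The original work used NetworkX; this version uses plain dictionaries to stay dependency-free. The node names, the `WeightedGraph` class, and the scoring rule (summing a resume's experience over the skills a JD shares with it) are all assumptions for illustration.

```python
from collections import defaultdict


class WeightedGraph:
    """Tiny undirected graph: adj[u][v] holds the weight of edge u-v."""

    def __init__(self):
        self.adj = defaultdict(dict)

    def add_edge(self, u, v, weight=1.0):
        self.adj[u][v] = weight
        self.adj[v][u] = weight


def recommend_jobs(graph, resume, job_nodes):
    """Rank JDs reachable via resume -> skill -> JD by summed experience."""
    scores = {}
    for skill, years in graph.adj[resume].items():
        for neighbour in graph.adj[skill]:
            if neighbour in job_nodes:
                scores[neighbour] = scores.get(neighbour, 0.0) + years
    # Highest score first; ties broken by name for a stable order.
    return sorted(scores, key=lambda job: (-scores[job], job))


g = WeightedGraph()
g.add_edge("resume:1", "skill:python", weight=3)  # 3 years of Python
g.add_edge("resume:1", "skill:sql", weight=1)     # 1 year of SQL
g.add_edge("jd:backend", "skill:python")
g.add_edge("jd:backend", "skill:sql")
g.add_edge("jd:analyst", "skill:sql")
g.add_edge("jd:frontend", "skill:javascript")

jobs = {"jd:backend", "jd:analyst", "jd:frontend"}
ranked = recommend_jobs(g, "resume:1", jobs)
```

A JD that shares no skill with the resume never appears in the ranking, which is the two-hop resume-skill-JD connection the graph is meant to expose.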

Amitpatil215 commented 2 years ago

Q. What is the cold start problem?

Challenges with Collaborative Filtering

The main issue with this method is that the model's prediction for a given (user, item) pair is the dot product of the corresponding embeddings. So if an item was not seen during training, the system generally cannot create an embedding for it and therefore cannot query the model with this item. This issue is known as the cold-start problem.

Collaborative filtering depends on historical preferences over a set of items to recommend from. Because it is based on historical data, its core assumption is that users who have agreed in the past will also tend to agree in the future.

(i.e. if a user has been doing a particular job with particular skills, we assume they will keep doing jobs in that skill only, but that might not be the case)
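To make the cold-start issue concrete, here is a small sketch of dot-product scoring over learned embeddings. The embedding values and node names are invented for illustration; a real model would learn them from interaction data.

```python
def predict(user_emb, item_emb, user, item):
    """Score a (user, item) pair as the dot product of their embeddings.

    Returns None when either side has no embedding, i.e. the cold-start case.
    """
    if user not in user_emb or item not in item_emb:
        return None
    return sum(u * v for u, v in zip(user_emb[user], item_emb[item]))


# Hypothetical 2-dimensional embeddings produced by training.
user_emb = {"resume:1": [1.0, 0.5]}
item_emb = {"jd:backend": [2.0, 0.0], "jd:analyst": [0.0, 2.0]}

seen_score = predict(user_emb, item_emb, "resume:1", "jd:backend")
new_item_score = predict(user_emb, item_emb, "resume:1", "jd:new")
```

`jd:new` was never part of training, so there is no vector to multiply and the model has nothing to say about it. A graph that links the new JD to known skill nodes is one way to still reach it.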

Q. What is the link analysis technique?

Link analysis is an analysis technique that focuses on relationships and connections in a dataset. Link analysis gives you the ability to calculate centrality measures—namely degree, betweenness, closeness, and eigenvector—and see the connections on a link chart or link map.
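Degree centrality is the simplest of these measures: a node's degree divided by the maximum possible degree, n - 1. A minimal sketch on a made-up resume/skill/JD edge list (the node names are hypothetical):

```python
def degree_centrality(edges):
    """Normalised degree centrality for an undirected edge list."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


edges = [
    ("resume:1", "skill:python"),
    ("resume:2", "skill:python"),
    ("resume:2", "skill:sql"),
    ("jd:backend", "skill:python"),
]
centrality = degree_centrality(edges)
most_central = max(centrality, key=centrality.get)
```

Ranking nodes this way is one possible reading of "most popular nodes" when choosing what goes into the heterogeneous graph.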

Amitpatil215 commented 2 years ago

Starting intro ->> (Amit)
Amit Start
Problem statement: extending Minor I to Minor II
What did we do before? We researched and learned the basics of knowledge graphs. We built a knowledge graph using NetworkX, taking resumes, JDs, and skills as nodes and experience as the edge weight.
What are we doing now (till now and for end eval)? Cleaned and processed data, credibility score, scalability, Neo4j with the Cypher query language
Amit End

Sanjoli Start
CF, KG, Neo4j
Sanjoli End

Muskan Start
Dataset, nodes, edges, transitivity property between resume and JD

Muskan: start to location (DevType, Age, Operating System, Location)
Sanjoli: gender to salary (Gender, Skills, Dependents, Salary)
Amit: Neo4j

Data Cleaning: DevType
Data Cleaning: Age
Data Cleaning: Operating System
Data Cleaning: Location
Data Cleaning: Gender
Data Cleaning: Skills
Data Cleaning: Dependents
Data Cleaning: Salary
Build Graph on Neo4j

Mystic-Trooper / job_recomendation_KG

Notes #1
