Resume Parsing Accuracy

Yashdew commented 2 years ago

Problem Statement

A resume PDF should be parsed so that the algorithm can reliably distinguish entities such as name, phone number, email, links, education, experience, skills, hobbies, achievements, projects, and certifications.

Steps to follow

You can use NLP packages such as spaCy (see the spaCy documentation).
You can also use keyword-based methods.
Or design a different algorithm that performs the entity recognition described above.
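
As a rough illustration of the keyword-based option, here is a minimal sketch that works on text already extracted from the PDF. It is not the repo's implementation: the regular expressions, the `SECTION_KEYWORDS` table, and the function names are all assumptions made for this example, and a real parser would need broader patterns (international phone formats, multi-column layouts, and so on).

```python
import re

# Hypothetical patterns for this sketch; tune them against real resumes.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"(?<!\d)(?:\+?\d{1,3}[\s-]?)?\d{10}(?!\d)")
URL_RE = re.compile(r"https?://[^\s,;)>\]]+")

# Hypothetical heading keywords; each maps to a key of the output JSON.
SECTION_KEYWORDS = {
    "education": ("education", "academic"),
    "experience": ("experience", "employment", "internship"),
    "skills": ("skills",),
    "projects": ("projects",),
    "achievements": ("achievements", "awards"),
    "hobbies": ("hobbies", "interests"),
}


def match_section(line):
    """Return the section key if the line looks like a heading, else None."""
    heading = line.strip().lower().rstrip(":")
    # Treat only short lines as headings to limit false positives.
    if not heading or len(heading.split()) > 4:
        return None
    for section, keywords in SECTION_KEYWORDS.items():
        if any(keyword in heading for keyword in keywords):
            return section
    return None


def parse_resume_text(text):
    """Extract contact details with regexes and group lines by section heading."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    result = {
        "personal_details": {
            "email": email.group(0) if email else None,
            "mobile_number": phone.group(0) if phone else None,
        },
        "links": URL_RE.findall(text),
    }
    current = None
    for line in text.splitlines():
        section = match_section(line)
        if section:
            current = section
            result.setdefault(section, [])
        elif current and line.strip():
            result[current].append(line.strip())
    return result
```

Names are the hard case for this approach, since no keyword or fixed pattern identifies them; that is where a named-entity model such as spaCy's would likely help more than regexes.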

For reference, you can test the backend in Postman here: LINK. Just send a POST request.
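
If you would rather script the request than use Postman, the sketch below builds an equivalent multipart POST with the Python standard library. The endpoint URL is not given in this issue, so it is passed in as a parameter; the form field name `file` and the helper name `build_upload_request` are assumptions, so check the docs in api/ for what the backend actually expects.

```python
import urllib.request
import uuid


def build_upload_request(url, pdf_bytes, filename="resume.pdf", field="file"):
    """Build (but do not send) a multipart/form-data POST carrying one PDF.

    `field` is a hypothetical form field name; the real one may differ.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Usage (replace backend_url with the real endpoint before running):
# import json
# with open("resume.pdf", "rb") as fh:
#     request = build_upload_request(backend_url, fh.read())
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```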

How to contribute

Check out the api/ directory of the repo and follow the docs.

The final output should look like this:

[
    {
        "personal_details": {
            "name": "Yash Dewangan",
            "email": "yashdewangan123456@gmail.com",
            "mobile_number": "8602842290"
        },
        "skills": [
            "Pandas",
            "Coding",
            "C",
            "Flask",
            "Css",
            "Java",
            "C++",
            "Django",
            "Rest"
        ],
        "education": [
            "SMT. KASHIBAI NAVALE COLLEGE OF ENGINEERING
            BE in Information Technology
            2018-2022 | Pune, MH
            Cum. GPA: 8.14"
        ],
        "experience": [
            "eQ Technologic | Software Engineer Intern
            Aug 2021 – Present
            Implemented various services/APIs needed for new features required in the latest release
            Learnt about SOA architecture, modular coding i.e. keeping future use in mind
            Implementation of concepts such as Tagging Entities and  Groups/User Authorization & Permissions for Entities
            Worked on Backend technologies such as Spring and Java with SQL Server as Database"
        ],
        "no_of_pages": 1,
        "links": {
            "linkedin": "https://www.linkedin.com/in/iyashdewangan/",
            "leetcode": "https://leetcode.com/Yashdew/",
            "codechef": "https://www.codechef.com/users/yashdew",
            "codeforces": "http://codeforces.com/profile/yashdewangan123456",
            "github": [
                "https://github.com/Yashdew/Attendance-Tracker",
                "https://github.com/Yashdew",
                "https://github.com/SkSumit/Chatistics"
            ],
            "others": [
                "mailto:yashdewangan123456@gmail.com",
                "https://www.spoj.com/users/yashdew/",
                "https://attendancesknhc.herokuapp.com/",
                "https://chatistics.vercel.app/",
                "https://auth.geeksforgeeks.org/user/yashdewangan123456/practice/"
            ]
        },
        "total_experience": 0.17,
        "projects": [
            "CHATISTICS
            GitHub Live URL
            Dec 2020 - Feb 2021
            An open-source WhatsApp chats analyser and statistics.
            Application, which provides various meaningful insights.
            Time complexity reduces from 20 seconds. to 5 seconds.
            Used Flask for implementing backend REST APIs with firebase database for analysis of traffic.
            Pandas for data pre-processing.
            Used NextJS and Bulma UI for frontend.
            500+ users and 30 stars on GitHub.",

            "ATTENDANCE-TRACKER
            GitHub Live URL
            July 2020 – Aug 2020
            A full-stack web application for monitoring the attendance in Microsoft Teams from logs file of the meeting. (Sample)
            Optimization of code took around 3 seconds in Data pre-processing.
            Worked on building the major backend part and frontend.
            Used Flask for implementing Backend and HTML, CSS & JS for frontend.
            Used Mongo DB and Google sheet API for Database.
            Data pre-processing of large logs files for calculating time stamps of students using pandas
            50+ users in our college."
        ],
        "achievements": [
            "Codechef - Maximum rating 1603 (3-star).",
            "Codechef – March Lunchtime 2021 Div-3, secured a rank of 825 out of 7000+ participants.",
            "Leetcode – 150+ Solved Questions.",
            "250+ Solved Questions on GFG, Codechef, SPOJ and Codeforces.",
            "Participated in Google kickstart 2021 Round A, Round C & Round D.",
            "Secured 1st rank out of 30+ participants in Scaler Edge Apex 2021. (SKN Edition)",
            "Represented Hack Club SKN projects in Hack Club Asia Summit 2021.",
            "Participated in more than 30+ coding competitions."
        ],
        "hobbies": [
            "Photography and Video editing",
            "Traveling and exploring new places.",
            "Gaming"
        ]
    }
]
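
The "links" object above groups URLs by site. One way this grouping could be produced is sketched below; the `KNOWN_HOSTS` table and the rule of keeping a single profile URL per site while collecting every GitHub URL in a list are assumptions inferred from the sample output, not the project's confirmed logic.

```python
from urllib.parse import urlparse

# Hypothetical host-to-key table, inferred from the sample output.
KNOWN_HOSTS = {
    "linkedin.com": "linkedin",
    "leetcode.com": "leetcode",
    "codechef.com": "codechef",
    "codeforces.com": "codeforces",
    "github.com": "github",
}


def group_links(urls):
    """Group URLs by known site; anything unrecognised goes under "others"."""
    grouped = {"github": [], "others": []}
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        key = KNOWN_HOSTS.get(host)
        if key == "github":
            # GitHub links may point at a profile or at several repos.
            grouped["github"].append(url)
        elif key:
            # Keep only the first profile link seen for each site.
            grouped.setdefault(key, url)
        else:
            # Includes mailto: links, which have no hostname.
            grouped["others"].append(url)
    return grouped
```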

deathsurgeon1 commented 2 years ago

@Yashdew I would like to work on this. Do you have any sample resumes for which the parsing accuracy is not meeting expectations?

Yashdew commented 2 years ago

@deathsurgeon1 Here is the folder for testing.

deathsurgeon1 commented 2 years ago

@Yashdew I am working on improving the accuracy of the parser. I have found some points for improvement and will post an update soon.

Yashdew commented 2 years ago

@deathsurgeon1 sure thing man!

Yashdew / Assessor

Resume Parsing Accuracy #32