bjherger / ResumeParser

A framework to parse resumes, extract contact & other information, and check for required terms

Can you help me fetch the name separately? #8

Closed Shrinidhikulkarni7 closed 7 years ago

Shrinidhikulkarni7 commented 7 years ago

Hi, amazing job, but it would be helpful if I could fetch the name from the resume.

bjherger commented 7 years ago

@Shrinidhikulkarni7 : I'm not sure that I understand. Which name? The applicant name?

Shrinidhikulkarni7 commented 7 years ago

@bjherger Yes, the applicant name.

iHirenDev commented 7 years ago

@Shrinidhikulkarni7 nltk is a library that can be used to extract human names. Below is the code for that:

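```python
import nltk
from nltk.corpus import stopwords

stop = stopwords.words('english')  # defined in the original snippet; not used below


def ie_preprocess(document):
    # Not shown in the original comment. Assumed here to be the usual NLTK
    # pipeline: split into sentences, tokenize, then part-of-speech tag.
    sentences = nltk.sent_tokenize(document)
    sentences = [nltk.word_tokenize(sentence) for sentence in sentences]
    return [nltk.pos_tag(sentence) for sentence in sentences]


def extract_names(document):
    names = []
    for tagged_sentence in ie_preprocess(document):
        # ne_chunk wraps each named entity in a subtree labelled with its type
        for chunk in nltk.ne_chunk(tagged_sentence):
            if isinstance(chunk, nltk.tree.Tree) and chunk.label() == 'PERSON':
                names.append(' '.join(token for token, tag in chunk))
    return names
```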

I am using this code in ResumeParser, but it is not a perfect name fetcher. I got output like:

Names: [u'Brendan,Herger', u'Hiren,Patel,Address', u'Albert,Street', u'Sanket,Rajendra,Mantri', u'William', u'San,Franc']

As you can see, it has treated Address, Albert, and Street as names. I guess that is a limitation of the nltk library.
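One way to trim false positives like these is a small post-filter over the NER output. The sketch below is only an illustration and is not part of ResumeParser: the `NON_NAME_TOKENS` set and the `filter_names` helper are hypothetical, and the word list is an assumption that would need tuning against real resumes.

```python
# Hypothetical post-filter for NER name candidates; not part of ResumeParser.
# Words that tend to get mis-tagged as PERSON on resumes (assumed list).
NON_NAME_TOKENS = {"address", "street", "road", "avenue", "email", "phone"}


def filter_names(candidates, min_tokens=2):
    """Strip known non-name words, then keep candidates that still look like full names."""
    kept = []
    for candidate in candidates:
        # The output above joins tokens with commas; accept spaces as well.
        tokens = [
            token
            for token in candidate.replace(",", " ").split()
            if token.lower() not in NON_NAME_TOKENS
        ]
        # Require at least a first and last name by default.
        if len(tokens) >= min_tokens:
            kept.append(" ".join(tokens))
    return kept


candidates = ["Brendan,Herger", "Hiren,Patel,Address", "Albert,Street",
              "Sanket,Rajendra,Mantri", "William", "San,Franc"]
print(filter_names(candidates))
# ['Brendan Herger', 'Hiren Patel', 'Sanket Rajendra Mantri', 'San Franc']
```

A filter like this only reduces noise. "San Franc" still slips through, and a genuine single-word name such as "William" is dropped, so it cannot decide on its own which name belongs to the applicant.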

I am also looking for a good solution. Hope this helps. Good luck.

bjherger commented 7 years ago

A more robust approach might be to list out the people and organizations on the resume. For example, I've worked for Congresswoman Gabby Giffords, so an NER search on my resume would include both Brendan Herger (me) and Gabby Giffords (a former employer).

I'm working on an approach similar to @iHirenDev's, which uses Stanford's NER engine and NLTK's interface to it.

Unfortunately, NLTK's interface is pretty clumsy, and requires managing the Stanford NER jar separately.
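For context, Stanford-style taggers label individual tokens rather than whole entities, so adjacent tokens with the same label still have to be merged into names. A minimal sketch of that merge step follows. `group_entities` is a hypothetical helper, and the sample `tagged` list is made up to mirror the `(token, label)` pairs that NLTK's `StanfordNERTagger.tag` returns, with `O` marking tokens outside any entity. It runs without the Stanford jar.

```python
from itertools import groupby


def group_entities(tagged_tokens, wanted=("PERSON", "ORGANIZATION")):
    """Merge runs of adjacent tokens that share a label into (label, text) pairs."""
    entities = []
    # groupby collects consecutive items with the same label into one run
    for label, run in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if label in wanted:
            entities.append((label, " ".join(token for token, _ in run)))
    return entities


# Made-up sample shaped like a tagger's output: (token, label) pairs.
tagged = [
    ("Jane", "PERSON"), ("Doe", "PERSON"),
    ("worked", "O"), ("at", "O"),
    ("Acme", "ORGANIZATION"), ("Corp", "ORGANIZATION"),
]
print(group_entities(tagged))
# [('PERSON', 'Jane Doe'), ('ORGANIZATION', 'Acme Corp')]
```

One caveat of this simple grouping: two different people listed back to back with no other token between them would be merged into a single entity.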

bjherger commented 7 years ago

@Shrinidhikulkarni7 : I've included this functionality w/ version 3.x. Please let me know if it doesn't suit your needs.