OmkarPathak / pyresparser

A simple resume parser used for extracting information from resumes
GNU General Public License v3.0
773 stars 394 forks

Add string parsing #59

Open mobiuscreek opened 2 years ago

mobiuscreek commented 2 years ago

Summary

Thank you for the very nice library, @OmkarPathak. This PR adds the minimum changes needed to make string parsing work in pyresparser. For example, one can now do the following:

```python
from pyresparser import ResumeParser

data = ResumeParser('random_string').get_extracted_data()
```

However, since the string has to be properly formatted (see the test here), I don't know how useful this is in its current form (for example, for the use mentioned in #45). In my case, I already had an Excel file that I imported into a dataframe, which I then read row by row. I parsed each string with pyresparser to extract what I needed from each paragraph (one paragraph per row in my case).
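For reference, the row-by-row workflow I describe could look roughly like the sketch below. This is not part of the PR: `parse_rows` and `pyresparser_parse` are hypothetical helper names, and `pyresparser_parse` assumes the string support added here. The parser is passed in as a callable so the loop itself does not depend on pyresparser being installed.

```python
from typing import Callable, Dict, Iterable, List


def parse_rows(rows: Iterable[object], parse: Callable[[str], Dict]) -> List[Dict]:
    """Apply `parse` to every non-empty string in `rows` and collect the results.

    Hypothetical helper, not part of pyresparser. Non-string cells
    (for example None or NaN from an empty spreadsheet cell) and
    whitespace-only strings are skipped.
    """
    results = []
    for text in rows:
        if not isinstance(text, str) or not text.strip():
            continue  # nothing to parse in this row
        results.append(parse(text))
    return results


def pyresparser_parse(text: str) -> Dict:
    """Parse one string with pyresparser (assumes the string parsing from this PR)."""
    # Imported lazily so parse_rows can be used and tested without pyresparser.
    from pyresparser import ResumeParser

    return ResumeParser(text).get_extracted_data()
```

With a dataframe `df` loaded from the Excel file and a text column named, say, `"text"` (an assumed column name), the call would be `parse_rows(df["text"], pyresparser_parse)`.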

Even though the added function just returns its input, the accompanying test makes sure it works as intended in case further changes are needed. Lastly, the test_name.py file was renamed to test_pyresparser.py and moved to a test folder. The test_local_name.py test was failing due to a problem with the format of the PDF file, so I changed it to test the extracted skills instead of the name. I hope that makes sense.