UBC-MDS / nlpsummarize

Python package that provides a nice summary of all character columns in a pandas dataframe.
MIT License

Combining functions into one class #55

Closed KarlosMuradyan closed 4 years ago

KarlosMuradyan commented 4 years ago

The separate files still exist, but will be removed in later commits.

I implemented a class that contains all the functions and modified the tests to handle this change. The class has all the functionality of pd.DataFrame as well as the four functions that we wrote. When creating an instance, we can pass the name of the column we are interested in; otherwise the class automatically detects the column containing text and uses it for the summaries. Every function also takes an optional column argument, so a summary can be generated for a specific column. Unfortunately, for now each function can only summarize one column at a time.

I also wrote a get_nlp_summary() function that collects the output of all the functions and returns a single dataframe with the outputs concatenated together. The class is implemented in the nlp.py file, which also provides read_csv() and read_excel() functions similar to those in pandas. Now you can write:

from nlpsummarize import nlp
df = nlp.read_csv(path_to_csv)
type(df)
[0] nlpsummarize.nlp.NLPFrame

You can pass these functions any argument that pd.read_csv or pd.read_excel accepts, so they are as flexible as their pandas counterparts.
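For anyone reviewing without the diff open, here is a minimal sketch of how a pd.DataFrame subclass with a forwarding reader could look. It is not the code in nlp.py: the class body, the detect_text_column() helper, and the "longest average string" heuristic are assumptions made up for illustration. Only the names NLPFrame and read_csv come from the description above.

```python
import io

import pandas as pd


class NLPFrame(pd.DataFrame):
    """Hypothetical stand-in for the class described in this PR."""

    @property
    def _constructor(self):
        # Make pandas operations (slicing, copying, ...) return NLPFrame
        # instead of falling back to a plain DataFrame.
        return NLPFrame

    def detect_text_column(self):
        """Return the all-string column with the longest average text, or None.

        The heuristic is an assumption for this sketch, not the PR's logic.
        """
        best, best_len = None, -1.0
        for name in self.columns:
            values = self[name].dropna()
            # Skip empty columns and columns holding anything but strings.
            if values.empty or not values.map(lambda v: isinstance(v, str)).all():
                continue
            mean_len = values.str.len().mean()
            if mean_len > best_len:
                best, best_len = name, mean_len
        return best


def read_csv(*args, **kwargs):
    """Forward every argument to pd.read_csv and wrap the result."""
    return NLPFrame(pd.read_csv(*args, **kwargs))


# Usage with an in-memory CSV so the example needs no file on disk.
csv_text = "id,text\n1,hello world\n2,another short sentence\n"
df = read_csv(io.StringIO(csv_text), sep=",")
print(type(df).__name__)        # NLPFrame
print(df.detect_text_column())  # text
```

Because read_csv() only forwards *args and **kwargs, any option that pd.read_csv understands (sep, usecols, encoding, and so on) works without the wrapper having to know about it. A read_excel() wrapper could follow the same pattern.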