CODAIT / text-extensions-for-pandas

Natural language processing support for Pandas dataframes.
Apache License 2.0

Create cleaning util module mirroring util.py #211

Closed ZachEichen closed 2 years ago

ZachEichen commented 3 years ago

Addresses issue #196

Creates a module in Text Extensions that folds the functionality of util.py into the main library.

Provides functionality for the following:

ZachEichen commented 3 years ago

All requested changes have been made. The new util.py has been split into 3 sub-modules:

which together encompass the same functionality.

All of the CoNLL_* notebooks have been updated to use the new module.

Ray dependencies are now on-demand: ray is imported only when a Ray-specific function is called, not when the module itself is imported.

A new tutorial notebook, Cross_check_datapoints.ipynb, has been created to give an overview of the functionality and to demonstrate it on a classification task.

frreiss commented 2 years ago

@ZachEichen can you please pull the latest fixes from the master branch into your PR branch to unblock the CI tests?