neomatrix369 / nlp_profiler

A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level, granular statistical information about the text in that column.

Renaming column words_count to count_words #27

Closed neomatrix369 closed 3 years ago

neomatrix369 commented 3 years ago

Renaming column: following on from @jammy-bot's changes, this redoes the rename across code, data and notebooks.

Thanks @jammy-bot for raising this and for your initial PRs #18 and #23 (plus issue #24). I have applied the changes for you; have a look at the commits to see how it was done.

Feel free to continue with #24 and anything else you fancy in the issues section: https://github.com/neomatrix369/nlp_profiler/issues
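For readers following along, the data side of such a rename can be sketched roughly as below. This is only an illustration, not the code from the commits: the helper `rename_csv_column` and the sample CSV text are hypothetical, and only the column names `words_count` and `count_words` come from this PR's title.

```python
import csv
import io

# Column names taken from the PR title; everything else here is illustrative.
OLD_NAME = "words_count"
NEW_NAME = "count_words"


def rename_csv_column(csv_text: str, old: str = OLD_NAME, new: str = NEW_NAME) -> str:
    """Return csv_text with the header `old` renamed to `new`; data rows are unchanged."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text  # nothing to rename in an empty file
    # Only the header row is rewritten.
    header = [new if name == old else name for name in rows[0]]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows[1:])
    return out.getvalue()


# Hypothetical sample data standing in for a generated dataset file.
sample = "text,words_count\nhello world,2\n"
renamed = rename_csv_column(sample)
print(renamed)
```

The same old-to-new mapping would also need to be applied wherever the column name appears in code and notebooks, which is why the rename touches all three.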

codecov-io commented 3 years ago

Codecov Report

Merging #27 into master will not change coverage. The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master       #27   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           20        20           
  Lines          358       358           
  Branches        51        51           
=========================================
  Hits           358       358

Impacted Files	Coverage Δ
nlp_profiler/words.py	`100.00% <ø> (ø)`
nlp_profiler/constants.py	`100.00% <100.00%> (ø)`
nlp_profiler/granular_features.py	`100.00% <100.00%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 21c51bd...aebfa65.

sourcery-ai[bot] commented 3 years ago

Sourcery Code Quality Report

Merging this PR leaves code quality unchanged.

Quality metrics	Before	After	Change
Complexity	0.33 ⭐	0.33 ⭐	0.00
Method Length	43.50 ⭐	43.50 ⭐	0.00
Working memory	7.83 🙂	7.83 🙂	0.00
Quality	87.41% ⭐	87.41% ⭐	0.00%

Other metrics	Before	After	Change
Lines	105	105	0

Changed files	Quality Before	Quality After	Quality Change
nlp_profiler/constants.py	85.66% ⭐	85.66% ⭐	0.00%
nlp_profiler/granular_features.py	74.39% 🙂	74.39% 🙂	0.00%
nlp_profiler/words.py	96.67% ⭐	96.67% ⭐	0.00%

Here are some functions in these files that still need a tune-up:

File	Function	Complexity	Length	Working Memory	Quality	Recommendation
nlp_profiler/granular_features.py	apply_granular_features	0	70 🙂	31 ⛔	57.70% 🙂	Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

⭐ excellent
🙂 good
😞 poor
⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.

Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Let us know what you think of it by mentioning @sourcery-ai in a comment.

neomatrix369 commented 3 years ago

@sourcery-ai great report (https://github.com/neomatrix369/nlp_profiler/pull/27#issuecomment-709986856); this will keep me and my community on our toes.

Thanks for including working memory in the report; it's one of the things I have at the back of my mind. So far, NFRs (non-functional requirements) have not been given the importance I would have loved to give them.

Suggestion: add a feature that lists the different issues that could be created from the report (show these suggestions as a checklist, with priority information for each item). Happy to discuss this further.