neomatrix369 / nlp_profiler

A simple NLP library that allows profiling datasets with one or more text columns. When given a dataset and a column name containing text data, NLP Profiler will return either high-level insights or low-level/granular statistical information about the text in that column.


Sourcery refactored master branch #36

Closed by sourcery-ai[bot] 4 years ago

sourcery-ai[bot] commented 4 years ago

Branch master refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin.

Review changes via command line

To manually merge these changes, make sure you're on the master branch, then run:

git fetch origin sourcery/master
git merge --ff-only FETCH_HEAD
git reset HEAD^

sourcery-ai[bot] commented 4 years ago

Sourcery Code Quality Report

✅ Merging this PR will increase code quality in the affected files by 0.43%.

Quality metrics	Before	After	Change
Complexity	1.07 ⭐	1.07 ⭐	0.00
Method Length	23.27 ⭐	22.73 ⭐	-0.54 👍
Working memory	4.93 ⭐	4.80 ⭐	-0.13 👍
Quality	91.03% ⭐	91.46% ⭐	0.43% 👍

Other metrics	Before	After	Change
Lines	97	95	-2

Changed files	Quality Before	Quality After	Quality Change
nlp_profiler/sentiment_polarity.py	85.57% ⭐	85.63% ⭐	0.06% 👍
nlp_profiler/granular_features/numbers.py	95.63% ⭐	96.67% ⭐	1.04% 👍
nlp_profiler/granular_features/stop_words.py	91.41% ⭐	92.43% ⭐	1.02% 👍
slow-tests/performance_tests/common_functions.py	94.11% ⭐	94.11% ⭐	0.00%

Here are some functions in these files that still need a tune-up:

File	Function	Complexity	Length	Working Memory	Quality	Recommendation

Legend and Explanation

The emojis denote the absolute quality of the code:

⭐ excellent
🙂 good
😞 poor
⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.

Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Let us know what you think of it by mentioning @sourcery-ai in a comment.

neomatrix369 commented 4 years ago

@sourcery-ai please take a look at this comment: https://github.com/neomatrix369/nlp_profiler/pull/36#issuecomment-710039204. The formatting could be improved here: the empty table can be removed if no recommendations exist.

codecov-io commented 4 years ago

Codecov Report

Merging #36 into master will not change coverage. The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master       #36   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           21        21           
  Lines          360       358    -2     
  Branches        51        51           
=========================================
- Hits           360       358    -2

Impacted Files	Coverage Δ
nlp_profiler/granular_features/numbers.py	`100.00% <100.00%> (ø)`
nlp_profiler/granular_features/stop_words.py	`100.00% <100.00%> (ø)`
nlp_profiler/sentiment_polarity.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 41cd9eb...738d3e8. Read the comment docs.
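
The figures in the coverage diff above can be checked by hand. The following is a minimal sketch, assuming coverage is simply hits divided by tracked lines (Codecov's own calculation also accounts for partial hits and misses, none of which appear in this report). The helper name `coverage_percent` is hypothetical and used only for illustration.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Line coverage as a percentage; an empty line set counts as fully covered."""
    if lines == 0:
        return 100.0
    return 100.0 * hits / lines


# Figures taken from the coverage diff above.
before = coverage_percent(hits=360, lines=360)  # master
after = coverage_percent(hits=358, lines=358)   # this pull request

print(f"master: {before:.2f}%")
print(f"#36:    {after:.2f}%")
print(f"change: {after - before:+.2f}")
```

Both hits and lines drop by 2 because the refactoring removed two fully covered lines, so the ratio, and therefore the reported coverage, stays the same.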

neomatrix369 commented 4 years ago

@sourcery-ai I'm guessing that the working memory figures in your reports are not generated via memory profiling? It might help to mention that in the report, because it now seems to have gone away, but I know that many aspects of this library do not run at optimum memory usage, and that is one of the things I'm going to fix in the near future.
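
For contrast, actual runtime memory usage has to be measured while the code runs. Below is a minimal sketch using Python's standard `tracemalloc` module. The names `peak_memory_bytes` and `count_words` are hypothetical and used only for illustration; neither is part of nlp_profiler or of Sourcery's reports.

```python
import tracemalloc


def peak_memory_bytes(func, *args, **kwargs):
    """Run func and return (result, peak bytes allocated by Python while it ran)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def count_words(texts):
    # Hypothetical stand-in for a text profiling step; not an nlp_profiler function.
    return [len(text.split()) for text in texts]


sample = ["the quick brown fox"] * 10_000
counts, peak = peak_memory_bytes(count_words, sample)
print(f"peak allocation: {peak / 1024:.1f} KiB")
```

A measurement like this reflects allocations made at run time, so it can change with the input data even when the source code, and any metric derived from reading it, stays the same.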