neomatrix369 / nlp_profiler

A simple NLP library for profiling datasets with one or more text columns. Given a dataset and the name of a column containing text data, NLP Profiler returns either high-level insights or low-level (granular) statistical information about the text in that column.

Revert "Spelling checker has been modified" #75

Closed neomatrix369 closed 1 year ago

neomatrix369 commented 1 year ago

Reverts neomatrix369/nlp_profiler#71

Reverting the spell check to the previous implementation, because the new one does not catch misspelt words that the previous one caught. This needs further review before the new check is re-applied.
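One way to guard a future re-application of the change is a regression test that pins the words the previous checker flagged. The sketch below is illustrative only: `find_misspelt_words`, `KNOWN_WORDS`, and the sample sentence are hypothetical stand-ins and are not part of nlp_profiler's API.

```python
import re

# Hypothetical stand-in vocabulary; a real checker would use a full dictionary.
KNOWN_WORDS = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}


def find_misspelt_words(text, vocabulary=KNOWN_WORDS):
    """Return the words in `text` that are absent from `vocabulary`, in order."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in vocabulary]


def test_previously_caught_words_are_still_caught():
    # Words the old checker flagged must still be flagged by any replacement.
    text = "The quikc brown fox jumsp over the lazy dog"
    assert find_misspelt_words(text) == ["quikc", "jumsp"]
```

Running a test of this shape against both the old and the new checker would show whether the replacement misses words the old one caught.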

sourcery-ai[bot] commented 1 year ago

Sourcery Code Quality Report

✅  Merging this PR will increase code quality in the affected files by 1.12%.

| Quality metrics | Before | After | Change |
|---|---|---|---|
| Complexity | 2.25 ⭐ | 1.94 ⭐ | -0.31 👍 |
| Method Length | 43.77 ⭐ | 41.15 ⭐ | -2.62 👍 |
| Working memory | 5.31 ⭐ | 5.17 ⭐ | -0.14 👍 |
| Quality | 85.51% ⭐ | 86.63% ⭐ | 1.12% 👍 |

| Other metrics | Before | After | Change |
|---|---|---|---|
| Lines | 337 | 320 | -17 |

| Changed files | Quality Before | Quality After | Quality Change |
|---|---|---|---|
| setup.py | 67.46% 🙂 | 67.46% 🙂 | 0.00% |
| nlp_profiler/high_level_features/ease_of_reading_check.py | 85.18% ⭐ | 85.73% ⭐ | 0.55% 👍 |
| nlp_profiler/high_level_features/grammar_quality_check.py | 90.16% ⭐ | 90.11% ⭐ | -0.05% 👎 |
| nlp_profiler/high_level_features/spelling_quality_check.py | 84.28% ⭐ | 87.36% ⭐ | 3.08% 👍 |
| tests/granular/test_nounphrase.py | % | % | % |
| tests/granular/test_syllables.py | 90.40% ⭐ | 90.40% ⭐ | 0.00% |
| tests/high_level/test_ease_of_reading_check.py | 87.93% ⭐ | 87.93% ⭐ | 0.00% |
| tests/high_level/test_grammar_check.py | 87.91% ⭐ | 87.91% ⭐ | 0.00% |

Here are some functions in these files that still need a tune-up:

| File | Function | Complexity | Length | Working Memory | Quality | Recommendation |
|---|---|---|---|---|---|---|

Legend and Explanation

The emojis denote the absolute quality of the code:

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.


codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (dde3172) 100.00% compared to head (2cddf51) 100.00%.

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##            master       #75   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           26        26           
  Lines          498       498           
  Branches        74        74           
=========================================
  Hits           498       498           
```

| [Impacted Files](https://codecov.io/gh/neomatrix369/nlp_profiler/pull/75?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None) | Coverage Δ | |
|---|---|---|
| [...filer/high\_level\_features/grammar\_quality\_check.py](https://codecov.io/gh/neomatrix369/nlp_profiler/pull/75?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=None#diff-bmxwX3Byb2ZpbGVyL2hpZ2hfbGV2ZWxfZmVhdHVyZXMvZ3JhbW1hcl9xdWFsaXR5X2NoZWNrLnB5) | `100.00% <100.00%> (ø)` | |


neomatrix369 commented 1 year ago

Merging for now. I will review the Windows report-writing error, which is caused by Unicode encoding, in a separate issue.
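The thread does not say where the Windows error originates. Assuming it comes from writing text with the platform's default encoding (on Windows, often a legacy code page that cannot represent characters such as emoji), a common remedy is to pass `encoding="utf-8"` explicitly. The sketch below uses hypothetical helper names (`write_report`, `read_report`) and a made-up report line; it is not the project's actual report-writing code.

```python
import tempfile
from pathlib import Path


def write_report(path, text):
    """Write `text` as UTF-8 regardless of the platform's default encoding."""
    Path(path).write_text(text, encoding="utf-8")


def read_report(path):
    """Read the report back using the same explicit encoding."""
    return Path(path).read_text(encoding="utf-8")


# Round-trip a line containing a non-ASCII character in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    report_path = Path(tmp) / "report.txt"
    write_report(report_path, "Quality 86.63% ⭐")
    round_tripped = read_report(report_path)
```

Setting the encoding explicitly on both the write and the read keeps the behaviour the same across operating systems.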