adbar / py3langid

Faster, modernized fork of the language identification tool langid.py
https://adrien.barbaresi.eu/blog/language-detection-langid-py-faster.html

Sourcery refactored master branch #4

Closed by sourcery-ai[bot] 2 years ago

sourcery-ai[bot] commented 2 years ago

Branch master refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin.

Review changes via command line

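To manually merge these changes, make sure you're on the master branch, then run the commands below. The final reset moves master back one commit and leaves the refactored changes unstaged in your working tree, so you can review them before committing: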

git fetch origin sourcery/master
git merge --ff-only FETCH_HEAD
git reset HEAD^

sourcery-ai[bot] commented 2 years ago

Sourcery Code Quality Report

❌  Merging this PR will decrease code quality in the affected files by 0.14%.

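| Quality metrics | Before | After | Change |
| --- | --- | --- | --- |
| Complexity | 9.63 🙂 | 9.56 🙂 | -0.07 👍 |
| Method Length | 55.38 ⭐ | 55.17 ⭐ | -0.21 👍 |
| Working memory | 10.41 😞 | 10.50 😞 | 0.09 👎 |
| Quality | 60.99% 🙂 | 60.85% 🙂 | -0.14% 👎 |

| Other metrics | Before | After | Change |
| --- | --- | --- | --- |
| Lines | 1178 | 1183 | 5 |
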
| Changed files | Quality Before | Quality After | Quality Change |
| --- | --- | --- | --- |
| setup.py | 92.16% ⭐ | 92.24% ⭐ | 0.08% 👍 |
| py3langid/examples/_twokenize.py | 55.85% 🙂 | 55.45% 🙂 | -0.40% 👎 |
| py3langid/tools/printfeats.py | 94.84% ⭐ | 96.01% ⭐ | 1.17% 👍 |
| py3langid/train/IGweight.py | 47.50% 😞 | 47.52% 😞 | 0.02% 👍 |
| py3langid/train/NBtrain.py | 61.60% 🙂 | 61.56% 🙂 | -0.04% 👎 |
| py3langid/train/common.py | 80.33% ⭐ | 81.22% ⭐ | 0.89% 👍 |
| py3langid/train/index.py | 63.10% 🙂 | 63.28% 🙂 | 0.18% 👍 |
| py3langid/train/tokenize.py | 54.15% 🙂 | 54.13% 🙂 | -0.02% 👎 |
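
The per-file figures can be sanity-checked by hand. The sketch below is not part of Sourcery; it assumes the Quality Change column is the plain difference "after minus before" in percentage points, which the reported numbers are consistent with. The names `REPORTED` and `quality_change` are hypothetical and exist only for this illustration.

```python
from decimal import Decimal

# (file, quality before, quality after), copied from the report.
# Values are kept as strings so Decimal can parse them exactly.
REPORTED = [
    ("setup.py", "92.16", "92.24"),
    ("py3langid/examples/_twokenize.py", "55.85", "55.45"),
    ("py3langid/tools/printfeats.py", "94.84", "96.01"),
    ("py3langid/train/IGweight.py", "47.50", "47.52"),
    ("py3langid/train/NBtrain.py", "61.60", "61.56"),
    ("py3langid/train/common.py", "80.33", "81.22"),
    ("py3langid/train/index.py", "63.10", "63.28"),
    ("py3langid/train/tokenize.py", "54.15", "54.13"),
]


def quality_change(before: str, after: str) -> Decimal:
    """Return after - before in percentage points (assumed definition)."""
    return Decimal(after) - Decimal(before)


for path, before, after in REPORTED:
    delta = quality_change(before, after)
    if delta > 0:
        trend = "improved"
    elif delta < 0:
        trend = "worse"
    else:
        trend = "unchanged"
    print(f"{path}: {delta:+} ({trend})")
```

Using `Decimal` rather than `float` keeps the two-decimal values exact, so the recomputed differences can be compared directly with the figures in the report.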

Here are some functions in these files that still need a tune-up:

| File | Function | Complexity | Length | Working Memory | Quality | Recommendation |
| --- | --- | --- | --- | --- | --- | --- |
| py3langid/train/tokenize.py | pass_tokenize | 39 ⛔ | 315 ⛔ | 12 😞 | 21.64% ⛔ | Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions |
| py3langid/train/IGweight.py | pass_IG | 22 😞 | 305 ⛔ | 13 😞 | 28.57% 😞 | Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions |
| py3langid/train/NBtrain.py | pass_tokenize | 25 😞 | 192 😞 | 11 😞 | 35.96% 😞 | Refactor to reduce nesting. Try splitting into smaller methods. Extract out complex expressions |
| py3langid/train/index.py | CorpusIndexer.__init__ | 10 🙂 | 146 😞 | 12 😞 | 50.01% 🙂 | Try splitting into smaller methods. Extract out complex expressions |
| py3langid/train/index.py | CorpusIndexer.prune_min_domain | 9 🙂 | 104 🙂 | 11 😞 | 57.78% 🙂 | Extract out complex expressions |

Legend and Explanation

The emojis denote the absolute quality of the code. Judging by the scores in this report, they run from best to worst as ⭐, 🙂, 😞, ⛔.

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.


Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!
