quanteda / quanteda.textstats

Textual statistics for quanteda
GNU General Public License v3.0

Update proxy c #45

Closed koheiw closed 2 years ago

koheiw commented 2 years ago

Use the use_nan argument that was recently added to proxyC. Its functions return NaN for empty or uniform vectors, in the same way as proxy::simil() or stats::dist(), but only when the measure is correlation or cosine.
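As a rough illustration of the call this change relies on, the sketch below passes `use_nan` to `proxyC::simil()` on a small sparse matrix that contains an all-zero row. The matrix and its row and column names are made up for the example, and it assumes a proxyC version recent enough to have the `use_nan` argument.

```r
library(Matrix)
library(proxyC)

# Hypothetical toy data: three "documents" by three features; the second row is empty.
m <- Matrix(c(1, 2, 0,
              0, 0, 0,
              3, 1, 4),
            nrow = 3, byrow = TRUE, sparse = TRUE,
            dimnames = list(c("d1", "empty", "d3"), c("a", "b", "c")))

# Cosine similarity between rows; pairs involving the empty row come back as NaN.
s_nan <- simil(m, margin = 1, method = "cosine", use_nan = TRUE)

# With use_nan = FALSE the same pairs come back as zero instead.
s_zero <- simil(m, margin = 1, method = "cosine", use_nan = FALSE)

print(as.matrix(s_nan))
print(as.matrix(s_zero))
```

Storing NaN makes the result denser than the zero-filled version, which is presumably why it is opt-in.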

proxyC returns zero when the measure is based on set theory or on a vector space model, because empty sets have no common elements and the distance between two origins is zero. Cosine is also a spatial measure, but proxyC::simil() still returns NaN for empty vectors because it is about angles, not coordinates.
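The distinction can be checked by hand in base R. The helpers below are written only for this illustration and are not proxyC's implementation: cosine divides by the vector norms, so an empty vector gives 0/0, while a Euclidean distance or a count of shared features is well defined at zero.

```r
# Hand-written helpers for illustration only (not proxyC's implementation).
cosine <- function(x, y) sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
euclidean <- function(x, y) sqrt(sum((x - y)^2))
n_common <- function(x, y) sum(x > 0 & y > 0)

empty <- c(0, 0, 0)
doc <- c(1, 2, 0)

cos_empty <- cosine(empty, doc)         # 0 / 0: the angle is undefined
dist_empty <- euclidean(empty, empty)   # both points sit at the origin
common_empty <- n_common(empty, empty)  # two empty sets share no features

print(c(cosine = cos_empty, euclidean = dist_empty, common = common_empty))
```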

codecov[bot] commented 2 years ago

Codecov Report

Merging #45 (745af45) into master (6a89b78) will decrease coverage by 0.02%. The diff coverage is 100.00%.

:exclamation: Current head 745af45 differs from pull request most recent head 8dcdc37. Consider uploading reports for the commit 8dcdc37 to get more accurate results.

@@            Coverage Diff             @@
##           master      #45      +/-   ##
==========================================
- Coverage   82.22%   82.20%   -0.03%     
==========================================
  Files          16       16              
  Lines        1176     1152      -24     
==========================================
- Hits          967      947      -20     
+ Misses        209      205       -4     

| Impacted Files | Coverage Δ |
|---|---|
| R/textstat_simil.R | `97.80% <100.00%> (+1.68%)` :arrow_up: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 6a89b78...8dcdc37.