TutteInstitute / vectorizers

Vectorizers for a range of different data types
BSD 3-Clause "New" or "Revised" License

The big cooccurrence refactor #99

Closed cjweir closed 2 years ago

cjweir commented 2 years ago

The big refactoring is basically complete. The only drawback is that, if you are using multiple window sizes, your kernel functions all need to be the same. This comes from a numba limitation around lists of functions. It will take some work to add this back in, I think, so it can be another PR.
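
To make the constraint concrete, here is a minimal plain-Python sketch (no numba) of what "one kernel shared by every window size" looks like. The names `harmonic_kernel`, `window_weights`, and `radii` are hypothetical and chosen for illustration only; they are not taken from the vectorizers API.

```python
# Illustrative sketch only: hypothetical names, not the vectorizers API.

def harmonic_kernel(radius):
    """Return one weight per offset 1..radius, decaying as 1/distance."""
    return [1.0 / distance for distance in range(1, radius + 1)]


def window_weights(kernel, radii):
    """Apply ONE shared kernel function to every window radius.

    Accepting a single `kernel` rather than a list of kernels mirrors
    the restriction described above: the window sizes may differ, but
    the kernel function is the same for all of them.
    """
    return [kernel(radius) for radius in radii]


# Three window sizes, all weighted by the same kernel.
weights = window_weights(harmonic_kernel, radii=[1, 2, 4])
```

Supporting a different kernel per window size would mean passing a list of functions instead of a single one, which is the part deferred to a later PR.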

codecov-commenter commented 2 years ago

Codecov Report

Merging #99 (c2f43a7) into master (3bb7902) will decrease coverage by 0.05%. The diff coverage is 89.45%.

@@            Coverage Diff             @@
##           master      #99      +/-   ##
==========================================
- Coverage   90.55%   90.50%   -0.06%     
==========================================
  Files          28       32       +4     
  Lines        4491     4644     +153     
==========================================
+ Hits         4067     4203     +136     
- Misses        424      441      +17     
| Impacted Files | Coverage Δ | |
|---|---|---|
| vectorizers/edge_list_vectorizer.py | 77.27% <ø> (-0.34%) | :arrow_down: |
| vectorizers/skip_gram_vectorizer.py | 90.99% <ø> (-0.09%) | :arrow_down: |
| vectorizers/preprocessing.py | 80.28% <58.18%> (-9.26%) | :arrow_down: |
| vectorizers/_window_kernels.py | 75.74% <69.56%> (-8.20%) | :arrow_down: |
| vectorizers/utils.py | 77.10% <71.42%> (+11.11%) | :arrow_up: |
| vectorizers/ngram_vectorizer.py | 90.90% <84.84%> (+4.35%) | :arrow_up: |
| vectorizers/base_cooccurrence_vectorizer.py | 88.19% <88.19%> (ø) | |
| vectorizers/ngram_token_cooccurence_vectorizer.py | 92.78% <92.78%> (ø) | |
| vectorizers/coo_utils.py | 93.22% <93.33%> (+1.00%) | :arrow_up: |
| vectorizers/__init__.py | 100.00% <100.00%> (ø) | |

... and 10 more
