TutteInstitute / vectorizers

Vectorizers for a range of different data types
BSD 3-Clause "New" or "Revised" License

added document_context to TokenCooccurrenceVectorizer() #61

Status: Closed (jc-healy closed this 3 years ago)

jc-healy commented 3 years ago

This allows contexts to range over documents, which is quite useful when dealing with a sequence of multilabelled tokens: we treat each multilabelled token as a document and build our contexts across those documents.
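To make the idea concrete, here is a minimal pure-Python sketch of document-level contexts. It is not the code from this PR and does not use the vectorizers API: the function name `document_context_cooccurrences`, the `window_radius` parameter, and the choice not to count pairs inside the same document are all assumptions made for illustration.

```python
from collections import Counter


def document_context_cooccurrences(sequence, window_radius=1):
    """Count (token, context_token) pairs across neighbouring documents.

    Hypothetical helper, not part of the vectorizers library.
    `sequence` is a list of documents, where each document is the list of
    labels attached to one multilabelled token. The context of every label
    in a document is every label in the documents within `window_radius`
    positions on either side (assumption: labels within the same document
    are not counted as each other's context).
    """
    counts = Counter()
    for i, doc in enumerate(sequence):
        lo = max(0, i - window_radius)
        hi = min(len(sequence), i + window_radius + 1)
        for j in range(lo, hi):
            if j == i:
                continue  # skip the document itself
            for token in doc:
                for context_token in sequence[j]:
                    counts[(token, context_token)] += 1
    return counts


# A sequence of three multilabelled tokens, each treated as a document.
sequence = [["red", "round"], ["green"], ["red", "square"]]
counts = document_context_cooccurrences(sequence, window_radius=1)
```

With a radius of one document, "green" sees both labels of each neighbour as context, while "red" and "round" never co-occur here because they sit in the same document.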

codecov-commenter commented 3 years ago

Codecov Report

Merging #61 (54d0558) into master (3ee4f14) will decrease coverage by 1.02%. The diff coverage is 24.70%.


@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
- Coverage   65.29%   64.27%   -1.03%     
==========================================
  Files          19       19              
  Lines        2899     2981      +82     
==========================================
+ Hits         1893     1916      +23     
- Misses       1006     1065      +59     
Impacted Files                                  Coverage Δ
vectorizers/utils.py                            49.68% <ø> (ø)
vectorizers/token_cooccurrence_vectorizer.py    56.31% <4.47%> (-7.62%)
vectorizers/_window_kernels.py                  26.08% <100.00%> (+1.08%)
vectorizers/tests/test_common.py                99.78% <100.00%> (+<0.01%)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 3ee4f14...54d0558.