snorkel-team / snorkel

A system for quickly generating training data with weak supervision
https://snorkel.org
Apache License 2.0
5.81k stars, 857 forks

Issue #1602 - Add get_label_instances to Analysis #1608

Status: Closed (DavidKoleczek closed 4 years ago)

DavidKoleczek commented 4 years ago

Description of proposed changes

Implements the changes discussed with @bhancock8 in #1602. Adds a method, get_label_instances, to analysis/error_analysis that wraps the get_label_buckets functionality. Given a bucket, a NumPy array x of your data, and the corresponding label array(s) y, it returns only the instances of x that belong to that bucket.
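To illustrate the intended behavior, here is a minimal NumPy-only sketch. It is not the code in this PR: the names label_buckets_sketch and label_instances_sketch are hypothetical stand-ins for get_label_buckets and get_label_instances, and the input checks and the empty-array result for a missing bucket are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

import numpy as np


def label_buckets_sketch(*y: np.ndarray) -> Dict[Tuple[int, ...], np.ndarray]:
    """Group data point indices by their tuple of labels.

    Hypothetical stand-in for get_label_buckets, written for this example.
    """
    buckets: Dict[Tuple[int, ...], List[int]] = defaultdict(list)
    for i, labels in enumerate(zip(*y)):
        buckets[tuple(int(v) for v in labels)].append(i)
    return {key: np.array(idx) for key, idx in buckets.items()}


def label_instances_sketch(
    bucket: Tuple[int, ...], x: np.ndarray, *y: np.ndarray
) -> np.ndarray:
    """Return the instances of x whose labels across y match the bucket.

    Hypothetical stand-in for get_label_instances; the validation and the
    empty result for a missing bucket are assumptions of this sketch.
    """
    if len(y) != len(bucket):
        raise ValueError("Need exactly one label array per entry in the bucket")
    if any(len(labels) != x.shape[0] for labels in y):
        raise ValueError("Each label array must have one label per instance in x")
    indices = label_buckets_sketch(*y).get(tuple(bucket))
    if indices is None:
        # Assumption: a bucket with no members yields an empty slice of x.
        return x[:0]
    return x[indices]


# Toy data: four instances with gold labels and predicted labels.
x = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])
y_gold = np.array([1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1])

# Instances where gold and prediction are both 1.
correct_pos = label_instances_sketch((1, 1), x, y_gold, y_pred)
# Instances where gold is 1 but the prediction is 0.
false_neg = label_instances_sketch((1, 0), x, y_gold, y_pred)
```

The order of the label arrays determines the order of the entries in the bucket tuple, so (1, 0) here means "gold 1, predicted 0".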

Let me know if there are any issues, questions, or suggestions. Thanks!

Related issue(s)

Issue #1602

codecov[bot] commented 4 years ago

Codecov Report

Merging #1608 into master will increase coverage by 0.01%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1608      +/-   ##
==========================================
+ Coverage   97.19%   97.21%   +0.01%     
==========================================
  Files          68       68              
  Lines        2137     2151      +14     
  Branches      343      345       +2     
==========================================
+ Hits         2077     2091      +14     
  Misses         31       31              
  Partials       29       29              

Impacted Files                        Coverage Δ
snorkel/analysis/__init__.py          100.00% <100.00%> (ø)
snorkel/analysis/error_analysis.py    100.00% <100.00%> (ø)

DavidKoleczek commented 4 years ago

Thanks for the great comments, @bhancock8! I made the requested changes. Let me know if there is anything else.