cytomining / cytominer-eval

[Deprecated] Common Evaluation Metrics for DataFrames
BSD 3-Clause "New" or "Revised" License

Add Hit@k functionality #61

Closed michaelbornholdt closed 2 years ago

michaelbornholdt commented 3 years ago

I am adding hitk as a function of eval. For now it will only calculate the hit list; any other ways of displaying the data will be handled in a separate PR.

codecov-commenter commented 3 years ago

Codecov Report

Merging #61 (55bbf3e) into master (779cea1) will decrease coverage by 0.08%. The diff coverage is 98.18%.

@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
- Coverage   98.95%   98.87%   -0.09%     
==========================================
  Files          30       33       +3     
  Lines         956     1065     +109     
==========================================
+ Hits          946     1053     +107     
- Misses         10       12       +2     
Flag Coverage Δ
unittests 98.87% <98.18%> (-0.09%) :arrow_down:

Impacted Files Coverage Δ
...r_eval/tests/test_utils/test_availability_utils.py 100.00% <ø> (ø)
cytominer_eval/utils/availability_utils.py 100.00% <ø> (ø)
cytominer_eval/operations/hitk.py 95.23% <95.23%> (ø)
cytominer_eval/utils/hitk_utils.py 95.45% <95.45%> (ø)
cytominer_eval/evaluate.py 100.00% <100.00%> (ø)
cytominer_eval/operations/__init__.py 100.00% <100.00%> (ø)
cytominer_eval/tests/test_evaluate.py 100.00% <100.00%> (ø)
cytominer_eval/tests/test_operations/test_hitk.py 100.00% <100.00%> (ø)
cytominer_eval/utils/transform_utils.py 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 779cea1...55bbf3e.

michaelbornholdt commented 2 years ago

@gwaygenomics @niranjchandrasekara

This PR should be done now. I added the hitk functionality.

It outputs the index list, which can be used for plotting. It also outputs the accumulated scores and certain percentages. I have not figured out how to explain this well :D
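
To make the description above more concrete, here is a minimal sketch of the Hit@k idea. It is not the code from this PR: the function names `hit_ranks` and `hits_at_k`, the toy similarity values, and the group labels are all illustrative assumptions. For each sample, the other samples are ranked by similarity, and the rank positions taken by samples from the same group form the index list. Counting the hits with rank at most k gives an accumulated score. The percentage-based scores mentioned above are not reproduced here.

```python
def hit_ranks(similarity, groups):
    """Return the sorted 1-based ranks at which same-group neighbors appear.

    Hypothetical helper for illustration only, not the cytominer-eval API.
    `similarity` maps each sample to a dict of similarities to all samples;
    `groups` maps each sample to its group label.
    """
    ranks = []
    for sample, row in similarity.items():
        # Rank every other sample from most to least similar.
        neighbors = sorted(
            (name for name in row if name != sample),
            key=lambda name: row[name],
            reverse=True,
        )
        # Record the positions where a neighbor shares the sample's group.
        for rank, name in enumerate(neighbors, start=1):
            if groups[name] == groups[sample]:
                ranks.append(rank)
    return sorted(ranks)


def hits_at_k(ranks, k):
    """Count the hits that fall within the top-k neighbors."""
    return sum(rank <= k for rank in ranks)


# Toy data (assumed values): samples a and b share group x, c and d share y.
similarity = {
    "a": {"a": 1.0, "b": 0.9, "c": 0.2, "d": 0.1},
    "b": {"a": 0.9, "b": 1.0, "c": 0.85, "d": 0.2},
    "c": {"a": 0.2, "b": 0.85, "c": 1.0, "d": 0.8},
    "d": {"a": 0.1, "b": 0.2, "c": 0.8, "d": 1.0},
}
groups = {"a": "x", "b": "x", "c": "y", "d": "y"}

ranks = hit_ranks(similarity, groups)
print(ranks)                # [1, 1, 1, 2]
print(hits_at_k(ranks, 1))  # 3
print(hits_at_k(ranks, 2))  # 4
```

In this toy example, sample c finds its group partner d only at rank 2 because b is slightly more similar, so three of the four hits land within k = 1 and all four within k = 2.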

michaelbornholdt commented 2 years ago

@gwaygenomics
I'm signing off until Wednesday, so take your time :)

There are still some things to work on, especially the tests and allowing for certain NaN data. However, I would like to push this before the end of next week since I need it for my project.

Apart from that, I hope the new documentation makes things clearer and also at least somewhat modular.

michaelbornholdt commented 2 years ago

Now it's as modular as it can be.

michaelbornholdt commented 2 years ago

@gwaygenomics
I don't know if it's clear what I changed, but this is as far as I will go. I will open an issue to track the remaining to-dos for next month, after the thesis.

gwaybio commented 2 years ago

Thanks @michaelbornholdt - I have already approved the PR :) so I am ready to merge

Just looks like you'll need to fix the breaking test:

cytominer_eval/operations/hitk.py:7: in <module>
    from cytominer_eval.utils.hitk_utils import add_hit_rank, percentage_scores
E     File "/home/runner/work/cytominer-eval/cytominer-eval/cytominer_eval/utils/hitk_utils.py", line 73
E       f"The percent score at 100% is {d[p]}, it should be 0 tho. Check your groupby_columns"
E                                                                                            ^
E   SyntaxError: invalid syntax
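
A caret at the end of an f-string is consistent with an interpreter that does not recognise the `f` prefix, so one possible cause is that the test job ran on a Python version older than 3.6. This is an assumption: the log shows neither the Python version nor the surrounding lines of `hitk_utils.py`. A version-independent way to build the same message is `str.format`. The names `d` and `p` come from the traceback; the wrapper function `percent_score_message` and the sample values are hypothetical.

```python
def percent_score_message(d, p):
    """Build the message from the traceback without an f-string.

    Hypothetical wrapper for illustration; only `d` and `p` appear in the log.
    """
    return (
        "The percent score at 100% is {}, it should be 0 tho. "
        "Check your groupby_columns"
    ).format(d[p])


# Sample values are made up for this example.
message = percent_score_message({100: 0.5}, 100)
print(message)
```

If the interpreter version turns out not to be the cause, the lines just before line 73 (for example an unclosed parenthesis or a missing comma) would be the next place to look.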

gwaybio commented 2 years ago

merged!