It's going to be useful to have items in detector output lists align with detector inputs; then we have a better audit trail.
There are some tensions in this change:
1. The detector test for use of `all_outputs` (introduced in #644) prefers detector output to be of similar length to the detector "input" (`attempt.all_outputs`)
2. The test for detector output wants a list of floats
3. Detectors can't always give a hit/miss result; sometimes a test can't be performed, e.g. if a file is missing, and any negative/positive given may be false. So it's helpful to convey "no result"
4. We want detector input & output to align
5. It's not clear how hitlogging gets the right item
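To make the tension between the length check and the list-of-floats check concrete, here is a minimal sketch of both as a single validation helper. The function name `check_detector_output` and the `allow_none` flag are hypothetical and not part of the existing test suite; the sketch only assumes that a detector receives a list of outputs and returns a list of results.

```python
from typing import List, Optional, Sequence


def check_detector_output(
    inputs: Sequence[str],
    results: Sequence[Optional[float]],
    allow_none: bool = False,
) -> List[str]:
    """Return a list of problems found; an empty list means the output is acceptable.

    Hypothetical helper: `allow_none` switches between the strict
    "list of floats" rule and a rule that also accepts None entries.
    """
    problems: List[str] = []

    # Alignment: one result per input, so each result can be traced to its input.
    if len(results) != len(inputs):
        problems.append(
            f"length mismatch: {len(inputs)} inputs, {len(results)} results"
        )

    # Type check: each entry is a float, or None when that is permitted.
    for i, result in enumerate(results):
        if result is None:
            if not allow_none:
                problems.append(f"item {i}: None is not permitted")
        elif not isinstance(result, float):
            problems.append(f"item {i}: expected float, got {type(result).__name__}")

    return problems
```

With `allow_none=False`, a detector that cannot run one of its tests has no valid way to say so while keeping the list aligned with its input.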
Proposal:
- Relax constraint 2 above and permit detector output lists to bear `NoneType`
- Tighten constraint 1 above so that list lengths exactly match
- Update detectors to return `None` if applicable (e.g. an ONNX scanner might yield `1.0` if suspicious content is found, `0.0` if no suspicious content is found, and `None` if the file isn't an ONNX file)
- Test for constraint 4
- Check item 5
- Test that evaluators do accept `None` - this appears to be already running. The desired result is to report a score based on the subset of the detector return list that was a valid test
- Log the number of invalid/absent tests, signified by the detector returning `None`, in eval objects in the report JSONL
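As a rough illustration of the proposal, the sketch below shows a detector-style function that returns `None` when a test cannot be performed, and a summary that scores only the valid subset while counting the `None` entries. Everything here is an assumption for illustration: `scan_file` and `summarise` are hypothetical names, the file-extension and byte-string checks stand in for real ONNX inspection, the mean stands in for whatever score an evaluator actually computes, and the dictionary keys are not the real report JSONL fields.

```python
from typing import Dict, List, Optional, Tuple


def scan_file(path: str, content: bytes) -> Optional[float]:
    """Hypothetical scanner: 1.0 = hit, 0.0 = miss, None = test not applicable."""
    if not path.endswith(".onnx"):
        # Not an ONNX file, so neither a hit nor a miss would be truthful.
        return None
    # Placeholder check; a real scanner would inspect the model itself.
    return 1.0 if b"suspicious" in content else 0.0


def summarise(results: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Score the valid subset and record how many entries were None."""
    valid = [r for r in results if r is not None]
    return {
        "score": sum(valid) / len(valid) if valid else None,  # assumed: mean of valid results
        "total": len(results),
        "evaluated": len(valid),
        "none_count": len(results) - len(valid),
    }


files: List[Tuple[str, bytes]] = [
    ("a.onnx", b"suspicious payload"),
    ("b.onnx", b"clean"),
    ("c.txt", b"suspicious"),
]

# One result per input, in the same order, so results[i] describes files[i].
results = [scan_file(path, content) for path, content in files]
summary = summarise(results)
```

Because the result list keeps a slot for every input, including the ones that yield `None`, an index into the results still identifies the input it came from.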