Closed: samihamdan closed this issue 8 months ago
Here's where the Extended Scorer transforms the y (always). This is where the scorers are "wrapped" only if the `extend` parameter is true. This is where `check_scoring` passes the `wrap_score` parameter as the `extend` parameter to `_extend_scorer`:
https://github.com/juaml/julearn/blob/dba30719ec47527400682bd3bbb833207b119042/julearn/scoring/available_scorers.py#L127-L160
This is where `check_scoring` is called in `run_cross_validation`.

Here are the two lines that set `wrap_score` to `True`, based on the presence of a target transformer:
https://github.com/juaml/julearn/blob/dba30719ec47527400682bd3bbb833207b119042/julearn/api.py#L251
https://github.com/juaml/julearn/blob/dba30719ec47527400682bd3bbb833207b119042/julearn/api.py#L321
So we always use the extended scorer, even if the y transformer is reversible. And in this specific case, scikit-learn transforms `y_pred` back to the original space while julearn transforms `y_true` to the transformed space, comparing bananas with potatoes.
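The effect of that mismatch can be shown with a minimal sketch (plain NumPy/scikit-learn, not julearn code): scoring predictions that live in the original space against a z-scored ground truth wrecks any scale-sensitive metric.

```python
import numpy as np
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
y_true = rng.normal(loc=100.0, scale=15.0, size=200)
# good predictions, in the original (untransformed) space
y_pred = y_true + rng.normal(scale=5.0, size=200)

# z-scored ground truth, i.e. y_true in the transformed space
y_true_z = (y_true - y_true.mean()) / y_true.std()

r2_ok = r2_score(y_true, y_pred)    # both in the original space
r2_bad = r2_score(y_true_z, y_pred)  # mismatched spaces: bananas vs. potatoes
print(r2_ok, r2_bad)
```

`r2_ok` is close to 1, while `r2_bad` is hugely negative, because the residuals are now dominated by the offset between the two spaces rather than by prediction error.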
I also want to report something I observed earlier: although I got wrongly scaled metrics when z-scoring the target, the Pearson correlation values are the same whether or not the target is z-scored. Is that expected? When I computed the metrics myself, they were always different depending on whether I z-scored the target. Example: https://chat.openai.com/share/f625997a-eb50-40af-9cbb-89d450cdb364
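The identical Pearson correlations are actually expected: Pearson r is invariant under any positive affine transform of either variable (r(a·y + b, ŷ) = r(y, ŷ) for a > 0), and z-scoring is exactly such a transform. Only scale-sensitive metrics like R² or MAE are affected by the space mismatch. A quick check:

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.normal(size=100)
y_hat = 0.8 * y + rng.normal(scale=0.3, size=100)

# Pearson r of the original vs. the z-scored target, against the same predictions
r_original = np.corrcoef(y, y_hat)[0, 1]
r_zscored = np.corrcoef((y - y.mean()) / y.std(), y_hat)[0, 1]
print(r_original, r_zscored)  # identical up to floating-point error
```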
Is there an existing issue for this?
Current Behavior
Using z-scoring leads to wrong scores: we apparently evaluate the correctly inverse-transformed predictions against a scaled ground truth. You can see this because `r2_corr` seems fine but `r2` shows a high error, as it is scale-sensitive. See the following image.
Expected Behavior
Scoring with invertible target transformers should score against the original ground truth.
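For reference, scikit-learn's own `TransformedTargetRegressor` already behaves this way: it fits on the transformed target but inverse-transforms predictions back to the original space, so scoring against the original `y` is consistent. A minimal sketch (this uses plain scikit-learn, not julearn's pipeline):

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 50.0 + X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.5, size=200)

model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    transformer=StandardScaler(),  # an invertible target transform (z-scoring)
)
model.fit(X, y)
y_pred = model.predict(X)  # already inverse-transformed to the original scale
print(r2_score(y, y_pred))  # scoring in the original space is well-defined
```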
Steps To Reproduce
Environment
Relevant log output
No response
Anything else?
No response