Ivy-ops closed this issue 2 months ago
Thanks for your report.
I'm not sure what's unclear exactly. What do you suggest should be clarified precisely, and can you maybe make some suggestions of better ways to explain that?
Hi @xrobin, thanks for the reply. Based on the tutorial:
“>”: if the predictor values for the control group are higher than the values of the case group (controls > t >= cases).
“<”: if the predictor values for the control group are lower or equal than the values of the case group (controls < t <= cases).
In my case, take the 1st sample (predicted probabilities: Control = 0.24642643, Case = 0.7535736). If I use threshold = 0.5 and direction ">", does direction mean that because 0.7535736 > 0.5, sample 1 will be predicted as "Case"? And if I use threshold = 0.5 and direction "<", what does direction mean then? Thank you for your patience!
I attempted to clarify the documentation. Here is the new description of direction:
How are positive observations defined?
“<”: observations are positive when they are greater than or equal to (>=) the threshold.
“>”: observations are positive when they are smaller than or equal to (<=) the threshold.
“auto” (default): automatically detect in which group the median is higher and take the direction accordingly. See details.
You should set this explicitly to “>” or “<” whenever you are resampling or randomizing the data, otherwise the curves will be biased towards higher AUC values.
Is it clearer like this?
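To make the rule in the description above concrete, here is a minimal sketch in plain Python. It is not pROC code: the function name `is_positive` is hypothetical, and it only mirrors the two comparisons quoted above, applied to the "Case" probability of sample 1 from this issue.

```python
def is_positive(value, threshold, direction):
    """Return True if `value` is called positive (a case) under the quoted rule.

    Hypothetical helper for illustration only; it is not part of pROC.
    """
    if direction == "<":
        # controls < threshold <= cases: high values are positive
        return value >= threshold
    if direction == ">":
        # controls > threshold >= cases: low values are positive
        return value <= threshold
    raise ValueError('direction must be "<" or ">"')


case_probability = 0.7535736  # sample 1, "Case" column

print(is_positive(case_probability, 0.5, "<"))  # True: 0.7535736 >= 0.5
print(is_positive(case_probability, 0.5, ">"))  # False: 0.7535736 <= 0.5 does not hold
```

Read this way, when the predictor is the probability of "Case", direction "<" is the setting under which sample 1 is called a case at threshold 0.5; with ">" the same sample would be called a control.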
Hi developer, I am trying to use the roc() function with my dataset. After reading the description of "direction", I still cannot understand what it means. It would be highly appreciated if you could help me with this. I use a random forest and get the probability for each sample (shown below); the second column is for the "Case" group.
My dataset, rf$prediction:
Control Case
[1,] 0.24642643 0.7535736
[2,] 0.33507026 0.6649297
[3,] 0.45731121 0.5426888
[4,] 0.46547831 0.5345217
[5,] 0.53042247 0.4695775
[6,] 0.31020475 0.6897952
[7,] 0.15786178 0.8421382
[8,] 0.15340136 0.8465986
[9,] 0.15774135 0.8422587
[10,] 0.18421489 0.8157851
[11,] 0.64663338 0.3533666
[12,] 0.40697185 0.5930282
[13,] 0.37198661 0.6280134
[14,] 0.57076432 0.4292357
[15,] 0.18086131 0.8191387
[16,] 0.58201416 0.4179858
[17,] 0.19227444 0.8077256
[18,] 0.46165459 0.5383454
[19,] 0.19301864 0.8069814
[20,] 0.66767106 0.3323289
[21,] 0.80801017 0.1919898
[22,] 0.66952125 0.3304788
[23,] 0.62995097 0.3700490
[24,] 0.50042121 0.4995788
[25,] 0.77477208 0.2252279
[26,] 0.60949394 0.3905061
[27,] 0.82625698 0.1737430
[28,] 0.65935287 0.3406471
[29,] 0.07350427 0.9264957
[30,] 0.72550278 0.2744972
[31,] 0.72104726 0.2789527
[32,] 0.65799964 0.3420004
[33,] 0.70231445 0.2976856
[34,] 0.32174162 0.6782584
[35,] 0.86845567 0.1315443
[36,] 0.50935250 0.4906475
[37,] 0.44772867 0.5522713
[38,] 0.78675787 0.2132421
Then I use the roc() function:
As we can see in the above code, I can get 2 different AUCs. I referred to the tutorial for roc() and to https://stackoverflow.com/questions/31756682/what-does-coercing-the-direction-argument-input-in-roc-function-package-proc, which mentions that direction determines whether the probability is compared to the threshold with < or >.
Take the 1st sample: if I use threshold = 0.5 and direction ">", does direction mean that because 0.7535736 > 0.5, sample 1 will be predicted as "Case"? And if I use threshold = 0.5 and direction "<", what does direction mean? I am quite confused. When should I use ">" and when should I use "<"? Looking forward to your help! Much appreciated!
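On the two different AUCs mentioned above: one way to see why flipping direction changes the result is the pairwise (Mann-Whitney) view of the empirical AUC. The sketch below is plain Python, not pROC; the function `auc` is hypothetical, and the scores are made-up toy values rather than the data from this issue, since the true labels are not shown here.

```python
def auc(controls, cases, direction):
    """Pairwise AUC: share of (control, case) pairs ranked as `direction` expects.

    Hypothetical helper for illustration; ties count as half a pair.
    """
    wins = 0.0
    for control in controls:
        for case in cases:
            if control == case:
                wins += 0.5
            elif direction == "<" and control < case:
                wins += 1.0
            elif direction == ">" and control > case:
                wins += 1.0
    return wins / (len(controls) * len(cases))


# Toy scores (assumed, not taken from the issue): probability of "Case"
controls = [0.19, 0.33, 0.47, 0.54]
cases = [0.35, 0.59, 0.75, 0.84]

print(auc(controls, cases, "<"))  # 0.875: cases mostly score higher than controls
print(auc(controls, cases, ">"))  # 0.125: the same data read in the opposite direction
```

In this formulation the two values always sum to 1, so an AUC well below 0.5 usually signals that the direction does not match how the predictor relates to the case group. With the probability of "Case" as the predictor, "<" would be the matching choice under the description quoted earlier in this thread.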