nsankar closed this issue 3 years ago
Hi @nsankar ! You’re absolutely correct, there should never be negative odds ratios. Hmmm. Would you be willing to send me a sample of your data so I can do some debugging?
@ronikobrosly how can I reach you by email?
@nsankar My email is roni.kobrosly@gmail.com
Hi @nsankar, I think I might understand the issue. Did you use gps.estimate_log_odds to generate image #1? If so, what you generated was an array of log-odds, which can range from -∞ to ∞. So the negative values you observed are possible and perfectly fine.
If you're looking to generate odds ratios using the lowest treatment value as a reference (the preferred way to use this GPS_Classifier), you should use the calculate_CDRC method.
So your workflow would look something like this:
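from causal_curve import GPS_Classifier
gps = GPS_Classifier()
gps.fit(T=df['t'], X=df[['x']], y=df['y'])  # df is your pandas DataFrame; X is passed as a DataFrame of covariates
gps_results = gps.calculate_CDRC(0.95)  # 0.95 is the confidence level for the intervals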
Where gps_results will contain a column of the odds ratios. As mentioned here, the odds ratios generated with this function give you a sense of the relative odds of a treatment value causing the higher outcome class to occur, relative to the lowest treatment value. So if you want to see the causal effect of a treatment value of 20.0 and the lowest treatment value happens to be 10.0, the odds ratio at 20.0 will represent:
odds of higher outcome class occurring at treatment = 20.0 / odds of higher outcome class occurring at treatment = 10.0
If the odds ratio here is 1.0, that tells you a treatment value of 20.0 does nothing different from a treatment value of 10.0. If the odds ratio is 5, then the odds of the higher outcome class at a treatment value of 20.0 are 5 times the odds at 10.0. So it provides relative treatment effects, relative to the lowest treatment value. These odds ratios are always bounded between 0 and ∞. They will never be negative.
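To make the ratio above concrete, here is a minimal sketch in plain Python. The probabilities 0.10 and 0.25 are made-up numbers for illustration only, and odds and odds_ratio are hypothetical helpers, not part of causal-curve.

```python
def odds(p):
    """Convert a probability (0 < p < 1) into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_treatment, p_reference):
    """Odds at a treatment value divided by odds at the reference value."""
    return odds(p_treatment) / odds(p_reference)


# Hypothetical probabilities of the higher outcome class (not real results)
p_at_10 = 0.10  # at the lowest (reference) treatment value, 10.0
p_at_20 = 0.25  # at treatment value 20.0

ratio = odds_ratio(p_at_20, p_at_10)
print(round(ratio, 2))  # about 3: the odds at 20.0 are roughly triple those at 10.0
```

Because odds are a ratio of two positive quantities, the odds ratio is positive for any valid pair of probabilities, and it equals 1.0 when the two probabilities match.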
Now, gps.estimate_log_odds produces something different. It is not relative to any treatment value. It simply gives you the log-odds of the higher outcome class occurring at a provided treatment value. Again, these values can range from -∞ to ∞ and are more difficult to interpret.
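If it helps with interpretation, log-odds can be mapped back to odds and probabilities with standard formulas. The sketch below uses only the Python standard library. The log-odds values are invented for illustration, and the helper names are hypothetical rather than part of causal-curve. Exponentiating the difference between two log-odds gives an odds ratio. That is the general relationship between the two quantities, not necessarily how calculate_CDRC computes its output internally.

```python
import math


def log_odds_to_odds(log_odds):
    """Odds are the exponential of the log-odds, so they are always positive."""
    return math.exp(log_odds)


def log_odds_to_probability(log_odds):
    """Logistic (sigmoid) function: maps any real log-odds into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical log-odds at three treatment values (illustrative only)
log_odds_by_treatment = {10.0: -2.0, 15.0: -0.5, 20.0: 1.0}

# Use the lowest treatment value as the reference point
reference = min(log_odds_by_treatment)

for t, lo in sorted(log_odds_by_treatment.items()):
    # exp(difference of log-odds) is the odds ratio versus the reference
    odds_ratio_vs_ref = math.exp(lo - log_odds_by_treatment[reference])
    print(t, round(log_odds_to_probability(lo), 3), round(odds_ratio_vs_ref, 2))
```

A negative log-odds value only means the probability of the higher outcome class is below 0.5 at that treatment value. It does not indicate an error.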
Does this help? Or did I miss the point?
@ronikobrosly Noted. Yes, I had used the gps.estimate_log_odds method to predict and to plot the image. I get your point. I will go through the gps.calculate_CDRC function and try it. This really helps. Thanks for the insights.
Great! Feel free to close the issue if that’s it, or let me know if you have any other questions.
Hi, hope you are doing great. I used the GPS classifier on sensor anomaly data where X holds the covariates (continuous numeric variables), T is a continuous treatment variable (the specific sensor data that probably caused the anomaly), and Y (the outcome) is the anomaly label (binary: 1 = anomaly, 0 = normal). When I used the gps.estimate_log_odds prediction API, I got negative values (odds ratios) for some of the treatment values. Below is one example (image #1).
I believe negative values are incorrect? Am I missing something?
Also, how should I interpret odds-ratio values that have an arbitrary min/max range across the range of treatment values predicted using gps.estimate_log_odds? (Please see image #2 below as an example.)
Thank you in advance.