Closed Luis-vllgh closed 1 year ago
Hi Luis,
Which version are you using?
Thanks, Adrienne
From: Luis Vollerigh @.> Date: Monday, December 5, 2022 at 09:08 To: adriennekline/psmpy @.> Cc: Subscribed @.***> Subject: [adriennekline/psmpy] Float division by zero in psm.logistic_ps (Issue #1)
Hi there!
I am currently using PsmPy on a rather large dataset and I got the following error in the psm.logistic_ps command:
This seems to occur during the calculation of the propensity_logit column. I don't think that any of the propensity scores are actually equal to 1, but maybe the propensity score from the package is rounded at some stage of the algorithm? If I choose a slightly smaller dataset it runs, and the highest propensity score ends up being "0.9999986309511635".
Do you have an idea how to fix this? Would help me out a lot! Thanks in advance
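For readers following along, here is a minimal sketch of how such a failure can arise. It assumes the logit is computed as log(p / (1 - p)), which is the standard definition but not confirmed here to be what PsmPy does internally. The names `sigmoid`, `naive_logit`, `clipped_logit` and the `eps` value are hypothetical and chosen only for illustration. In 64-bit floating point, a logistic output that is merely very close to 1 can round to exactly 1.0, so no explicit rounding step is needed to trigger the division by zero.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, as used to turn a linear score into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def naive_logit(p: float) -> float:
    """Standard logit; divides by zero when p is exactly 1.0."""
    return math.log(p / (1.0 - p))


def clipped_logit(p: float, eps: float = 1e-12) -> float:
    """Logit after clipping p into [eps, 1 - eps] (eps is an assumed value)."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


# exp(-40) is smaller than float64 can resolve next to 1.0,
# so the sigmoid saturates to exactly 1.0.
p = sigmoid(40.0)
print("score equals 1.0:", p == 1.0)

try:
    naive_logit(p)
except ZeroDivisionError as exc:
    print("naive logit failed:", exc)

# Clipping keeps the logit finite for saturated scores.
print("clipped logit:", clipped_logit(p))
```

Clipping is only one possible workaround. It caps the logit at a large finite value, which may or may not be acceptable depending on how the matching step uses it.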
Hi,
I am using PsmPy 0.3.6.
Best regards, Luis
I see the problem – I’ll have to make an update to the package. Will do so tonight and upload a new version and that should work just fine. Out of curiosity, how big is your dataset?
Adrienne
Thanks for your fast response! That is great, I am looking forward to the update.
Right now I get the error at 750,000 rows, but the dataset may grow to more than 20 million rows.
OK. As you can probably guess, propensity score matching relies on KNN (k-nearest neighbors). With more than 20 million rows, the second step (i.e., matching) will likely become computationally infeasible.
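To illustrate the scaling concern, here is a rough sketch of 1-nearest-neighbor matching on a one-dimensional score. It is not PsmPy's implementation. The function name `nearest_control` is hypothetical, matching is done with replacement, and it assumes a non-empty control group. Because the score is a single number, sorting the controls once and using binary search avoids comparing every treated row against every control row, though memory and runtime at tens of millions of rows would still need to be checked in practice.

```python
from bisect import bisect_left


def nearest_control(treated_scores, control_scores):
    """For each treated score, return the index of the closest control score.

    Matching is with replacement: one control may be matched several times.
    Assumes control_scores is non-empty.
    """
    # Sort control indices by score once: O(n log n).
    order = sorted(range(len(control_scores)), key=control_scores.__getitem__)
    sorted_scores = [control_scores[i] for i in order]

    matches = []
    for t in treated_scores:
        # Binary search for the insertion point: O(log n) per treated row.
        pos = bisect_left(sorted_scores, t)
        # The nearest neighbor is one of the two scores around that point.
        candidates = [j for j in (pos - 1, pos) if 0 <= j < len(sorted_scores)]
        best = min(candidates, key=lambda j: abs(sorted_scores[j] - t))
        matches.append(order[best])
    return matches


controls = [0.9, 0.1, 0.5]
treated = [0.45, 0.95, 0.0]
print(nearest_control(treated, controls))  # → [2, 0, 1]
```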
Thanks for the advice! I already have an idea for how to work around this problem. I'll see if it works.
Great! I can implement it in the package if you find it successful - so let me know :)
I'll let you know! :)
Great! :)
I've uploaded a new version: 3.8! Please let me know if this resolves the issue!
Thanks, Adrienne
Hi Adrienne, it works perfectly fine! Thanks for the update to the package :)