cmrivers / epipy

Python tools for epidemiology

Cochran-Mantel-Haenszel #19

Open opioiddatalab opened 4 years ago

opioiddatalab commented 4 years ago

Thanks for this package, Caitlin. Do you have a Cochran-Mantel-Haenszel function in there? Seems like it could be useful next to the 2x2 tables. Thanks in advance!

elofgren commented 4 years ago

I'll see about adding this. It'll take a little thinking, because the function needs to be able to take an arbitrary number of strata to be fully functional, but it should be doable.

@cmrivers Feel free to assign this to me if you feel like it.

elofgren commented 4 years ago

Thinking more about this, as I'm procrastinating from doing actual work.

What's the preferred way to take in an arbitrary number of 2x2 tables, each of which is a mini-pandas data frame?

Ignoring my instinct to just make everything a numpy array, it feels like the easiest way, given there's no ordering to the strata, would be a dictionary of 2x2 tables?

cmrivers commented 4 years ago

Agree. You could do a pd dataframe with fancy indexing but that would be hideous.


elofgren commented 4 years ago

Dictionaries it is. I'll start tinkering.
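
For anyone following along, here is a minimal sketch of what a dictionary-based interface could look like. It is not the epipy implementation: the function name `cmh`, the `correction` flag, and the cell layout `((a, b), (c, d))` are all assumptions made for illustration. Each table is a plain nested sequence rather than a DataFrame (a 2x2 DataFrame could be converted first, e.g. with `.to_numpy()`). The sketch computes the pooled Mantel-Haenszel odds ratio and the 1-df Cochran-Mantel-Haenszel chi-square statistic using only the standard library.

```python
import math


def cmh(tables, correction=True):
    """Pooled Mantel-Haenszel odds ratio and CMH chi-square test (1 df).

    `tables` maps a stratum label to ((a, b), (c, d)), assumed to mean:
    a = exposed cases,   b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    num = den = 0.0                      # MH odds-ratio numerator / denominator
    observed = expected = variance = 0.0
    for (a, b), (c, d) in tables.values():
        n = a + b + c + d
        if n < 2:
            continue                     # variance term is undefined for n < 2
        num += a * d / n
        den += b * c / n
        observed += a
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))

    if variance == 0:
        raise ValueError("no stratum contributes information to the test")

    if den:
        odds_ratio = num / den
    else:
        odds_ratio = math.inf if num > 0 else math.nan

    diff = abs(observed - expected)
    if correction:
        diff = max(diff - 0.5, 0.0)      # continuity correction
    chi2 = diff ** 2 / variance
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Strata are unordered, so a dict keyed by stratum label is enough.
example = {
    "young": ((10, 20), (5, 25)),
    "old": ((20, 10), (10, 20)),
}
odds_ratio, chi2, p_value = cmh(example, correction=False)
```

For the two example strata the pooled odds ratio works out to about 3.25. Because each stratum contributes independent sums, the function handles any number of strata without caring about their order.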

opioiddatalab commented 4 years ago

Thanks to both of you!

One feature that would be great to have would be to control how zero cells are handled to avoid division by zero. Biostats folks I work with sometimes add 0.1 or 0.25 to ALL 4 cells (within a stratum) to avoid division by zero. I guess in a dictionary that’s easier to control? (Although my data are in a pandas df, I think a quick collapse could get the inputs for a dictionary command.)
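
A rough sketch of both ideas, collapsing row-level data into a dictionary of 2x2 tables and adding a constant to all four cells of a stratum, might look like the following. The helper names `tables_from_records` and `add_to_cells`, the `(stratum, exposed, case)` record layout, and the default constant of 0.25 are hypothetical choices for illustration, not part of epipy. Rows from a DataFrame could be fed in with `itertuples(index=False, name=None)`, assuming its columns are in that order.

```python
from collections import defaultdict


def tables_from_records(records):
    """Collapse (stratum, exposed, case) records into {stratum: [[a, b], [c, d]]}.

    Row 0 = exposed, row 1 = unexposed; column 0 = case, column 1 = non-case.
    """
    tables = defaultdict(lambda: [[0, 0], [0, 0]])
    for stratum, exposed, case in records:
        row = 0 if exposed else 1
        col = 0 if case else 1
        tables[stratum][row][col] += 1
    return dict(tables)


def add_to_cells(tables, constant=0.25, only_if_zero=True):
    """Return a copy of `tables` with `constant` added to all four cells.

    With only_if_zero=True, only strata containing a zero cell are adjusted;
    with only_if_zero=False, every stratum is adjusted.
    """
    adjusted = {}
    for stratum, table in tables.items():
        has_zero = any(cell == 0 for row in table for cell in row)
        k = constant if (has_zero or not only_if_zero) else 0
        adjusted[stratum] = [[cell + k for cell in row] for row in table]
    return adjusted


# Hypothetical row-level data: (stratum, exposed?, case?)
records = [
    ("a", True, True), ("a", True, False), ("a", False, False),
    ("b", True, True), ("b", True, False), ("b", False, True), ("b", False, False),
]
tables = tables_from_records(records)   # stratum "a" has a zero cell
adjusted = add_to_cells(tables, constant=0.25)
```

Keeping the adjustment as a separate step on the dictionary leaves the choice of constant, and whether to apply it everywhere or only to strata with a zero cell, in the caller's hands.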

I think there is a Breslow-Day solution for handling ties that is widely accepted?

Thanks again!

