theislab / ehrapy

Electronic Health Record Analysis with Python.
https://ehrapy.readthedocs.io/
Apache License 2.0

Extended Table 1 #747

Open mhaist94 opened 3 months ago

mhaist94 commented 3 months ago

Description of feature

Me again...

It's fantastic that you already have the cohort tracker and table one features in ehrapy (those are super useful, particularly for standardizing and objectifying clinical cohort selection and for visually confirming filtering steps!).

As ehrapy could be used, e.g., for biomarker discovery, one thing that would be required is to show that the biomarker is an independent determinant of the investigated outcome.

Besides the statistics that you already implemented in ehrapy, one confirmatory representation that is often required in scientific publications when you compare two or three groups stratified by a given parameter is a table similar to table one (see the example in Table 1 here: https://jitc.bmj.com/content/11/9/e007630#DC1). Here you list the main clinical and pathological baseline characteristics of the cohort (such as age, gender, lab values, treatments, etc.), similar to your table one, but additionally compare them between the groups, with the corresponding test statistics reported in a separate column. That's likely one of the most annoying parts of clinical publications, especially if you do not rely on a scripting language - putting this into a Python package would save many clinical researchers a massive amount of time.

The part that is a little tricky, though, is that the variables in this two-group comparison table are often differently scaled (nominal, ordinal, continuous), which requires looping through the variables and applying the appropriate test to each (e.g. Fisher's exact / chi-square for nominally scaled and t-test or Mann-Whitney U for continuously scaled variables). This is definitely not essential but a nice-to-have tool, which would likely make clinicians stick to ehrapy, as it would incorporate all the essentials of dissecting big clinical datasets.
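
To make the idea concrete, here is a rough sketch of the per-variable loop using plain pandas and SciPy. Everything in it is an assumption for illustration: `compare_groups` is a hypothetical helper, not an existing ehrapy function, the demo data is made up, and the test selection is deliberately simplistic (Mann-Whitney U for every numeric column, Fisher's exact for 2x2 tables, chi-square for larger tables). A real implementation would also need to handle ordinal variables, more than two groups, and the t-test vs. Mann-Whitney U decision.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype
from scipy import stats


def compare_groups(df: pd.DataFrame, group_col: str) -> pd.DataFrame:
    """Hypothetical helper (not an ehrapy function): test each variable between two groups."""
    groups = sorted(df[group_col].dropna().unique())
    if len(groups) != 2:
        raise ValueError("this sketch only handles exactly two groups")

    rows = []
    for col in df.columns.drop(group_col):
        values = df[col]
        row = {"variable": col}
        if is_numeric_dtype(values) and not is_bool_dtype(values):
            # Continuous: median (IQR) per group, Mann-Whitney U between groups.
            samples = [values[df[group_col] == g].dropna() for g in groups]
            for g, s in zip(groups, samples):
                row[str(g)] = f"{s.median():.1f} ({s.quantile(0.25):.1f}-{s.quantile(0.75):.1f})"
            row["test"] = "Mann-Whitney U"
            row["p_value"] = stats.mannwhitneyu(*samples, alternative="two-sided").pvalue
        else:
            # Nominal: counts per level, Fisher's exact for 2x2 tables, chi-square otherwise.
            table = pd.crosstab(values, df[group_col])
            for g in groups:
                row[str(g)] = ", ".join(f"{level}: {n}" for level, n in table[g].items())
            if table.shape == (2, 2):
                row["test"] = "Fisher's exact"
                _, row["p_value"] = stats.fisher_exact(table.to_numpy())
            else:
                row["test"] = "chi-square"
                row["p_value"] = stats.chi2_contingency(table.to_numpy())[1]
        rows.append(row)
    return pd.DataFrame(rows).set_index("variable")


# Made-up demo cohort, stratified by a hypothetical "response" column.
demo = pd.DataFrame(
    {
        "response": ["yes"] * 6 + ["no"] * 6,
        "age": [50, 52, 55, 58, 60, 61, 65, 67, 70, 72, 75, 78],
        "sex": ["m", "f"] * 6,
        "stage": ["I", "II", "III"] * 4,
    }
)
result = compare_groups(demo, "response")
print(result)
```

The output has one row per variable, one summary column per group, and the chosen test plus its p-value in separate columns, which is roughly the layout of the linked Table 1.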

Zethson commented 3 months ago

These are so super helpful! I won't respond to all of them now, but be assured that we'll tackle them eventually. Thanks!

mhaist94 commented 3 months ago

Great, happy to give some helpful feedback! I'm sure you guys are busy tackling other big challenges - but I figured those ideas might be sort of a quick fix that could nonetheless make a difference in terms of the scope of the framework.