Open DavidSorge opened 3 years ago
Hi @DavidSorge - are you able to supply a small sample dataframe that replicates the problem? That would help debug this.
Yes, that should be pretty easy.
Here's a code snippet that reproduces the problem:
import pandas as pd
from lifelines.utils import add_covariate_to_timeline

# Load the attached sample files
base_df_sample = pd.read_csv('base_df_sample.csv', index_col=0)
covariate_sample = pd.read_csv('covariate_sample.csv')

print('Before merge:')
print(base_df_sample.event.value_counts())
print()

# Merge the time-varying covariate into the long-format base frame
integrated = add_covariate_to_timeline(
    base_df_sample,
    covariate_sample,
    duration_col='start_day',
    id_col='PC_ID',
    event_col='event'
)

print('After merge:')
print(integrated.event.value_counts())
The sample files are attached. They are from two of the geographical units that have the most events (in this case, incidents of unrest). Because these are incidents of unrest, the events in my model are non-absorbing (i.e., a single district can experience multiple events).
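To make "non-absorbing" concrete, here is a minimal sketch of what a long-format frame with repeated events might look like. The `start` and `stop` column names and all of the values are invented for illustration; only `PC_ID` and `event` come from the snippet above.

```python
import pandas as pd

# Hypothetical long-format data: one row per (unit, interval).
# All values are made up; they are not taken from the attached files.
toy = pd.DataFrame({
    "PC_ID": [1, 1, 1, 2, 2],
    "start": [0, 10, 25, 0, 40],
    "stop":  [10, 25, 60, 40, 60],
    "event": [True, True, False, True, False],
})

# With non-absorbing events, a single unit can have more than one True row.
events_per_unit = toy.groupby("PC_ID")["event"].sum()
print(events_per_unit)
```

In this toy frame, unit 1 has two events and unit 2 has one, which is the shape of data the question is about.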
The csv files with the snippets are attached here:
The output I get from running the snippet is:
Before merge:
True 41
False 2
Name: event, dtype: int64
After merge:
False 1420
Name: event, dtype: int64
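One way to narrow this down might be to compare event counts per unit before and after the merge, rather than only the overall totals. The sketch below uses a hypothetical helper, `compare_event_counts`, and tiny made-up frames standing in for `base_df_sample` and `integrated`; it does not call lifelines and is not part of it.

```python
import pandas as pd

def compare_event_counts(before, after, id_col="PC_ID", event_col="event"):
    """Return per-unit event counts before and after a transformation.

    Hypothetical debugging helper; not part of lifelines.
    """
    counts = pd.DataFrame({
        "before": before.groupby(id_col)[event_col].sum(),
        "after": after.groupby(id_col)[event_col].sum(),
    }).fillna(0).astype(int)
    # Number of events per unit that disappeared in the transformation
    counts["lost"] = counts["before"] - counts["after"]
    return counts

# Made-up stand-ins that mimic the reported symptom (all events become False)
before = pd.DataFrame({
    "PC_ID": [1, 1, 1, 2, 2],
    "event": [True, True, False, True, False],
})
after = pd.DataFrame({
    "PC_ID": [1, 1, 1, 1, 2, 2, 2],
    "event": [False] * 7,
})

report = compare_event_counts(before, after)
print(report)
```

With the real data, the call would be `compare_event_counts(base_df_sample, integrated)`, assuming both frames still carry the `PC_ID` and `event` columns.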
(Also, for the record, thank you for both (a) a great tool--I'm really grateful to be able to do my survival analysis in Python rather than having to jump over to R--and (b) your amazingly fast response time, right after I initially posted the issue!)
I'm working on a time-varying Cox regression analysis. I'm attempting to add time-varying covariates to a dataset in long format, following the examples here as closely as possible.
When using the add_covariate_to_timeline() function, however, all the time periods in the resultant dataset come back with False in the event column.
For reference, here's the base df:
and here's the covariate df:
I'm wondering if it has to do with the covariate df being a 'cumulative event' type df? Not sure, so I thought I'd ask.
I'm using lifelines version 0.26.3
Thanks so much!