open-risk / transitionMatrix

Statistical analysis and visualization of state transition phenomena
https://www.openriskmanagement.com/tags/transition-matrix/
Apache License 2.0

TypeError: object of type 'NoneType' has no len() #23

Open karakastarik opened 3 years ago

karakastarik commented 3 years ago

Hi, thanks for this amazing package.

When I ran pip install transitionMatrix, not all of the modules were installed. I installed from GitHub instead, but there is another issue:

TypeError                                 Traceback (most recent call last)
<ipython-input-13-bb59c58d3d9a> in <module>
     20 myEstimator = es.CohortEstimator(states=myState, ci={'method': 'goodman', 'alpha': 0.05})
     21 # myMatrix = matrix.CohortEstimator(states=myState)
---> 22 result = myEstimator.fit(sorted_data)
     23 #myEstimator.summary()

~\Downloads\transitionMatrix-master\transitionMatrix-master\transitionMatrix\estimators\cohort_estimator.py in fit(self, data, labels)
     92         # The number of cohorts is the number of intervals
     93         # Minimally two (initial and final)
---> 94         cohort_dim = len(self.cohort_bounds) - 1
     95         event_count = data[id_label].count()
     96 

TypeError: object of type 'NoneType' has no len()

Thanks.

open-risk commented 3 years ago

Thanks for reporting these issues. The current PyPI release is a bit dated. The immediate next milestone is to further test the current version and publish a 0.5 release.

KonScanner commented 2 years ago

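It seems that the estimator is created without cohort_bounds:
CohortEstimator(states=myState, ci={'method': 'goodman', 'alpha': 0.05})
while the constructor is defined as: def __init__(self, cohort_bounds=None, states=None, ci=None)

So it expects cohort_bounds, and when that is not passed in the default is None. fit() then calls len(self.cohort_bounds) on None, which raises the TypeError.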
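
A minimal sketch of that failure mode, using a hypothetical stand-in class (MiniEstimator is not the library's code; it only copies the constructor defaults quoted above). It shows why len() fails when cohort_bounds is left as None, and how an explicit check could give a clearer error:

```python
class MiniEstimator:
    """Hypothetical stand-in; mirrors only the constructor defaults quoted above."""

    def __init__(self, cohort_bounds=None, states=None, ci=None):
        self.cohort_bounds = cohort_bounds
        self.states = states
        self.ci = ci

    def cohort_dim(self):
        # Guard against the default value: len(None) raises a TypeError.
        if self.cohort_bounds is None:
            raise ValueError("cohort_bounds must be provided (got None)")
        # The number of cohorts is the number of intervals between bounds.
        return len(self.cohort_bounds) - 1


def unguarded_error():
    # Reproduce the kind of error reported in the traceback.
    try:
        len(None)
    except TypeError as exc:
        return str(exc)


# Three bounds (made-up values) delimit two cohort intervals.
two_cohorts = MiniEstimator(cohort_bounds=[0.0, 0.5, 1.0]).cohort_dim()
```

Whether the real estimator should raise, or fall back to deriving the bounds from the data, is a design choice for the maintainers.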

I've tried modifying the fit() function to work with the old method:

# Old way of enumerating cohort intervals was using labels
cohort_labels = data[timestep_label].unique()
cohort_dim = len(cohort_labels) - 1

But that causes further issues, at least in my use case, at this line:

tmn_count[(entity_state[i], entity_state[i + 1], event_time[i])] += 1

KonScanner commented 2 years ago

More specifically:

State Space
--------------------------------------------------------------------------------
State Index and Label:  0 ,  0
State Index and Label:  1 ,  1
State Index and Label:  2 ,  2
State Index and Label:  3 ,  3
State Index and Label:  4 ,  4
State Index and Label:  5 ,  5
State Index and Label:  6 ,  6
State Index and Label:  7 ,  8

Example DataFrame after datetime_to_float:

   ID  Time  State  EventTime  Count
0   1     0      0   0.000000    1.0
1   1     1      0   0.063655    1.0
2   1     2      0   0.248460    3.0
3   1     3      0   0.373717    2.0
4   1     4      0   0.498973    2.0
...

When trying to do the following:

cohort_data, cohort_intervals = tm.utils.bin_timestamps(data, cohorts=8)
myEstimator = es.CohortEstimator(states=myState, ci={'method': 'goodman', 'alpha': 0.05})
print(cohort_data.head())
result = myEstimator.fit(cohort_data)

it fails in myEstimator.fit with:

tmn_count[(entity_state[i], entity_state[i + 1], event_time[i])] += 1
IndexError: index 8 is out of bounds for axis 1 with size 8

I am not sure why it is mis-indexing.

Maybe, with my state space, some parameters are not initialized correctly?

Any help would be appreciated!!!
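
One possible cause, offered as a guess rather than a diagnosis: the state space printed above maps index 7 to label 8. If the State column holds raw labels and they are used directly as array indices, label 8 would fall outside an axis of size 8. Below is a minimal sketch of remapping labels to contiguous indices before counting transitions. The names label_to_index and count_transitions and the sample sequence are hypothetical and not part of transitionMatrix.

```python
# State labels as printed in the state space above: index 7 carries label 8.
labels = [0, 1, 2, 3, 4, 5, 6, 8]

# Map each label to its position so that indices are contiguous (0..7).
label_to_index = {label: idx for idx, label in enumerate(labels)}


def count_transitions(state_labels, n_states, mapping):
    """Count transitions between consecutive states of a single entity."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Translate raw labels to indices first; using label 8 directly as an
    # index would be out of range for an axis of size 8.
    indices = [mapping[s] for s in state_labels]
    for src, dst in zip(indices, indices[1:]):
        counts[src][dst] += 1
    return counts


# Made-up sequence of observed state labels for one entity.
sequence = [0, 6, 8, 8]
counts = count_transitions(sequence, len(labels), label_to_index)
```

If the remapped data no longer triggers the IndexError, the label/index mismatch is the likely culprit; if it still fails, the cause lies elsewhere.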

P.S. For cohort_bounds I've restored this:

# Old way of enumerating cohort intervals was using labels
cohort_labels = data[timestep_label].unique()
cohort_dim = len(cohort_labels) - 1