Closed sarakolding closed 1 year ago
seems to only be a problem for us in plot_metric_by_time_until_diagnosis, not in plot_auc_by_time_from_first_visit, for some reason
Tests | Skipped | Failures | Errors | Time |
---|---|---|---|---|
1 | 0 :zzz: | 0 :x: | 1 :fire: | 5.216s :stopwatch: |
Huh, that's a bit of a mess! Thanks for taking a look - I'll dive in later today.
Since index is only reset for a single column that is then concatenated back onto the dataset (and I think this is done by matching on indices), a lot of values from the [n_in_bin] column will not have any matches
Oh, we need to change this if it happens so that it merges on bin.
I've also added a few tests to hopefully stabilise this interface.
Notes for reviewers
I think I found out what is happening. Since index is only reset for a single column that is then concatenated back onto the dataset (and I think this is done by matching on indices), a lot of values from the [n_in_bin] column will not have any matches, or will be matched with the wrong bin. I would have done it at the end instead, but the columns that are outputted are also concatenated onto dataframes in the plots, so I would suspect it would only move the issue. Maybe the fix is to reset the index after these steps in the plot functions? You probably have a better overview of what would work best