Open MattHodgman opened 1 year ago
Hello, the `numeric_mask` is generated from the `is_numeric` function in `helpers.py`:
https://github.com/MLD3/FIDDLE/blob/master/FIDDLE/helpers.py#L191
on this line:
https://github.com/MLD3/FIDDLE/blob/master/FIDDLE/steps.py#L601
I agree with your logic, so it is indeed surprising if `input_data.p` does not contain `None`/`NaN` but `numeric_mask` contains `NaN`. Perhaps you could try a small example with and without NaNs and apply the `is_numeric` function to that column?
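A small experiment along those lines might look like the sketch below. Note that `is_numeric_standin` is a hypothetical stand-in written for this example (it returns True when `float(v)` succeeds); it is an assumption about the general idea, not a copy of the function in `helpers.py`, so swap in the real `is_numeric` when trying this. The two toy columns are made up as well.

```python
import numpy as np
import pandas as pd

def is_numeric_standin(v):
    # Hypothetical stand-in, NOT FIDDLE's code: True if float(v) succeeds.
    try:
        float(v)
        return True
    except (TypeError, ValueError):
        return False

# Toy columns: one without missing values, one with None and NaN.
clean = pd.Series(["1.5", "abc", 3], dtype=object)
with_missing = pd.Series(["1.5", None, np.nan], dtype=object)

mask_clean = clean.apply(is_numeric_standin)
mask_missing = with_missing.apply(is_numeric_standin)

# Inspect dtype and contents of each mask.
print(mask_clean.dtype, mask_clean.tolist())
print(mask_missing.dtype, mask_missing.tolist())
```

With this stand-in both masks come out as plain boolean Series, because the function always returns True or False. If the real function behaves the same way on your column, the NaN is probably introduced somewhere other than the `apply` call itself.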
`is_numeric` works when I extract the 225958 feature column from `input_data.p` to `col_data` and run `numeric_mask = col_data.apply(is_numeric)`. `numeric_mask` only contains `True` and `False` values. When I switch one of these booleans to `np.nan` or a float, I can reproduce the error. I'm going to see if I can extract the `ts_mixed` dataframe from https://github.com/MLD3/FIDDLE/blob/master/FIDDLE/steps.py#L594 and look at feature 225958.
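For anyone repeating this check, a helper like the one below can list exactly which entries of a mask are not booleans. It is only a sketch: `non_boolean_entries` is a name invented here, and `bad_mask` is hand-made to mimic a mask in which a boolean was replaced by `np.nan` or a float.

```python
import numpy as np
import pandas as pd

def non_boolean_entries(mask):
    """Return the entries of `mask` that are not True/False, keeping their index."""
    is_bool = mask.map(lambda x: isinstance(x, (bool, np.bool_)))
    return mask[~is_bool.astype(bool)]

good_mask = pd.Series([True, False, True])
# Hand-made mask with a NaN and a float in place of booleans.
bad_mask = pd.Series([True, np.nan, 0.5], dtype=object)

print(non_boolean_entries(good_mask))  # nothing to report for a clean mask
print(non_boolean_entries(bad_mask))   # rows whose value is not a boolean
```

The index of the returned Series points at the offending rows, which could then be looked up in the dataframe the mask was computed from.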
I am running FIDDLE on data extracted from MIMIC-III using the pipeline outlined in FIDDLE-experiments. I have my population of ICU stays and am running FIDDLE with these parameters:
--T=240.0 --dt=1.0 --theta_1=0.003 --theta_2=0.003 --theta_freq=1 --stats_functions 'mean'
and other default ones found in `run_make_all.sh`. I get the following error:
Do you know what could be causing this error? I was able to determine that it first occurs in column 225958 and that `numeric_mask` contains at least one NaN value, which must mean column 225958 contains `None` values. However, in my `input_data.p` file there are no `None` or `NaN` `variable_value`s for `variable_name == '225958'`.
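The missing-value check described above could be written roughly as follows. This is a sketch on a made-up toy table: the column layout (`ID`, `t`, `variable_name`, `variable_value`) is assumed here, the values are invented, and `count_missing` is a hypothetical helper. For the real file, the table would be loaded first, for example with `pd.read_pickle`.

```python
import numpy as np
import pandas as pd

# Toy long-format table; the values are made up for illustration.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "t": [0.0, 1.0, 0.0, 2.0],
    "variable_name": ["225958", "225958", "225958", "220045"],
    "variable_value": ["a", 1.0, None, np.nan],
})

def count_missing(df, name):
    """Return (rows matched, rows with a missing variable_value) for `name`."""
    rows = df[df["variable_name"] == name]
    return len(rows), int(rows["variable_value"].isna().sum())

print(count_missing(df, "225958"))
print(count_missing(df, 225958))  # integer key against string names
```

Returning the number of matched rows alongside the missing count guards against one easy mistake: if `variable_name` is stored with a different type than the key being compared (string versus integer), the filter selects no rows and the missing count is trivially zero.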