Closed Nirvana2211 closed 2 months ago
Experiencing the same problem, have not found a solution yet.
Sorry for the late response, I'll look into this later tonight.
Hi AnotherSamWilson, thank you! I am having problems in general getting plots out. I have fitted a kernel to a dataset with some 140 features, of which we only impute 43 (I passed a list in variable_schema to indicate this). Plotting all of them is obviously not very pretty, but I notice that if I do, it plots 49 figures, 6 of which are empty. So I tried a for loop:
step = 5  # plot 5 variables at a time
for i in range(0, len(imputable), step):
    kernel.plot_mean_convergence(variables=imputable[i:i+step], wspace=1.6, hspace=1.8)
And weirdly enough, each plot has at least one empty figure. Like the one below:
Not sure what to do about it.
In addition, how do I get feature names on the above mean_convergence plot?
Thank you for your time.
When I last looked into these multiplots, I couldn't figure out how to prevent those empty plots from showing up... I'm not sure if it's possible. Either way, I want to get away from raw matplotlib in most plots, it's too much of a hassle.
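For what it's worth, plain matplotlib can hide the leftover cells of a subplot grid. The following is only a minimal sketch of that general technique, not miceforest's plotting code: plot_grid and the toy data are hypothetical names made up for illustration. It also sets each panel's title to the variable name, which is one way to get feature names onto the panels.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_grid(series_by_name, ncols=3):
    """Hypothetical helper: one line chart per named series, unused cells hidden."""
    names = list(series_by_name)
    nrows = max(1, math.ceil(len(names) / ncols))
    # squeeze=False always returns a 2-D array of axes, even for a single row
    fig, axes = plt.subplots(nrows, ncols, squeeze=False)
    flat_axes = axes.ravel()
    for ax, name in zip(flat_axes, names):
        ax.plot(series_by_name[name])
        ax.set_title(name)  # label each panel with its variable name
    # Any axes beyond the number of variables would otherwise render empty
    for ax in flat_axes[len(names):]:
        ax.set_visible(False)
    return fig


# Toy data: 5 variables on a 2 x 3 grid leaves exactly one unused cell
fig = plot_grid({f"var_{i}": [i, i + 1, i + 2] for i in range(5)})
visible = [ax.get_visible() for ax in fig.axes]
```

Whether this can be wired into plot_mean_convergence depends on how that method builds its grid, which I have not checked.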
Okay, fair enough!
Do you have any suggestions for .save_kernel() when it returns an error like this:
raise ValueError("%s cannot be larger than %d bytes" %
ValueError: bytesobj cannot be larger than 2147483631 bytes
At first, it stated that I needed an optional dependency such as pyarrow or fastparquet; I went with pyarrow.
I believe the problem here is that variables=None by default, but then that is passed to _get_var_ind_from_list, which requires an actual list. Presumably it should first sanitize and generate the full list of variables, as is done in plot_imputed_distributions.
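To illustrate the kind of sanitizing step meant here, below is a rough sketch. It is not the library's actual code: resolve_variables and its arguments are hypothetical names, and the real _get_var_ind_from_list may behave differently. The idea is simply to expand None to the full variable list before any index lookup happens.

```python
def resolve_variables(variables, all_variables):
    """Hypothetical helper: map variable names to indices, treating None as 'all'."""
    if variables is None:
        # Default to every known variable instead of passing None downstream
        variables = list(all_variables)
    index_of = {name: i for i, name in enumerate(all_variables)}
    missing = [v for v in variables if v not in index_of]
    if missing:
        raise KeyError(f"Unknown variables: {missing}")
    return [index_of[v] for v in variables]


columns = ["sepal length", "sepal width", "petal length", "species"]
all_indices = resolve_variables(None, columns)
some_indices = resolve_variables(["species", "sepal width"], columns)
```

With this pattern the index lookup only ever sees a real list, so the default of None never reaches it.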
This shouldn't be a problem in major version 6. The plotting functionality doesn't exist yet, but it should be much easier to implement with plotnine.
I am using miceforest version 5.6.2 on Windows. I am trying to replicate the iris example.
import miceforest as mf
from sklearn.datasets import load_iris
import pandas as pd

iris = pd.concat(load_iris(as_frame = True, return_X_y = True), axis = 1)
iris.rename(columns = {'target' :'species'}, inplace = True)
iris['species'] = iris['species'].astype('category')

iris_amp = mf.ampute_data(iris, perc = 0.25, random_state = 1991)

kernel = mf.ImputationKernel(
    data=iris_amp,
    datasets=5,
    save_all_iterations=True,
    random_state=1991
)

kernel.mice(3, verbose = True)
kernel.plot_correlations() gives the following error:
TypeError Traceback (most recent call last)