Closed Aritro-94 closed 4 years ago
Could you please provide the code and some sample data?
dataset: titanic.zip

```python
import pandas as pd
import numpy as np

df_orig = pd.read_csv("titanic.csv", sep=",")
df_orig.head()

df = df_orig.drop(["Name"], axis=1)
df1 = pd.get_dummies(df)
df1.drop(["Sex_female"], axis=1, inplace=True)
ohe = pd.get_dummies(df["Pclass"], prefix="Pclass")
ohe.drop(["Pclass_3"], axis=1, inplace=True)
df1 = df1.join(ohe)
df1.drop(["Pclass"], axis=1, inplace=True)

from sklearn.model_selection import train_test_split

X = df1.iloc[:, 1:]
y = df1.loc[:, ["Survived"]]
x_tr, x_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=11)

from xverse.ensemble import VotingSelector

clf = VotingSelector(selection_techniques=['RF', 'RFE', 'ETC', 'CS', 'L_ONE'])
clf.fit(x_tr, y_tr)  # At this step the error occurs
```
I finally figured it out. The error occurs because the shape of your target variable needs to be changed.
The program expects a target of shape `(len(data),)`, whereas the code above produces a target of shape `(len(data), 1)`. Because of that, the binning function does not work. Please add these two lines of code after you perform the train/test split, and it will work as intended. I will fix this in a future release so that you don't have to add them anymore. Thanks.
```python
y_tr = y_tr.T.squeeze()
y_te = y_te.T.squeeze()
```
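For anyone hitting the same error, here is a minimal sketch of where the two shapes come from. It uses a small made-up frame (a hypothetical stand-in, not the actual Titanic file) and only pandas. It illustrates the shape difference, not what xverse does internally.

```python
import pandas as pd

# Hypothetical stand-in data; the real titanic.csv is not used here.
df = pd.DataFrame({"Survived": [0, 1, 1, 0], "Age": [22.0, 38.0, 26.0, 35.0]})

# Selecting with a list of labels returns a one-column DataFrame: shape (n, 1).
y_frame = df.loc[:, ["Survived"]]

# Selecting with a single label returns a Series: shape (n,).
y_series = df["Survived"]

# The workaround from this thread collapses the one-column DataFrame to a Series.
y_squeezed = y_frame.T.squeeze()

print(y_frame.shape, y_series.shape, y_squeezed.shape)  # (4, 1) (4,) (4,)
```

Under these assumptions, defining the target as `y = df1["Survived"]` before the split should give the same 1-D shape without the extra two lines, though I have not verified that against xverse 1.0.5.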
I am trying to perform feature selection using xverse's VotingSelector on the Titanic dataset, which is a binary classification problem. The dataset also contains categorical variables, which I have one-hot encoded. I am repeatedly facing this error. I am using xverse version 1.0.5 and Python 3.6. Kindly help.