ZongSingHuang / Binary-Whale-Optimization-Algorithm

Tawhid, M.A., Ibrahim, A.M. Feature selection based on rough set approach, wrapper approach, and binary whale optimization algorithm. Int. J. Mach. Learn. & Cyber. 11, 573–602 (2020). https://doi.org/10.1007/s13042-019-00996-5
MIT License
15 stars, 4 forks

Ask a question? #1

Closed: Njzjhd closed this issue 3 years ago

Njzjhd commented 3 years ago

def Breastcancer_test(x):
    if x.ndim == 1:
        x = x.reshape(1, -1)
    loss = np.zeros(x.shape[0])  # array([0.])

    for i in range(x.shape[0]):
        if np.sum(x[i, :]) > 0:
            knn = KNeighborsClassifier(n_neighbors=5).fit(X_train[:, x[i, :]], y_train)
            score = accuracy_score(knn.predict(X_test[:, x[i, :]]), y_test)
            loss[i] = 0.99*(1-score) + 0.01*(np.sum(x[i, :])/X_train.shape[1])
        else:
            loss[i] = np.inf
            print(666)
    return loss

Is there a computational error in the for loop of the fitness function? There should be only one value, right? Would it be possible to introduce cross-validation to remedy this?

ZongSingHuang commented 3 years ago

Thanks for your response

  1. If np.sum(x[i, :]) == 0, every entry of x[i] is 0, meaning no feature is selected, so I set loss[i] to np.inf.
  2. I was following the experimental setup in the literature, so main_5050.py did not use k-fold.
  3. In the end, I have added k-fold to main_5050.py.
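As a rough illustration of the cross-validation idea discussed above, here is a minimal sketch that replaces the single train/test split with a k-fold accuracy estimate. It assumes scikit-learn's cross_val_score and the breast cancer data set bundled with scikit-learn; the function name breastcancer_cv_fitness and the n_splits parameter are hypothetical, and this is not the code in main_5050.py.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the data set bundled with scikit-learn stands in for the repository's data.
X, y = load_breast_cancer(return_X_y=True)


def breastcancer_cv_fitness(x, n_splits=5):
    """Hypothetical k-fold variant of the fitness function (lower is better)."""
    x = np.atleast_2d(x).astype(bool)    # one row per agent, as a boolean mask
    loss = np.full(x.shape[0], np.inf)   # agents with no feature keep a loss of inf
    for i, mask in enumerate(x):
        if mask.any():
            knn = KNeighborsClassifier(n_neighbors=5)
            # Mean accuracy over the folds replaces the single hold-out score.
            acc = cross_val_score(knn, X[:, mask], y, cv=n_splits).mean()
            loss[i] = 0.99 * (1 - acc) + 0.01 * mask.sum() / X.shape[1]
    return loss


# Three agents: all features, no features, every other feature.
population = np.array([[1] * 30, [0] * 30, [1, 0] * 15])
print(breastcancer_cv_fitness(population))
```

The function returns one loss per agent, so a single agent yields a one-element array and a full population yields one value per row.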


Njzjhd commented 3 years ago

Thank you very much!

Njzjhd commented 3 years ago

X_train[:, optimizer.gBest_X]: this code does not pick out the selected features and may be buggy. Normally the columns whose mask value is 1 should be selected, but that is not what happens.
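One possible explanation, assuming gBest_X is stored as an integer 0/1 array rather than a boolean one: NumPy treats an integer array as a list of column positions, while a boolean array acts as a mask. The small sketch below shows the difference on made-up data; g_best is a hypothetical stand-in for optimizer.gBest_X.

```python
import numpy as np

X = np.arange(12).reshape(3, 4)       # 3 samples, 4 features
g_best = np.array([1, 0, 1, 0])       # hypothetical 0/1 solution stored as integers

as_int = X[:, g_best]                 # integer indexing: picks columns 1, 0, 1, 0
as_bool = X[:, g_best.astype(bool)]   # boolean mask: keeps columns 0 and 2

print(as_int.shape)    # (3, 4)
print(as_bool.shape)   # (3, 2)
```

Converting the solution with astype(bool) before indexing keeps exactly the columns marked with 1.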

Njzjhd commented 3 years ago

I have worked around the problem above. The fix is a bit complicated, but it solves the problem effectively.

Get the indices whose value equals 1:

def unique_index(L, f):
    return [i for (i, v) in enumerate(L) if v == f]

t_cols = unique_index(tt, 1)

Add the features at those indices to a new dataset:

tr_cols = X_train.columns.to_list()
tr_add = []
for i in range(len(t_cols)):
    tr_add.append(tr_cols[t_cols[i]])
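The same lookup can probably be written more compactly with np.flatnonzero, which returns the positions where a condition holds. This is only a sketch: the column names and the solution vector tt below are made up for illustration.

```python
import numpy as np

feature_names = ["radius", "texture", "perimeter", "area"]  # hypothetical column names
tt = np.array([1, 0, 0, 1])                                 # hypothetical 0/1 solution

selected_idx = np.flatnonzero(tt == 1)                      # positions where the value is 1
selected_names = [feature_names[i] for i in selected_idx]   # names of the kept columns

print(selected_idx.tolist())   # [0, 3]
print(selected_names)          # ['radius', 'area']
```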

ZongSingHuang commented 3 years ago

I have fixed this problem, thank you.

Njzjhd commented 3 years ago

For "for i in range(x.shape[0]):", i only ever seems to equal 0, so the for loop never actually iterates; the author can test this. Also, what does the code "if x.ndim==1: x = x.reshape(1, -1)" followed by "loss = np.zeros(x.shape[0])  # array([0.])" do? It must be necessary, but I don't understand what it means.

Njzjhd commented 3 years ago

In addition, I would like to discuss one more thing with the author. The study above addresses a classification problem. Although this is a reproduction of the paper, could the fitness function be modified as follows for a regression problem?

def T_fitness(x):
    if x.ndim == 1:
        x = x.reshape(1, -1)
    loss = np.zeros(x.shape[0])

    for i in range(x.shape[0]):
        if np.sum(x[i, :]) > 0:
            knn = LGBMRegressor(random_state=44).fit(X_train[:, x[i, :].astype(bool)], y_train)
            loss[i] = r2_score(y_test, knn.predict(X_test[:, x[i, :].astype(bool)]))
        else:
            loss[i] = np.inf
            print(666)
    return loss

Please note that the evaluation functions in sklearn take their inputs in the order acc(y_test, y_pre).

ZongSingHuang commented 3 years ago

For "for i in range(x.shape[0]):", i only ever seems to equal 0, so the for loop never actually iterates; the author can test this. Also, what does the code "if x.ndim==1: x = x.reshape(1, -1)" followed by "loss = np.zeros(x.shape[0])  # array([0.])" do? It must be necessary, but I don't understand what it means.

  1. for i in range(x.shape[0]): x is the population matrix; each row is one agent and each column is one feature.
  2. if x.ndim==1: x = x.reshape(1, -1): this is a safeguard for the case where x is a single agent with ndim == 1. It is a habit I picked up while implementing some paper; I no longer remember which one.
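To make the safeguard concrete, here is a small sketch of what the reshape does. The helper name as_population is hypothetical; it only isolates the two lines in question.

```python
import numpy as np


def as_population(x):
    """Return x as a 2-D array with one row per agent (hypothetical helper)."""
    x = np.asarray(x)
    if x.ndim == 1:
        x = x.reshape(1, -1)   # a single agent becomes a 1-row population
    return x


single = as_population(np.array([1, 0, 1]))
swarm = as_population(np.array([[1, 0, 1], [0, 1, 1]]))
print(single.shape, swarm.shape)   # (1, 3) (2, 3)
```

After the reshape, x.shape[0] is always the number of agents, so np.zeros(x.shape[0]) allocates one loss slot per agent and the loop body can index x[i, :] in both cases.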

Njzjhd commented 3 years ago

for i in range(x.shape[0]): in the reproduced code, range(x.shape[0]) always seems to contain just one row, so i is 0 and nothing else. I printed it out to check.

ZongSingHuang commented 3 years ago

In addition, I would like to discuss one more thing with the author. The study above addresses a classification problem. Although this is a reproduction of the paper, could the fitness function be modified as follows for a regression problem?

def T_fitness(x):
    if x.ndim == 1:
        x = x.reshape(1, -1)
    loss = np.zeros(x.shape[0])

    for i in range(x.shape[0]):
        if np.sum(x[i, :]) > 0:
            knn = LGBMRegressor(random_state=44).fit(X_train[:, x[i, :].astype(bool)], y_train)
            loss[i] = r2_score(y_test, knn.predict(X_test[:, x[i, :].astype(bool)]))
        else:
            loss[i] = np.inf
            print(666)
    return loss

Please note that the evaluation functions in sklearn take their inputs in the order acc(y_test, y_pre).

Of course. Below are some examples, but because company secrets are involved, I can only show part of the content.

For MLR: [image]

For SVR (rbf): [image]

References:

  1. A wrapper approach-based key temperature point selection and thermal error modeling method
  2. A distributed PSO–SVM hybrid system with feature selection and parameter optimization
  3. Particle Swarm Optimization-Based Support Vector Regression for Tourist Arrivals Forecasting
  4. Feature selection and parameter optimization of support vector regression for electric load forecasting
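One point worth checking when adapting the fitness function to regression: R² is a score where higher is better, while the np.inf penalty for an empty mask suggests the optimizer minimizes. The sketch below therefore uses 1 - R² as the loss. It is only an illustration under stated assumptions: the data are synthetic, ordinary least squares via np.linalg.lstsq is swapped in for LGBMRegressor so that only NumPy is needed, and the names r2 and regression_fitness are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumption: synthetic data in which only features 0 and 2 drive the target.
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]


def r2(y_true, y_pred):
    """Coefficient of determination, argument order (y_true, y_pred)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def regression_fitness(x):
    """Hypothetical loss for a minimizing optimizer: 1 - R^2, inf for empty masks."""
    x = np.atleast_2d(x).astype(bool)
    loss = np.full(x.shape[0], np.inf)
    for i, mask in enumerate(x):
        if mask.any():
            # Least-squares fit with an intercept column on the selected features.
            A_train = np.column_stack([X_train[:, mask], np.ones(len(X_train))])
            A_test = np.column_stack([X_test[:, mask], np.ones(len(X_test))])
            coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
            loss[i] = 1.0 - r2(y_test, A_test @ coef)
    return loss


# Informative features, uninformative features, and an empty selection.
masks = np.array([[1, 0, 1, 0, 0], [0, 1, 0, 1, 1], [0, 0, 0, 0, 0]])
print(regression_fitness(masks))
```

With this convention the mask that contains the informative features gets the smallest loss, and the empty mask keeps its np.inf penalty.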

ZongSingHuang commented 3 years ago

for i in range(x.shape[0]): in the reproduced code, range(x.shape[0]) always seems to contain just one row, so i is 0 and nothing else. I printed it out to check.

Because I follow the original author's design architecture, I put the loop over agents in BWOA.py, so the fitness function receives one agent at a time.

If you are interested, you can look at my other projects, such as S-shaped-Binary-Whale-Optimization-Algorithm
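A toy sketch of why i is always 0 under such a design: if the optimizer evaluates agents one by one, every call to the fitness function sees a single row. The loop below is a hypothetical illustration, not the code in BWOA.py; the fitness function simply counts selected features.

```python
import numpy as np

calls = []  # records how many rows each fitness call receives


def fitness(x):
    """Toy fitness: counts selected features; accepts one agent or a population."""
    if x.ndim == 1:
        x = x.reshape(1, -1)
    calls.append(x.shape[0])
    return x.sum(axis=1).astype(float)


population = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 1]])

# Hypothetical optimizer-side loop: agents are evaluated one at a time.
scores = np.array([fitness(agent)[0] for agent in population])

print(calls)    # [1, 1, 1]
print(scores)   # [2. 2. 3.]
```

Passing the whole population in one call would instead give x.shape[0] == 3, and the inner for loop would run three times.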

Njzjhd commented 3 years ago

Thank you very much.


ZongSingHuang commented 3 years ago


You're welcome.