Riashat / Active-Learning-Bayesian-Convolutional-Neural-Networks

Active Learning on Image Data using Bayesian ConvNets

Why is active learning ineffective when applied to the CNN model? #1

Open houxingxing opened 7 years ago

houxingxing commented 7 years ago

Can you give me some advice? I want to apply a traditional active learning method, such as maximal entropy, to a CNN model, but it fails.

The network:

```python
# Imports assumed by the snippet below (Keras 1-style API)
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD

nb_classes = 10  # CIFAR-10 has 10 classes
# X_train: CIFAR-10 training images, shape (N, 3, 32, 32)

model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same',
                        input_shape=X_train.shape[1:]))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

# let's train the model using SGD + momentum (how original).
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

```

Active sampling function:

```python
def getData(proba, data, label, batch_data, batch_label, num, flag):
    # batch_size and get_index come from the surrounding script (not shown)
    tmpdata = np.empty((num, 3, 32, 32), dtype='float32')
    tmplabel = np.empty((num, 10), dtype='uint8')
    if num == batch_size:
        # Predictive entropy of the softmax outputs: H = -sum_c p_c * log2(p_c)
        Class_Log_Probability = np.log2(proba)
        Entropy_Each_Cell = -np.multiply(proba, Class_Log_Probability)
        Entropy = np.sum(Entropy_Each_Cell, axis=1)
        index = select_sort(Entropy, num, flag)
    else:
        index = get_index(flag)
    print(index)

    # Move the selected points from the pool into the training batch
    for i in range(num):
        t = index[i]
        flag[t] = 1  # mark the point as acquired
        tmpdata[i] = data[t]
        tmplabel[i] = label[t]
    batch_data, batch_label = np.vstack([batch_data, tmpdata]), np.vstack([batch_label, tmplabel])
    return batch_data, batch_label, data, label, flag


def select_sort(list_proba, num, flag):
    # Greedy selection of the num highest-entropy indices among points
    # that are still unlabelled (flag == 0)
    list_len = len(list_proba)
    index = []
    while len(index) < num:
        max_index = -1
        max_value = -10
        for j in range(0, list_len):
            if (list_proba[j] > max_value) and j not in index and flag[j] == 0:
                max_index = j
                max_value = list_proba[j]
        index.append(max_index)
    return index
```

Dataset: CIFAR-10.
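As an aside, the greedy scan in `select_sort` can be collapsed into a single `argsort` over the pool — a minimal sketch, assuming `flag` is a 0/1 NumPy array (`select_top_entropy` is a hypothetical name, not from the code above):

```python
# Equivalent vectorized selection: already-acquired points are masked
# to -inf, then the num highest-entropy indices are taken in one pass.
def select_top_entropy(entropy, num, flag):
    masked = np.where(np.asarray(flag) == 0, entropy, -np.inf)
    return np.argsort(masked)[-num:][::-1].tolist()
```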

The result: the random method performs better. Why?

mariobecerra commented 6 years ago

Were you able to solve your problem, @houxingxing? I've been having the same issue. I'm trying to replicate their paper on the MNIST dataset, and I find that random acquisition is as good as any of the Bayesian acquisition functions (BALD, max entropy, variation ratios). I can't figure out why.
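For what it's worth, the Bayesian acquisition functions in the paper score the pool with dropout kept active at test time, rather than with a single deterministic softmax pass. A minimal sketch of the max-entropy variant under Monte Carlo dropout (the helper name, `X_pool`, and `n_samples=20` are assumptions, not the repo's code):

```python
import numpy as np
from keras import backend as K

# Sketch: score pool points by the predictive entropy of the
# MC-dropout-averaged class probabilities. Passing learning_phase=1
# keeps dropout active during the forward pass.
def mc_dropout_entropy(model, X_pool, n_samples=20):
    f = K.function([model.input, K.learning_phase()], [model.output])
    probs = np.mean([f([X_pool, 1])[0] for _ in range(n_samples)], axis=0)
    return -np.sum(probs * np.log2(probs + 1e-12), axis=1)
```

The point of the MC averaging is to extract calibrated uncertainty from the dropout network, which a single deterministic forward pass does not provide.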