I am using modAL for an active learning project in multi-label classification. My implementation is in PyTorch, and I use DinoV2 as the backbone model.
For the same dataset, I apply both active learning (using minimum confidence and average confidence strategies) and random sampling. I select the same number of samples in both strategies, but the results from random sampling are significantly better than those from the active learning approach. I would like to know if this discrepancy might be due to an issue with my code or the modAL library's handling of multi-label classification. Below is my active learning loop:
for i in range(n_queries):
if i == 12:
n_instances = X_pool.shape[0]
else:
n_instances = batch(int(np.ceil(np.power(10, POWER))), BATCH_SIZE)
print(f"\nQuery {i + 1}: Requesting {n_instances} samples from a pool of size {X_pool.shape[0]}")
if X_pool.shape[0] < n_instances:
print("Not enough samples left in the pool to query the desired number of instances.")
break
query_idx, _ = learner.query(X_pool, n_instances=n_instances)
query_idx = np.unique(query_idx)
if len(query_idx) == 0:
print("No indices were selected, which may indicate an issue with the query function or pool.")
continue
# Add the newly selected samples to the cumulative training set
cumulative_X_train.append(X_pool[query_idx])
cumulative_y_train.append(y_pool[query_idx])
# Concatenate all the samples to form the cumulative training data
X_train_cumulative = np.concatenate(cumulative_X_train, axis=0)
y_train_cumulative = np.concatenate(cumulative_y_train, axis=0)
learner.teach(X_train_cumulative, y_train_cumulative)
# Log the selected sample names
selected_sample_names = train_df.loc[query_idx, "image"].tolist()
print(f"Selected samples in Query {i + 1}: {selected_sample_names}")
with open(samples_log_file, mode='a', newline='') as f:
writer = csv.writer(f)
writer.writerow([i + 1] + selected_sample_names)
# Remove the selected samples from the pool
X_pool = np.delete(X_pool, query_idx, axis=0)
y_pool = np.delete(y_pool, query_idx, axis=0)
# Evaluate the model
y_pred = learner.predict(X_test_np)
accuracy = accuracy_score(y_test_np, y_pred)
f1 = f1_score(y_test_np, y_pred, average='macro')
acc_test_data.append(accuracy)
f1_test_data.append(f1)
print(f"Accuracy after query {i + 1}: {accuracy}")
print(f"F1 Score after query {i + 1}: {f1}")
# Early stopping logic
if f1 > best_f1_score:
best_f1_score = f1
wait = 0
else:
wait += 1
if wait >= patience:
print(f"Stopping early after {i + 1} queries due to no improvement in F1 score.")
break
total_samples += len(query_idx)
print(f"Total samples used for training after query {i + 1}: {total_samples}")
POWER += 0.25
torch.cuda.empty_cache()
I am using modAL for an active learning project in multi-label classification. My implementation is in PyTorch, and I use DinoV2 as the backbone model. For the same dataset, I apply both active learning (using minimum confidence and average confidence strategies) and random sampling. I select the same number of samples in both strategies, but the results from random sampling are significantly better than those from the active learning approach. I would like to know if this discrepancy might be due to an issue with my code or the modAL library's handling of multi-label classification. Below is my active learning loop: