blab / cartography

Dimensionality reduction distills complex evolutionary relationships in seasonal influenza and SARS-CoV-2
https://doi.org/10.1101/2024.02.07.579374
MIT License
Outlier Analysis Error #7

Closed nandsra21 closed 7 months ago

nandsra21 commented 3 years ago

[image: screenshot of the error]

I'm getting this error even though I have checked multiple times that the code outputs lists of binary targets (0 and 1). I'm not sure what the bug is.

Here's the code:


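import hdbscan
import pandas as pd
from sklearn.metrics import confusion_matrix, matthews_corrcoef

# Cluster the training embedding; HDBSCAN labels noise points as -1.
clusterer = hdbscan.HDBSCAN(cluster_selection_epsilon=distance_threshold)
clusterer.fit(training_embedding)
clusters = clusterer.labels_.astype(str)

# Predicted outlier status: 1 for noise points (label "-1"), 0 otherwise.
val_df = pd.DataFrame(clusters, columns=["clusters"])
val_df["outlier_status_predicted"] = val_df["clusters"].apply(lambda label: 1 if label == "-1" else 0)

# True outlier status: 1 where the clade is "outlier", 0 otherwise.
training_clades = pd.DataFrame(training_clades, columns=["clades"])
training_clades["clades_num"] = training_clades["clades"].apply(lambda label: 1 if label == "outlier" else 0)

# Convert both columns to plain Python lists for the scikit-learn metrics.
clusters = val_df["outlier_status_predicted"].values.tolist()
training_clades = training_clades["clades_num"].values.tolist()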

training_mcc = matthews_corrcoef(
    training_clades,
    clusters
)
results["training_mcc"] = training_mcc

training_confusion_matrix = confusion_matrix(
    training_clades,
    clusters
)
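
One way to narrow this down without depending on scikit-learn is to validate both lists by hand and compute the same quantities directly. The following is a minimal sketch, not the project's code: `check_binary`, `binary_confusion`, and `mcc` are hypothetical helper names, and the example labels are made up for illustration. It assumes the matrix layout `[[TN, FP], [FN, TP]]` (rows are true labels, columns are predictions) and returns an MCC of 0 when the denominator is zero.

```python
import math


def check_binary(name, labels):
    """Raise a descriptive error if any label is not 0 or 1."""
    bad = [(i, value) for i, value in enumerate(labels) if value not in (0, 1)]
    if bad:
        # Show only the first few offenders so the message stays readable.
        raise ValueError(f"{name} has non-binary values (index, value): {bad[:5]}")


def binary_confusion(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for two equal-length binary lists."""
    if len(y_true) != len(y_pred):
        raise ValueError(f"length mismatch: {len(y_true)} vs {len(y_pred)}")
    check_binary("y_true", y_true)
    check_binary("y_pred", y_pred)

    tn = fp = fn = tp = 0
    for truth, predicted in zip(y_true, y_pred):
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 0 and predicted == 0:
            tn += 1
        elif truth == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return [[tn, fp], [fn, tp]]


def mcc(matrix):
    """Matthews correlation coefficient from a 2x2 confusion matrix."""
    (tn, fp), (fn, tp) = matrix
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0 when any marginal count is zero.
    return 0.0 if denominator == 0 else (tp * tn - fp * fn) / denominator


# Made-up example labels, only to exercise the helpers.
example_true = [1, 0, 0, 1, 0]
example_pred = [1, 0, 1, 0, 0]
example_matrix = binary_confusion(example_true, example_pred)
example_mcc = mcc(example_matrix)
```

Passing `training_clades` and `clusters` through `binary_confusion` before calling the scikit-learn metrics would surface a length mismatch or a stray non-binary value (a string, `None`, or NaN) with its index, which is easier to act on than a generic metric error.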