COCA - Anomaly Detection

marciahon29 commented 8 months ago

Hello,

Could you please let me know how the anomaly threshold is calculated? Your paper says it is predefined, but how is it predefined?

Thanks

ruiking04 commented 8 months ago

You need to search for it within a range yourself. For example, to sweep percentiles of the scores and keep the threshold that gives the best F1, the code is as follows:

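import numpy as np
from sklearn.metrics import f1_score

# Synthetic example data: anomaly scores and binary ground-truth labels.
scores = np.random.rand(1000)
# randint's upper bound is exclusive, so use 2 to get both 0 and 1.
labels = np.random.randint(0, 2, 1000)

def find_best_threshold(scores, labels):
    best_f1 = 0
    best_threshold = 0
    # detect_nu is the percentage of points flagged as anomalous.
    # As written it runs over 0.01, 0.02, ..., 0.99; drop the "/ 1e2"
    # to sweep 1, 2, ..., 99 instead.
    for detect_nu in np.arange(1, 100) / 1e2:
        threshold = np.percentile(scores, 100 - detect_nu)
        predicted_labels = (scores > threshold).astype(int)
        f1 = f1_score(labels, predicted_labels)

        # Keep the threshold with the highest F1 seen so far.
        if f1 > best_f1:
            best_f1 = f1
            best_threshold = threshold

    return best_threshold, best_f1

best_threshold, best_f1 = find_best_threshold(scores, labels)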
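
When no labels are available for such a search, one possible reading of "predefined" is to fix the expected anomaly percentage in advance and take the matching percentile of the scores. The sketch below assumes exactly that; the helper name threshold_from_ratio and its parameter nu are hypothetical and are not taken from the COCA code.

```python
import numpy as np

def threshold_from_ratio(scores, nu):
    """Return the score above which roughly nu percent of the points lie.

    Hypothetical helper: nu is an assumed anomaly percentage in (0, 100).
    """
    if not 0 < nu < 100:
        raise ValueError("nu must be in (0, 100)")
    return np.percentile(scores, 100 - nu)

# Synthetic anomaly scores, seeded so the example is reproducible.
rng = np.random.default_rng(0)
scores = rng.random(1000)

# Assume about 5% of the points are anomalous.
threshold = threshold_from_ratio(scores, nu=5)
predicted = (scores > threshold).astype(int)
print(threshold, predicted.sum())
```

Whether a fixed percentage is appropriate depends on the dataset; the F1 search above needs labels, while this variant needs only an assumed anomaly rate.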

marciahon29 commented 8 months ago

I didn't do anything to find a threshold; I just used your code as is. Is this okay?

Also, which file has this code?


marciahon29 commented 8 months ago

Thank you. I have found the file that specifies this: it is the function ad_predict in anomaly_predict.py.

ruiking04 / COCA

COCA - Anomaly Detection #29