What is the representation of oversample_dict and undersample_dict?

manwu1994 commented 1 year ago

Hello. Thank you so much for your interesting work and for sharing it. In the step that splits the data into train, valid and test sets and saves them to MongoDB, we noticed entries such as "'infiltration': 10". What does this value represent?
Is it a label mapping?

""" final_data(oversample_dict={'infiltration': 10, 'bruteforce-web': 10, 'bruteforce-xss': 20, 'sql-injection': 40}, undersample_dict={'ddos-hoic': 4}, new_db='mixed_613', raw_db='PacketInString') """

Thank you so much in advance for your answer.

sspku-2021 commented 1 year ago

Upsampling ratio
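
Read this way, each value is a per-class resampling factor rather than a label id. Below is a minimal sketch of that reading. It assumes that a factor r in oversample_dict means every sample of that class is kept r times, and that a factor r in undersample_dict means one sample in every r is kept. The `resample` helper and the `(label, payload)` record layout are hypothetical and are not taken from the PBCNN code.

```python
from collections import Counter


def resample(records, oversample_dict, undersample_dict):
    """Hypothetical helper: repeat or thin out (label, payload) records per class."""
    seen = Counter()  # how many records of each label we have visited so far
    out = []
    for label, payload in records:
        if label in oversample_dict:
            # Factor r: emit r copies of every sample of this class.
            out.extend([(label, payload)] * oversample_dict[label])
        elif label in undersample_dict:
            # Factor r: keep the 1st, (r+1)th, (2r+1)th ... sample of this class.
            if seen[label] % undersample_dict[label] == 0:
                out.append((label, payload))
        else:
            # Classes not named in either dict pass through unchanged.
            out.append((label, payload))
        seen[label] += 1
    return out


records = (
    [("infiltration", i) for i in range(3)]
    + [("ddos-hoic", i) for i in range(8)]
    + [("benign", i) for i in range(5)]
)
resampled = resample(
    records,
    oversample_dict={"infiltration": 10},
    undersample_dict={"ddos-hoic": 4},
)
counts = Counter(label for label, _ in resampled)
```

Under these assumptions, 3 infiltration samples become 30, 8 ddos-hoic samples become 2, and the benign samples are left as they are. The actual final_data may duplicate or drop samples differently (for example by random sampling).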

(Sent by email in reply to Wu's message of Fri, Oct 28, 2022, 3:47 PM, on sspku-2021/PBCNN issue #4.)


manwu1994 commented 1 year ago

Thank you for getting back to me. May I ask another question? In tools2:

for path in pcap_path:
    with PcapReader(path) as pr:
        for pkt in pr:
            # if inet.IP not in pkt.layers():
            if not pkt.haslayer('IP'):
                continue
            bid = get_biflow_id(pkt)
            if isinstance(bid, Exception):
                pro_wr += 1
                continue
            if cnt % 100 == 0:
                fw.flush()
            biflow_id, protocol = bid
            pkt_str = pkt_to_str(pkt)

            if biflow_id in flows_maps:
                cur_biflow = flows_maps[biflow_id]
                last_seen_time = cur_biflow['last_seen_time']

Here, is flows_maps an empty dict at the start? If so, the code will never execute this branch:

if biflow_id in flows_maps:
    cur_biflow = flows_maps[biflow_id]
    last_seen_time = cur_biflow['last_seen_time']

I look forward to your discussion. Thank you so much in advance for your answer.
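
To make the question concrete, here is a minimal sketch of how a dict that starts empty could still reach that branch. It assumes there is an else branch, not shown in the excerpt above, that registers a new flow the first time its id is seen. The `group_packets` function, the tuple layout of the packets and the 'packets' key are hypothetical stand-ins, not code from tools2.

```python
def group_packets(packets):
    """Hypothetical stand-in for the loop in tools2, using (biflow_id, ts, pkt_str) tuples."""
    flows_maps = {}  # empty before the first packet is read
    hits = 0         # how many times the `if biflow_id in flows_maps` branch ran
    for biflow_id, ts, pkt_str in packets:
        if biflow_id in flows_maps:
            # Reached only from the second packet of a flow onwards.
            cur_biflow = flows_maps[biflow_id]
            cur_biflow['last_seen_time'] = ts
            cur_biflow['packets'].append(pkt_str)
            hits += 1
        else:
            # Assumed branch: the first packet of a flow creates its entry.
            flows_maps[biflow_id] = {'last_seen_time': ts, 'packets': [pkt_str]}
    return flows_maps, hits


packets = [('a', 1.0, 'p1'), ('b', 1.5, 'p2'), ('a', 2.0, 'p3')]
flows, hits = group_packets(packets)
```

If the real code works like this, the branch is skipped for the first packet of every flow and taken for each later packet with the same biflow_id.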
