ahlashkari / NTLFlowLyzer

GNU General Public License v3.0

"features_ignore_list" is not working #18

Closed c0depirate69 closed 6 months ago

c0depirate69 commented 6 months ago

This is my config file { "pcap_file_address": "/home/kali/Desktop/Project/out.pcap", "output_file_address": "/home/kali/Desktop/Project/output.csv", "feature_extractor_min_flows": 2500, "writer_min_rows": 1000, "read_packets_count_value_log_info": 1000000, "check_flows_ending_min_flows": 20000, "capturer_updating_flows_min_value": 5000, "max_flow_duration": 120000, "activity_timeout": 300, "floating_point_unit": ".4f", "max_rows_number": 800000, "features_ignore_list": ["flow_id", "src_ip", "src_port", "dst_ip", "protocol", "timestamp", "PacketsCount", "TotalPayloadBytes", "PayloadBytesMedian", "PayloadBytesSkewness", "PayloadBytesCov", "PayloadBytesMode", "FwdPayloadBytesVariance", "FwdPayloadBytesMedian", "FwdPayloadBytesSkewness", "FwdPayloadBytesCov", "FwdPayloadBytesMode", "BwdPayloadBytesVariance", "BwdPayloadBytesMedian", "BwdPayloadBytesSkewness", "BwdPayloadBytesCov", "BwdPayloadBytesMode", "TotalHeaderBytes", "MaxHeaderBytes", "MinHeaderBytes", "MeanHeaderBytes", "StdHeaderBytes", "MedianHeaderBytes", "SkewnessHeaderBytes", "CoVHeaderBytes", "ModeHeaderBytes", "VarianceHeaderBytes", "FwdMaxHeaderBytes", "FwdMinHeaderBytes", "FwdStdHeaderBytes", "FwdMedianHeaderBytes", "FwdSkewnessHeaderBytes", "FwdCoVHeaderBytes", "FwdModeHeaderBytes", "FwdVarianceHeaderBytes", "BwdMaxHeaderBytes", "BwdMinHeaderBytes", "BwdMeanHeaderBytes", "BwdStdHeaderBytes", "BwdMedianHeaderBytes", "BwdSkewnessHeaderBytes", "BwdCoVHeaderBytes", "BwdModeHeaderBytes", "BwdVarianceHeaderBytes", "FwdSegmentSizeMax", "FwdSegmentSizeStd", "FwdSegmentSizeVariance", "FwdSegmentSizeMedian", "FwdSegmentSizeSkewness", "FwdSegmentSizeCov", "FwdSegmentSizeMode", "BwdSegmentSizeMax", "BwdSegmentSizeMin", "BwdSegmentSizeStd", "BwdSegmentSizeVariance", "BwdSegmentSizeMedian", "BwdSegmentSizeSkewness", "BwdSegmentSizeCov", "BwdSegmentSizeMode", "SegmentSizeMean", "SegmentSizeMax", "SegmentSizeMin", "SegmentSizeStd", "SegmentSizeVariance", "SegmentSizeMedian", "SegmentSizeSkewness", "SegmentSizeCov", 
"SegmentSizeMode", "FwdInitWinBytes", "BwdInitWinBytes", "ActiveMin", "ActiveMax", "ActiveMean", "ActiveStd", "ActiveMedian", "ActiveSkewness", "ActiveCoV", "ActiveMode", "ActiveVariance", "IdleMin", "IdleMax", "IdleMean", "IdleStd", "IdleMedian", "IdleSkewness", "IdleCoV", "IdleMode", "IdleVariance", "FwdBytesRate", "BwdBytesRate", "BwdPacketsRate", "FwdBulkStateCount", "FwdBulkSizeTotal", "FwdBulkPacketCount", "FwdBulkDuration", "BwdBulkStateCount", "BwdBulkSizeTotal", "BwdBulkPacketCount", "BwdBulkDuration", "FwdFINFlagCounts", "FwdECEFlagCounts", "FwdSYNFlagCounts", "FwdACKFlagCounts", "FwdCWRFlagCounts", "FwdRSTFlagCounts", "BwdFINFlagCounts", "BwdPSHFlagCounts", "BwdURGFlagCounts", "BwdECEFlagCounts", "BwdSYNFlagCounts", "BwdACKFlagCounts", "BwdCWRFlagCounts", "BwdRSTFlagCounts", "FINFlagPercentageInTotal", "PSHFlagPercentageInTotal", "URGFlagPercentageInTotal", "ECEFlagPercentageInTotal", "SYNFlagPercentageInTotal", "ACKFlagPercentageInTotal", "CWRFlagPercentageInTotal", "RSTFlagPercentageInTotal", "FwdFINFlagPercentageInTotal", "FwdPSHFlagPercentageInTotal", "FwdURGFlagPercentageInTotal", "FwdECEFlagPercentageInTotal", "FwdSYNFlagPercentageInTotal", "FwdACKFlagPercentageInTotal", "FwdCWRFlagPercentageInTotal", "FwdRSTFlagPercentageInTotal", "BwdFINFlagPercentageInTotal", "BwdPSHFlagPercentageInTotal", "BwdURGFlagPercentageInTotal", "BwdECEFlagPercentageInTotal", "BwdSYNFlagPercentageInTotal", "BwdACKFlagPercentageInTotal", "BwdCWRFlagPercentageInTotal", "BwdRSTFlagPercentageInTotal", "FwdFINFlagPercentageInFwdPackets", "FwdPSHFlagPercentageInFwdPackets", "FwdURGFlagPercentageInFwdPackets", "FwdECEFlagPercentageInFwdPackets", "FwdSYNFlagPercentageInFwdPackets", "FwdACKFlagPercentageInFwdPackets", "FwdCWRFlagPercentageInFwdPackets", "FwdRSTFlagPercentageInFwdPackets", "BwdFINFlagPercentageInBwdPackets", "BwdPSHFlagPercentageInBwdPackets", "BwdURGFlagPercentageInBwdPackets", "BwdECEFlagPercentageInBwdPackets", "BwdSYNFlagPercentageInBwdPackets", 
"BwdACKFlagPercentageInBwdPackets", "BwdCWRFlagPercentageInBwdPackets", "BwdRSTFlagPercentageInBwdPackets", "PacketsIATMedian", "PacketsIATSkewness", "PacketsIATCoV", "PacketsIATMode", "PacketsIATVariance", "FwdPacketsIATMedian", "FwdPacketsIATSkewness", "FwdPacketsIATCoV", "FwdPacketsIATMode", "FwdPacketsIATVariance", "BwdPacketsIATMedian", "BwdPacketsIATSkewness", "BwdPacketsIATCoV", "BwdPacketsIATMode", "BwdPacketsIATVariance", "DeltaStart", "HandshakeDuration", "HandshakeState", "PacketsDeltaTimeMin", "PacketsDeltaTimeMax", "PacketsDeltaTimeMean", "PacketsDeltaTimeMode", "PacketsDeltaTimeVariance", "PacketsDeltaTimeStd", "PacketsDeltaTimeMedian", "PacketsDeltaTimeSkewness", "PacketsDeltaTimeCoV", "BwdPacketsDeltaTimeMin", "BwdPacketsDeltaTimeMax", "BwdPacketsDeltaTimeMean", "BwdPacketsDeltaTimeMode", "BwdPacketsDeltaTimeVariance", "BwdPacketsDeltaTimeStd", "BwdPacketsDeltaTimeMedian", "BwdPacketsDeltaTimeSkewness", "BwdPacketsDeltaTimeCoV", "FwdPacketsDeltaTimeMin", "FwdPacketsDeltaTimeMax", "FwdPacketsDeltaTimeMean", "FwdPacketsDeltaTimeMode", "FwdPacketsDeltaTimeVariance", "FwdPacketsDeltaTimeStd", "FwdPacketsDeltaTimeMedian", "FwdPacketsDeltaTimeSkewness", "FwdPacketsDeltaTimeCoV", "PacketsDeltaLenMin", "PacketsDeltaLenMax", "PacketsDeltaLenMean", "PacketsDeltaLenMode", "PacketsDeltaLenVariance", "PacketsDeltaLenStd", "PacketsDeltaLenMedian", "PacketsDeltaLenSkewness", "PacketsDeltaLenCoV", "BwdPacketsDeltaLenMin", "BwdPacketsDeltaLenMax", "BwdPacketsDeltaLenMean", "BwdPacketsDeltaLenMode", "BwdPacketsDeltaLenVariance", "BwdPacketsDeltaLenStd", "BwdPacketsDeltaLenMedian", "BwdPacketsDeltaLenSkewness", "BwdPacketsDeltaLenCoV", "FwdPacketsDeltaLenMin", "FwdPacketsDeltaLenMax", "FwdPacketsDeltaLenMean", "FwdPacketsDeltaLenMode", "FwdPacketsDeltaLenVariance", "FwdPacketsDeltaLenStd", "FwdPacketsDeltaLenMedian", "FwdPacketsDeltaLenSkewness", "FwdPacketsDeltaLenCoV", "HeaderBytesDeltaLenMin", "HeaderBytesDeltaLenMax", "HeaderBytesDeltaLenMean", 
"HeaderBytesDeltaLenMode", "HeaderBytesDeltaLenVariance", "HeaderBytesDeltaLenStd", "HeaderBytesDeltaLenMedian", "HeaderBytesDeltaLenSkewness", "HeaderBytesDeltaLenCoV", "BwdHeaderBytesDeltaLenMin", "BwdHeaderBytesDeltaLenMax", "BwdHeaderBytesDeltaLenMean", "BwdHeaderBytesDeltaLenMode", "BwdHeaderBytesDeltaLenVariance", "BwdHeaderBytesDeltaLenStd", "BwdHeaderBytesDeltaLenMedian", "BwdHeaderBytesDeltaLenSkewness", "BwdHeaderBytesDeltaLenCoV", "FwdHeaderBytesDeltaLenMin", "FwdHeaderBytesDeltaLenMax", "FwdHeaderBytesDeltaLenMean", "FwdHeaderBytesDeltaLenMode", "FwdHeaderBytesDeltaLenVariance", "FwdHeaderBytesDeltaLenStd", "FwdHeaderBytesDeltaLenMedian", "FwdHeaderBytesDeltaLenSkewness", "FwdHeaderBytesDeltaLenCoV", "PayloadBytesDeltaLenMin", "PayloadBytesDeltaLenMax", "PayloadBytesDeltaLenMean", "PayloadBytesDeltaLenMode", "PayloadBytesDeltaLenVariance", "PayloadBytesDeltaLenStd", "PayloadBytesDeltaLenMedian", "PayloadBytesDeltaLenSkewness", "PayloadBytesDeltaLenCoV", "BwdPayloadBytesDeltaLenMin", "BwdPayloadBytesDeltaLenMax", "BwdPayloadBytesDeltaLenMean", "BwdPayloadBytesDeltaLenMode", "BwdPayloadBytesDeltaLenVariance", "BwdPayloadBytesDeltaLenStd", "BwdPayloadBytesDeltaLenMedian", "BwdPayloadBytesDeltaLenSkewness", "BwdPayloadBytesDeltaLenCoV", "FwdPayloadBytesDeltaLenMin", "FwdPayloadBytesDeltaLenMax", "FwdPayloadBytesDeltaLenMean", "FwdPayloadBytesDeltaLenMode", "FwdPayloadBytesDeltaLenVariance", "FwdPayloadBytesDeltaLenStd", "FwdPayloadBytesDeltaLenMedian", "FwdPayloadBytesDeltaLenSkewness", "FwdPayloadBytesDeltaLenCoV"] }

But when I run sudo ntlflowlyzer -c config, I still get all these features in the CSV file:

image

moein-shafi commented 6 months ago

Hi @c0depirate69,

Thank you for reaching out and for your engagement with NTLFlowLyzer.

Upon reviewing the configuration file you referenced, it appears there is an issue with the format for the features ignore list. In this list, you should include the feature names rather than the feature class names. For example, instead of using PacketsCount or TotalPayloadBytes, you should use packets_count or total_payload_bytes. For flow_id, src_ip, src_port, dst_ip, and protocol, your entries are correct.

To find the correct feature names, you can check the related class for that feature (or simply convert the class name from CamelCase to snake_case). For example, for the TotalPayloadBytes feature, you can see the feature name in the following screenshot:

image
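The CamelCase-to-snake_case conversion mentioned above could be sketched roughly as follows. This is only an illustration: camel_to_snake is a hypothetical helper and not part of NTLFlowLyzer, and its output is a first guess that should still be checked against the name value in each feature class.

```python
import re


def camel_to_snake(class_name: str) -> str:
    """Convert a CamelCase class name to snake_case (hypothetical helper)."""
    # Split before a capitalized word that follows any character,
    # e.g. "IATMode" -> "IAT_Mode".
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", class_name)
    # Split between a lowercase letter or digit and an uppercase letter,
    # e.g. "PayloadBytes" -> "Payload_Bytes".
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


for class_name in ("PacketsCount", "TotalPayloadBytes"):
    print(class_name, "->", camel_to_snake(class_name))
```

Note that a conversion like this lowercases acronyms and keeps the word order of the class name, so it can only suggest a candidate name.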

I hope this resolves the issue. Please do not hesitate to reach out if you encounter any further issues or have any other questions. We greatly appreciate your interest in NTLFlowLyzer and are here to assist you.

c0depirate69 commented 6 months ago

I used a script that converts CamelCase to snake_case, and here is my new config:

{ "pcap_file_address": "/home/kali/Desktop/Project/out.pcap", "output_file_address": "/home/kali/Desktop/Project/output.csv", "feature_extractor_min_flows": 2500, "writer_min_rows": 1000, "read_packets_count_value_log_info": 1000000, "check_flows_ending_min_flows": 20000, "capturer_updating_flows_min_value": 5000, "max_flow_duration": 120000, "activity_timeout": 300, "floating_point_unit": ".4f", "max_rows_number": 800000, "features_ignore_list": ["flow_id", "src_ip", "src_port", "dst_ip", "protocol", "timestamp", "packets_count", "total_payload_bytes", "payload_bytes_median", "payload_bytes_skewness", "payload_bytes_cov", "payload_bytes_mode", "fwd_payload_bytes_variance", "fwd_payload_bytes_median", "fwd_payload_bytes_skewness", "fwd_payload_bytes_cov", "fwd_payload_bytes_mode", "bwd_payload_bytes_variance", "bwd_payload_bytes_median", "bwd_payload_bytes_skewness", "bwd_payload_bytes_cov", "bwd_payload_bytes_mode", "total_header_bytes", "max_header_bytes", "min_header_bytes", "mean_header_bytes", "std_header_bytes", "median_header_bytes", "skewness_header_bytes", "cov_header_bytes", "mode_header_bytes", "variance_header_bytes", "fwd_max_header_bytes", "fwd_min_header_bytes", "fwd_std_header_bytes", "fwd_median_header_bytes", "fwd_skewness_header_bytes", "fwd_cov_header_bytes", "fwd_mode_header_bytes", "fwd_variance_header_bytes", "bwd_max_header_bytes", "bwd_min_header_bytes", "bwd_mean_header_bytes", "bwd_std_header_bytes", "bwd_median_header_bytes", "bwd_skewness_header_bytes", "bwd_cov_header_bytes", "bwd_mode_header_bytes", "bwd_variance_header_bytes", "fwd_segment_size_max", "fwd_segment_size_std", "fwd_segment_size_variance", "fwd_segment_size_median", "fwd_segment_size_skewness", "fwd_segment_size_cov", "fwd_segment_size_mode", "bwd_segment_size_max", "bwd_segment_size_min", "bwd_segment_size_std", "bwd_segment_size_variance", "bwd_segment_size_median", "bwd_segment_size_skewness", "bwd_segment_size_cov", "bwd_segment_size_mode", "segment_size_mean", 
"segment_size_max", "segment_size_min", "segment_size_std", "segment_size_variance", "segment_size_median", "segment_size_skewness", "segment_size_cov", "segment_size_mode", "fwd_init_win_bytes", "bwd_init_win_bytes", "active_min", "active_max", "active_mean", "active_std", "active_median", "active_skewness", "active_cov", "active_mode", "active_variance", "idle_min", "idle_max", "idle_mean", "idle_std", "idle_median", "idle_skewness", "idle_cov", "idle_mode", "idle_variance", "fwd_bytes_rate", "bwd_bytes_rate", "bwd_packets_rate", "fwd_bulk_state_count", "fwd_bulk_size_total", "fwd_bulk_packet_count", "fwd_bulk_duration", "bwd_bulk_state_count", "bwd_bulk_size_total", "bwd_bulk_packet_count", "bwd_bulk_duration", "fwd_fin_flag_counts", "fwd_ece_flag_counts", "fwd_syn_flag_counts", "fwd_ack_flag_counts", "fwd_cwr_flag_counts", "fwd_rst_flag_counts", "bwd_fin_flag_counts", "bwd_psh_flag_counts", "bwd_urg_flag_counts", "bwd_ece_flag_counts", "bwd_syn_flag_counts", "bwd_ack_flag_counts", "bwd_cwr_flag_counts", "bwd_rst_flag_counts", "fin_flag_percentage_in_total", "psh_flag_percentage_in_total", "urg_flag_percentage_in_total", "ece_flag_percentage_in_total", "syn_flag_percentage_in_total", "ack_flag_percentage_in_total", "cwr_flag_percentage_in_total", "rst_flag_percentage_in_total", "fwd_fin_flag_percentage_in_total", "fwd_psh_flag_percentage_in_total", "fwd_urg_flag_percentage_in_total", "fwd_ece_flag_percentage_in_total", "fwd_syn_flag_percentage_in_total", "fwd_ack_flag_percentage_in_total", "fwd_cwr_flag_percentage_in_total", "fwd_rst_flag_percentage_in_total", "bwd_fin_flag_percentage_in_total", "bwd_psh_flag_percentage_in_total", "bwd_urg_flag_percentage_in_total", "bwd_ece_flag_percentage_in_total", "bwd_syn_flag_percentage_in_total", "bwd_ack_flag_percentage_in_total", "bwd_cwr_flag_percentage_in_total", "bwd_rst_flag_percentage_in_total", "fwd_fin_flag_percentage_in_fwd_packets", "fwd_psh_flag_percentage_in_fwd_packets", 
"fwd_urg_flag_percentage_in_fwd_packets", "fwd_ece_flag_percentage_in_fwd_packets", "fwd_syn_flag_percentage_in_fwd_packets", "fwd_ack_flag_percentage_in_fwd_packets", "fwd_cwr_flag_percentage_in_fwd_packets", "fwd_rst_flag_percentage_in_fwd_packets", "bwd_fin_flag_percentage_in_bwd_packets", "bwd_psh_flag_percentage_in_bwd_packets", "bwd_urg_flag_percentage_in_bwd_packets", "bwd_ece_flag_percentage_in_bwd_packets", "bwd_syn_flag_percentage_in_bwd_packets", "bwd_ack_flag_percentage_in_bwd_packets", "bwd_cwr_flag_percentage_in_bwd_packets", "bwd_rst_flag_percentage_in_bwd_packets", "packets_iat_median", "packets_iat_skewness", "packets_iat_cov", "packets_iat_mode", "packets_iat_variance", "fwd_packets_iat_median", "fwd_packets_iat_skewness", "fwd_packets_iat_cov", "fwd_packets_iat_mode", "fwd_packets_iat_variance", "bwd_packets_iat_median", "bwd_packets_iat_skewness", "bwd_packets_iat_cov", "bwd_packets_iat_mode", "bwd_packets_iat_variance", "delta_start", "handshake_duration", "handshake_state", "packets_delta_time_min", "packets_delta_time_max", "packets_delta_time_mean", "packets_delta_time_mode", "packets_delta_time_variance", "packets_delta_time_std", "packets_delta_time_median", "packets_delta_time_skewness", "packets_delta_time_cov", "bwd_packets_delta_time_min", "bwd_packets_delta_time_max", "bwd_packets_delta_time_mean", "bwd_packets_delta_time_mode", "bwd_packets_delta_time_variance", "bwd_packets_delta_time_std", "bwd_packets_delta_time_median", "bwd_packets_delta_time_skewness", "bwd_packets_delta_time_cov", "fwd_packets_delta_time_min", "fwd_packets_delta_time_max", "fwd_packets_delta_time_mean", "fwd_packets_delta_time_mode", "fwd_packets_delta_time_variance", "fwd_packets_delta_time_std", "fwd_packets_delta_time_median", "fwd_packets_delta_time_skewness", "fwd_packets_delta_time_cov", "packets_delta_len_min", "packets_delta_len_max", "packets_delta_len_mean", "packets_delta_len_mode", "packets_delta_len_variance", "packets_delta_len_std", 
"packets_delta_len_median", "packets_delta_len_skewness", "packets_delta_len_cov", "bwd_packets_delta_len_min", "bwd_packets_delta_len_max", "bwd_packets_delta_len_mean", "bwd_packets_delta_len_mode", "bwd_packets_delta_len_variance", "bwd_packets_delta_len_std", "bwd_packets_delta_len_median", "bwd_packets_delta_len_skewness", "bwd_packets_delta_len_cov", "fwd_packets_delta_len_min", "fwd_packets_delta_len_max", "fwd_packets_delta_len_mean", "fwd_packets_delta_len_mode", "fwd_packets_delta_len_variance", "fwd_packets_delta_len_std", "fwd_packets_delta_len_median", "fwd_packets_delta_len_skewness", "fwd_packets_delta_len_cov", "header_bytes_delta_len_min", "header_bytes_delta_len_max", "header_bytes_delta_len_mean", "header_bytes_delta_len_mode", "header_bytes_delta_len_variance", "header_bytes_delta_len_std", "header_bytes_delta_len_median", "header_bytes_delta_len_skewness", "header_bytes_delta_len_cov", "bwd_header_bytes_delta_len_min", "bwd_header_bytes_delta_len_max", "bwd_header_bytes_delta_len_mean", "bwd_header_bytes_delta_len_mode", "bwd_header_bytes_delta_len_variance", "bwd_header_bytes_delta_len_std", "bwd_header_bytes_delta_len_median", "bwd_header_bytes_delta_len_skewness", "bwd_header_bytes_delta_len_cov", "fwd_header_bytes_delta_len_min", "fwd_header_bytes_delta_len_max", "fwd_header_bytes_delta_len_mean", "fwd_header_bytes_delta_len_mode", "fwd_header_bytes_delta_len_variance", "fwd_header_bytes_delta_len_std", "fwd_header_bytes_delta_len_median", "fwd_header_bytes_delta_len_skewness", "fwd_header_bytes_delta_len_cov", "payload_bytes_delta_len_min", "payload_bytes_delta_len_max", "payload_bytes_delta_len_mean", "payload_bytes_delta_len_mode", "payload_bytes_delta_len_variance", "payload_bytes_delta_len_std", "payload_bytes_delta_len_median", "payload_bytes_delta_len_skewness", "payload_bytes_delta_len_cov", "bwd_payload_bytes_delta_len_min", "bwd_payload_bytes_delta_len_max", "bwd_payload_bytes_delta_len_mean", "bwd_payload_bytes_delta_len_mode", 
"bwd_payload_bytes_delta_len_variance", "bwd_payload_bytes_delta_len_std", "bwd_payload_bytes_delta_len_median", "bwd_payload_bytes_delta_len_skewness", "bwd_payload_bytes_delta_len_cov", "fwd_payload_bytes_delta_len_min", "fwd_payload_bytes_delta_len_max", "fwd_payload_bytes_delta_len_mean", "fwd_payload_bytes_delta_len_mode", "fwd_payload_bytes_delta_len_variance", "fwd_payload_bytes_delta_len_std", "fwd_payload_bytes_delta_len_median", "fwd_payload_bytes_delta_len_skewness", "fwd_payload_bytes_delta_len_cov"] }

But I still have the same problem. New output:

image

moein-shafi commented 6 months ago

Hi @c0depirate69,

Thank you for reaching out.

I wanted to address two main points:

  1. In this version, it is not possible to ignore the primary tuple fields, which are flow_id, src_ip, src_port, dst_ip, dst_port, timestamp, and protocol (we will update the code to support this). However, you can manually comment out the following lines in the feature_extractor.py file:

image

  2. Additionally, some of the feature names you used are still incorrect. For example, instead of bwd_header_bytes_delta_len_variance, you should use variance_bwd_header_bytes_delta_len (or packets_IAT_mode instead of packets_iat_mode). Please refer to the name variable value in the class definition to get the correct feature names. It appears that not all feature names are simply the snake_case version of their class names (apologies for the oversight in my previous comment).
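One way to catch misspelled entries before a long run might be to compare the ignore list against the column names of an output CSV produced with an empty features_ignore_list. The sketch below assumes that the CSV column headers are exactly the feature names; that is an assumption, not something confirmed here. unknown_ignore_names is a hypothetical helper, and the header line is invented for illustration.

```python
import csv
import io
import json


def unknown_ignore_names(config_text: str, csv_header_line: str) -> list:
    """Return ignore-list entries that match no column in the CSV header."""
    config = json.loads(config_text)
    ignore = config.get("features_ignore_list", [])
    # Parse the header with the csv module so quoted names are handled.
    header = next(csv.reader(io.StringIO(csv_header_line)))
    columns = {column.strip() for column in header}
    return [name for name in ignore if name not in columns]


# Toy data: in practice, read the config file and the first CSV line instead.
config_text = json.dumps({
    "features_ignore_list": [
        "packets_count",
        "packets_iat_mode",                     # reported as incorrect above
        "bwd_header_bytes_delta_len_variance",  # reported as incorrect above
    ]
})
header_line = (
    "flow_id,packets_count,packets_IAT_mode,"
    "variance_bwd_header_bytes_delta_len"
)

for name in unknown_ignore_names(config_text, header_line):
    print("no matching column:", name)
```

Any name reported this way is a candidate for a typo or a wrong word order, and the matching class's name value remains the authoritative source.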

I hope this helps resolve the issues. If you have any further questions or need additional assistance, please feel free to reach out. We appreciate your interest in NTLFlowLyzer and are here to support you.