mercedes-benz / automotive_feature_engineering

MIT License
4 stars 4 forks

#ERROR#! No important features could be found! #6

Open ibrahim-string opened 1 month ago

ibrahim-string commented 1 month ago

Hi, I am trying to run this piece of code:

# Import function
from automotive_feature_engineering import static

# Execute the static method
results = static(df_train, df_test, model = 'etree', target_names_list=['target'])

Here are the logs:

---------------------------------------------------------ERROR LOG------------------------------------------------------------------------

2024-09-26 21:00:54,339 - automotive_feature_engineering.main_feature_engineering - INFO - Alt Doku Path: c:\Users\win10\OneDrive\Desktop\OpenSource\Automotive feauture engineering\os1\lib\site-packages
2024-09-26 21:00:54,342 - automotive_feature_engineering.main_feature_engineering - INFO - Feature Engineering shall only be performed on the training data. The test data will be transformed accordingly afterwards.
2024-09-26 21:00:57,545 - automotive_feature_engineering.main_feature_engineering - INFO - Total Time taken filter_unique_values: 2.3453756000089925 seconds
2024-09-26 21:01:41,634 - automotive_feature_engineering.main_feature_engineering - INFO - Total Time taken fill_nan: 44.05637849999766 seconds
2024-09-26 21:01:42,718 - automotive_feature_engineering.main_feature_engineering - INFO - Total Time taken filter_unique_values: 1.0835140999988653 seconds
2024-09-26 21:01:42,719 - automotive_feature_engineering.main_feature_engineering - INFO - Starting one-hot encoding...
2024-09-26 21:01:42,838 - automotive_feature_engineering.main_feature_engineering - INFO - Done fitting one-hot encoding
2024-09-26 21:01:46,398 - automotive_feature_engineering.main_feature_engineering - INFO - Debug: After transforming df with one-hot encoding
2024-09-26 21:01:46,454 - automotive_feature_engineering.main_feature_engineering - INFO - transform_one_hot_encodings: df: (1701260, 31) time taken 3.7335982999939006
2024-09-26 21:01:46,460 - automotive_feature_engineering.main_feature_engineering - INFO - Feature-Importance-Filter-0.0009999
---Calculating global Feature Importances---
Default path: C:/Users/win10/OneDrive/Desktop/OpenSource/Automotive feauture engineering/os1/Lib/site-packages/automotive_feature_engineering/reinforcement_learning/rl_randomforest_defaults.json
{'n_estimators': 100, 'criterion': 'squared_error', 'max_depth': None, 'min_samples_split': 2, 'min_samples_leaf': 1, 'min_weight_fraction_leaf': 0.0, 'max_features': 'sqrt', 'max_leaf_nodes': 30, 'min_impurity_decrease': 0.0, 'bootstrap': True, 'n_jobs': -1, 'random_state': 42, 'warm_start': False, 'ccp_alpha': 0.0}
c:\Users\win10\OneDrive\Desktop\OpenSource\Automotive feauture engineering\os1\lib\site-packages\automotive_feature_engineering\feature_selection.py:429: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples,), for example using ravel().
  regr.fit(feature_df, target_df)
2024-09-26 21:01:52,819 - automotive_feature_engineering.main_feature_engineering - INFO - Dropping 31 columns
---Global Feature Importance calculated for RandomForestRegressor---
---Global Feature Importance calculated---
Features and their corresponding importances: 
Number of features in data set: 31
All unimportant features dropped.
Number of features in data set: 0
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[20], line 5
      2 from automotive_feature_engineering import static
      4 # Execute the static method
----> 5 results = static(df_train, df_test, model = 'etree', target_names_list=['target'])

File c:\Users\win10\OneDrive\Desktop\OpenSource\Automotive feauture engineering\os1\lib\site-packages\automotive_feature_engineering\__init__.py:14, in static(df_train, df_test, model, target_names_list, **kwargs)
     13 def static(df_train, df_test, model, target_names_list, **kwargs):
---> 14     return run_main(
     15         df_train, df_test, model, target_names_list, method_list=None, **kwargs
     16     )

File c:\Users\win10\OneDrive\Desktop\OpenSource\Automotive feauture engineering\os1\lib\site-packages\automotive_feature_engineering\__init__.py:10, in run_main(df_train, df_test, model, target_names_list, method_list, **kwargs)
      8 def run_main(df_train, df_test, model, target_names_list, method_list, **kwargs):
      9     feature = FeatureEngineering(df_train, df_test, model, target_names_list, **kwargs)
---> 10     return feature.main(method_list)

File c:\Users\win10\OneDrive\Desktop\OpenSource\Automotive feauture engineering\os1\lib\site-packages\automotive_feature_engineering\main_feature_engineering.py:192, in FeatureEngineering.main(self, method_list)
    190 if number in function_dict:
    191     function, parameters = function_dict[number]
--> 192     function(*parameters)
    193 else:
    194     logger.warning(f"No function found for number {number}")
...
    574 if len(df.columns) == 0:
--> 575     raise ValueError("#ERROR#! No important features could be found!")
    576 return df

ValueError: #ERROR#! No important features could be found!

---------------------------------------------------------ERROR LOG------------------------------------------------------------------------

I am using a cleaned dataset with no NaN values.

This error occurs with every dataset I have tried.
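To check the importances outside the package, here is a minimal sketch using scikit-learn directly. It is not the package's own code: the helper name `count_important_features` is hypothetical, the data is synthetic, only some of the forest parameters from the log above are reused, and the `importance > threshold` comparison is an assumption about how the 0.0009999 filter works. It shows one situation in which every importance comes out as zero, namely a target that never varies, and it passes `y` through `ravel()` so the DataConversionWarning from the log does not appear.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Threshold copied from the "Feature-Importance-Filter-0.0009999" log line.
THRESHOLD = 0.0009999


def count_important_features(X, y, threshold=THRESHOLD):
    """Fit a random forest and count features whose importance exceeds threshold.

    Hypothetical helper for illustration only; the strict ">" comparison is an
    assumption, not the package's confirmed behaviour.
    """
    regr = RandomForestRegressor(
        n_estimators=100,
        max_features="sqrt",
        max_leaf_nodes=30,
        random_state=42,
    )
    # ravel() reshapes a column vector (n_samples, 1) to (n_samples,).
    regr.fit(X, np.asarray(y).ravel())
    importances = regr.feature_importances_
    return int((importances > threshold).sum()), importances


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))

# Target that depends on the features: at least one importance is non-zero.
y_informative = (2.0 * X[:, 0] + X[:, 1]).reshape(-1, 1)
kept_informative, _ = count_important_features(X, y_informative)

# Constant target: no tree can split, so every importance is 0.0
# and nothing passes the threshold.
y_constant = np.ones((500, 1))
kept_constant, _ = count_important_features(X, y_constant)

print(kept_informative, kept_constant)
```

If the same check on `df_train` also keeps zero features, the target column passed via `target_names_list` would be the first thing to inspect, for example whether it holds a single value after preprocessing.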

For reference, here is the dataset I am using: Dataset

Here is a little about the dataset: it simulates in-vehicle communication on a Controller Area Network (CAN) bus and contains both normal and attack data. It includes four main categories:

- Attack-Free State: Normal CAN messages representing typical in-vehicle communication.
- DoS Attack: Injection of high-frequency CAN ID 0x000 messages to disrupt communication.
- Fuzzy Attack: Injection of random, spoofed CAN IDs and data values.
- Impersonation Attack: Injection of messages impersonating a legitimate node, specifically with CAN ID 0x164.

The dataset has been preprocessed into four CSV files, each representing one of the attack types or normal data, and includes balanced (SMOTE oversampled) and unbalanced versions. The target labels indicate the different attack types, and the dataset is designed for intrusion detection in CAN networks.

ibrahim-string commented 1 month ago

Hi @mueller-mb @jonaswa11, I am sure you're busy with your work. I just wanted to know if I can assign this issue to myself?