kahramankostas / IoTDevIDv2

A Behavior-Based Device Identification Method for the IoT
MIT License

02.1 Feature importance voting and pre-assessment of features #4

Open KLFTESPACE opened 10 months ago

KLFTESPACE commented 10 months ago

I'm sorry to trouble you, but this part of the code in file 02.1 does not run successfully. Following the solution suggested in the issues, I ran pip install git+https://github.com/kahramankostas/XuniVerse, which reported: Successfully installed contourpy-1.2.0 cycler-0.12.1 fonttools-4.44.3 joblib-1.3.2 kiwisolver-1.4.5 matplotlib-3.8.2 packaging-23.2 patsy-0.5.3 pillow-10.1.0 pyparsing-3.1.1 scikit-learn-1.3.2 scipy-1.11.4 statsmodels-0.14.0 threadpoolctl-3.2.0 xverse-1.0.5. However, it did not take effect.

I hope for an early reply. Thanks! My pandas version is 2.1.3.

AttributeError                            Traceback (most recent call last)
Cell In[22], line 14
     12 clf = VotingSelector()
     13 print(X, y)
---> 14 clf.fit(X, y)
     15 #Selected features
     16 temp="./results/"+i[18:-4]+"FI.csv"

File D:\Python_env\Lib\site-packages\xverse\ensemble_voting.py:224, in VotingSelector.fit(self, X, y)
    222 #start training on the data
    223 temp_X = X[self.use_features]
--> 224 self.feature_importances_, self.feature_votes_ = self.train(temp_X, y)
    226 return self

File D:\Python_env\Lib\site-packages\xverse\ensemble_voting.py:285, in VotingSelector.train(self, X, y)
    283 #handle categorical values with either 'woe' or 'le'
    284 if self.handle_category == 'woe':
--> 285 transformed_X, self.mapping, iv_df = self.woe_information_value(X, y) #woe transformed_X
    286 elif self.handle_category == 'le':
    287 transformed_X = X.copy(deep=True)

File D:\Python_env\Lib\site-packages\xverse\ensemble_voting.py:115, in VotingSelector.woe_information_value(self, X, y)
    112 def woe_information_value(self, X, y):
    114 clf = WOE()
--> 115 clf.fit(X, y)
    117 return clf.transform(X), clf.woe_bins, clf.iv_df

File D:\Python_env\Lib\site-packages\xverse\transformer_woe.py:137, in WOE.fit(self, X, y)
    132 if self.monotonic_binning:
    133 self.mono_bin_clf = MonotonicBinning(feature_names=self.mono_feature_names,
    134 max_bins=self.mono_max_bins, force_bins=self.mono_force_bins,
    135 cardinality_cutoff=self.mono_cardinality_cutoff,
    136 prefix=self.mono_prefix, custom_binning=self.mono_custom_binning)
--> 137 X = self.mono_bin_clf.fit_transform(X, y)
    138 self.mono_custom_binning = self.mono_bin_clf.bins
    140 #identify the variables to tranform and assign the bin mapping dictionary

File D:\Python_env\Lib\site-packages\sklearn\utils\_set_output.py:157, in _wrap_method_output.<locals>.wrapped(self, X, *args, **kwargs)
    155 @wraps(f)
    156 def wrapped(self, X, *args, **kwargs):
--> 157 data_to_wrap = f(self, X, *args, **kwargs)
    158 if isinstance(data_to_wrap, tuple):
    159 # only wrap the first output for cross decomposition
    160 return_tuple = (
    161 _wrap_data_with_container(method, data_to_wrap[0], X, self),
    162 *data_to_wrap[1:],
    163 )

File D:\Python_env\Lib\site-packages\xverse\transformer_binning.py:257, in MonotonicBinning.fit_transform(self, X, y)
    256 def fit_transform(self, X, y):
--> 257 return self.fit(X, y).transform(X)

File D:\Python_env\Lib\site-packages\xverse\transformer_binning.py:122, in MonotonicBinning.fit(self, X, y)
    118 raise ValueError("The input feature(s) should be numeric type. Some of the input features \
    119 has character values in it. Please use a encoder before performing monotonic operations.")
    121 #apply the monotonic train function on dataset
--> 122 fit_X.apply(lambda x: self.train(x, y), axis=0)
    123 return self

File D:\Python_env\Lib\site-packages\pandas\core\frame.py:10034, in DataFrame.apply(self, func, axis, raw, result_type, args, by_row, **kwargs)
  10022 from pandas.core.apply import frame_apply
  10024 op = frame_apply(
  10025 self,
  10026 func=func,
  (...)
  10032 kwargs=kwargs,
  10033 )
> 10034 return op.apply().__finalize__(self, method="apply")

File D:\Python_env\Lib\site-packages\pandas\core\apply.py:837, in FrameApply.apply(self)
    834 elif self.raw:
    835 return self.apply_raw()
--> 837 return self.apply_standard()

File D:\Python_env\Lib\site-packages\pandas\core\apply.py:963, in FrameApply.apply_standard(self)
    962 def apply_standard(self):
--> 963 results, res_index = self.apply_series_generator()
    965 # wrap results
    966 return self.wrap_results(results, res_index)

File D:\Python_env\Lib\site-packages\pandas\core\apply.py:979, in FrameApply.apply_series_generator(self)
    976 with option_context("mode.chained_assignment", None):
    977 for i, v in enumerate(series_gen):
    978 # ignore SettingWithCopy here in case the user mutates
--> 979 results[i] = self.func(v, *self.args, **self.kwargs)
    980 if isinstance(results[i], ABCSeries):
    981 # If we have a view on v, we need to make a copy because
    982 # series_generator will swap out the underlying data
    983 results[i] = results[i].copy(deep=False)

File D:\Python_env\Lib\site-packages\xverse\transformer_binning.py:122, in MonotonicBinning.fit.<locals>.<lambda>(x)
    118 raise ValueError("The input feature(s) should be numeric type. Some of the input features \
    119 has character values in it. Please use a encoder before performing monotonic operations.")
    121 #apply the monotonic train function on dataset
--> 122 fit_X.apply(lambda x: self.train(x, y), axis=0)
    123 return self

File D:\Python_env\Lib\site-packages\xverse\transformer_binning.py:170, in MonotonicBinning.train(self, X, y)
    165 """
    166 Execute this block when monotonic relationship is not identified by spearman technique.
    167 We still want our code to produce bins.
    168 """
    169 if len(bins_X_grouped) == 1:
--> 170 bins = algos.quantile(X, np.linspace(0, 1, force_bins)) #creates a new binnning based on forced bins
    171 if len(np.unique(bins)) == 2:
    172 bins = np.insert(bins, 0, 1)

AttributeError: module 'pandas.core.algorithms' has no attribute 'quantile'

My Python version is 3.11.0b5.
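The call that fails in the traceback is algos.quantile(X, np.linspace(0, 1, force_bins)), and the installed pandas 2.1.3 has no quantile in pandas.core.algorithms. A minimal sketch of one possible workaround, assuming linearly interpolated quantiles over the non-missing values are an acceptable substitute, is to compute the forced bin edges with NumPy instead. The helper name forced_bin_edges is hypothetical and is not part of xverse.

```python
import numpy as np
import pandas as pd


def forced_bin_edges(values, force_bins):
    """Return `force_bins` quantile-based bin edges for a numeric column.

    Hypothetical stand-in for the failing call
    `algos.quantile(X, np.linspace(0, 1, force_bins))`.
    Assumption: missing values are dropped before computing quantiles.
    """
    clean = pd.Series(values).dropna().to_numpy(dtype=float)
    # np.quantile interpolates linearly between data points by default.
    return np.quantile(clean, np.linspace(0, 1, force_bins))


if __name__ == "__main__":
    column = pd.Series([1, 2, np.nan, 3, 4, 5])
    print(forced_bin_edges(column, 3))
```

Using this would mean editing the installed transformer_binning.py locally. That is only a suggestion and has not been confirmed by the maintainer.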

kahramankostas commented 10 months ago

Thank you very much for your question. I'm sorry you're getting an error. I think this error is caused by updates to Xverse.

Please make sure that no earlier Xverse installation is present, and that you install the IoTDevID version (git+https://github.com/kahramankostas/XuniVerse) when you reinstall this repository.

I recommend removing Xverse from your computer completely and then reinstalling it.
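The pip output above reports xverse-1.0.5, and a version number alone may not show whether the fork or another copy is the one being imported. A rough diagnostic sketch, assuming the package can be located without importing it, is to find the installed file and search it for the call that fails in the traceback. The function name find_in_installed_file is hypothetical; the file name transformer_binning.py and the string algos.quantile are taken from the traceback.

```python
import importlib.util
from pathlib import Path


def find_in_installed_file(package, filename, needle):
    """Return line numbers in <package>/<filename> that contain `needle`.

    Returns None when the package (or the file) cannot be found.
    The package is located through its import spec, so it is not imported.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        path = Path(location) / filename
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            return [number
                    for number, line in enumerate(text.splitlines(), start=1)
                    if needle in line]
    return None


if __name__ == "__main__":
    hits = find_in_installed_file(
        "xverse", "transformer_binning.py", "algos.quantile"
    )
    if hits is None:
        print("xverse (or transformer_binning.py) was not found here")
    elif hits:
        print("algos.quantile is still called on lines:", hits)
    else:
        print("no algos.quantile call found in the installed copy")
```

Running it inside the same environment as the notebook matters, since a different interpreter may resolve a different site-packages directory.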

KLFTESPACE commented 9 months ago

you install the IoTDevID version (git+https://github.com/kahramankostas/XuniVerse) when you reinstall this repository.

I recreated a Python 3.9 virtual environment and installed the required packages. In addition, I installed xverse using pip install git+https://github.com/kahramankostas/XuniVerse, but this time another error was reported.



KeyError                                  Traceback (most recent call last)
Cell In[27], line 14
     12 clf = VotingSelector()
     13 print(X, y)
---> 14 clf.fit(X, y)
     15 #Selected features
     16 temp="./results/"+i[18:-4]+"FI.csv"

File D:\projects\device_detect\IOT\lib\site-packages\xverse\ensemble_voting.py:224, in VotingSelector.fit(self, X, y)
    222 #start training on the data
    223 temp_X = X[self.use_features]
--> 224 self.feature_importances_, self.feature_votes_ = self.train(temp_X, y)
    226 return self

File D:\projects\device_detect\IOT\lib\site-packages\xverse\ensemble_voting.py:285, in VotingSelector.train(self, X, y)
    283 #handle categorical values with either 'woe' or 'le'
    284 if self.handle_category == 'woe':
--> 285 transformed_X, self.mapping, iv_df = self.woe_information_value(X, y) #woe transformed_X
    286 elif self.handle_category == 'le':
    287 transformed_X = X.copy(deep=True)

File D:\projects\device_detect\IOT\lib\site-packages\xverse\ensemble_voting.py:117, in VotingSelector.woe_information_value(self, X, y)
    114 clf = WOE()
    115 clf.fit(X, y)
--> 117 return clf.transform(X), clf.woe_bins, clf.iv_df

File D:\projects\device_detect\IOT\lib\site-packages\sklearn\utils\_set_output.py:157, in _wrap_method_output.<locals>.wrapped(self, X, *args, **kwargs)
    155 @wraps(f)
    156 def wrapped(self, X, *args, **kwargs):
--> 157 data_to_wrap = f(self, X, *args, **kwargs)
    158 if isinstance(data_to_wrap, tuple):
    159 # only wrap the first output for cross decomposition
    160 return_tuple = (
    161 _wrap_data_with_container(method, data_to_wrap[0], X, self),
    162 *data_to_wrap[1:],
    163 )

File D:\projects\device_detect\IOT\lib\site-packages\xverse\transformer_woe.py:310, in WOE.transform(self, X, y)
    306 if not self.woe_bins:
    307 raise ValueError("woe_bins variable is not present. \
    308 Estimator has to be fitted to apply transformations.")
--> 310 outX[new_column_name] = tempX.replace(self.woe_bins[original_column_name])
    312 #transformed dataframe
    313 return outX

KeyError: 'IP_MF'
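The last frame looks up self.woe_bins[original_column_name], so the KeyError suggests that the fitted mapping has no entry for IP_MF. One way to narrow this down, sketched here under the assumption that the mapping behaves like a plain dict keyed by column name, is to list the columns that have no bin mapping. The helper columns_without_bins and the sample data are hypothetical.

```python
import pandas as pd


def columns_without_bins(frame, bin_mapping):
    """Return the columns of `frame` that have no entry in `bin_mapping`."""
    return [col for col in frame.columns if col not in bin_mapping]


if __name__ == "__main__":
    # Made-up sample; only the column name IP_MF comes from the traceback.
    X = pd.DataFrame({"IP_MF": [0, 1, 0], "other_feature": [5, 6, 7]})
    mapping = {"other_feature": {5: 0.1, 6: -0.2, 7: 0.0}}
    print(columns_without_bins(X, mapping))
```

With a fitted estimator this could be called as columns_without_bins(X, clf.woe_bins), where woe_bins is the attribute shown in the traceback. It only reports which features are missing and does not explain why they were skipped during fitting.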

kahramankostas commented 9 months ago

I tried it on Windows and Ubuntu. It works flawlessly on Windows but gives this error on Ubuntu, and I can't figure out why. I will update the answer if I find a solution. For now you might consider running it on Windows or skipping the Xverse step.

KLFTESPACE commented 9 months ago

I tried it on Windows and Ubuntu. It works flawlessly on Windows but gives this error on Ubuntu, and I can't figure out why. I will update the answer if I find a solution. For now you might consider running it on Windows or skipping the Xverse step.

Actually, this error occurs on Windows 11, and I don't know which step went wrong. Here are my steps:

  1. pip install git+https://github.com/kahramankostas/XuniVerse, which reported: Successfully installed contourpy-1.2.0 cycler-0.12.1 fonttools-4.45.0 importlib-resources-6.1.1 joblib-1.3.2 kiwisolver-1.4.5 matplotlib-3.8.2 numpy-1.26.2 packaging-23.2 pandas-2.1.3 patsy-0.5.3 pillow-10.1.0 pyparsing-3.1.1 python-dateutil-2.8.2 pytz-2023.3.post1 scikit-learn-1.3.2 scipy-1.11.4 six-1.16.0 statsmodels-0.14.0 threadpoolctl-3.2.0 tzdata-2023.3 xverse-1.0.5 zipp-3.17.0
  2. pip install seaborn
  3. pip install -U scapy

I skipped the step "pip install graphviz" because I was experiencing errors when calling the ciz function, which prevented me from generating the graph. Here are all the installed packages (package name followed by version):

    anyio 4.0.0 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 2.4.1 async-lru 2.0.4 attrs 23.1.0 Babel 2.13.1 beautifulsoup4 4.12.2 bleach 6.1.0 certifi 2023.11.17 cffi 1.16.0 charset-normalizer 3.3.2 colorama 0.4.6 comm 0.2.0 contourpy 1.2.0 cycler 0.12.1 debugpy 1.8.0 decorator 5.1.1 defusedxml 0.7.1 exceptiongroup 1.1.3 executing 2.0.1 fastjsonschema 2.19.0 fonttools 4.45.0 fqdn 1.5.1 graphviz 0.20.1 idna 3.4 importlib-metadata 6.8.0 importlib-resources 6.1.1 ipykernel 6.26.0 ipython 8.17.2 ipywidgets 8.1.1 isoduration 20.11.0 jedi 0.19.1 Jinja2 3.1.2 joblib 1.3.2 json5 0.9.14 jsonpointer 2.4 jsonschema 4.20.0 jsonschema-specifications 2023.11.1 jupyter 1.0.0 jupyter_client 8.6.0 jupyter-console 6.6.3 jupyter_core 5.5.0 jupyter-events 0.9.0 jupyter-lsp 2.2.0 jupyter_server 2.10.1 jupyter_server_terminals 0.4.4 jupyterlab 4.0.9 jupyterlab-pygments 0.2.2 jupyterlab_server 2.25.2 jupyterlab-widgets 3.0.9 kiwisolver 1.4.5 MarkupSafe 2.1.3 matplotlib 3.8.2 matplotlib-inline 0.1.6 mistune 3.0.2 nbclient 0.9.0 nbconvert 7.11.0 nbformat 5.9.2 nest-asyncio 1.5.8 notebook 7.0.6 notebook_shim 0.2.3 numpy 1.26.2 overrides 7.4.0 packaging 23.2 pandas 2.1.3 pandocfilters 1.5.0 parso 0.8.3 patsy 0.5.3 Pillow 10.1.0 pip 22.0.4 platformdirs 4.0.0 prometheus-client 0.19.0 prompt-toolkit 3.0.41 psutil 5.9.6 pure-eval 0.2.2 pycparser 2.21 Pygments 2.17.1 pyparsing 3.1.1 python-dateutil 2.8.2 python-json-logger 2.0.7 pytz 2023.3.post1 pywin32 306 pywinpty 2.0.12 PyYAML 6.0.1 pyzmq 25.1.1 qtconsole 5.5.1 QtPy 2.4.1 referencing 0.31.0 requests 2.31.0 rfc3339-validator 0.1.4 rfc3986-validator 0.1.1 rpds-py 0.13.1 scapy 2.5.0 scikit-learn 1.3.2 scipy 1.11.4 seaborn 0.13.0 Send2Trash 1.8.2 setuptools 58.1.0 six 1.16.0 sniffio 1.3.0 soupsieve 2.5 stack-data 0.6.3 statsmodels 0.14.0 terminado 0.18.0 threadpoolctl 3.2.0 tinycss2 1.2.1 tomli 2.0.1 tornado 6.3.3 traitlets 5.13.0 types-python-dateutil 2.8.19.14 typing_extensions 4.8.0 tzdata 2023.3 uri-template 1.3.0 
urllib3 2.1.0 wcwidth 0.2.11 webcolors 1.13 webencodings 0.5.1 websocket-client 1.6.4 widgetsnbextension 4.0.9 xverse 1.0.5 zipp 3.17.0