geneura-papers / 2017-GPRuleRefinement

Repository for the GPRuleRefinement paper to be sent to a Journal.
Artistic License 2.0

Add table with variables classification #29

Closed unintendedbear closed 7 years ago

unintendedbear commented 7 years ago

Also, mark the BYOD-specific variables with a *.

unintendedbear commented 7 years ago

Problem found, @fergunet @JJ: we need to justify why we used only those variables, 18 out of 39 (37 if we don't count the username and the asset_name, or the app_name).

JJ commented 7 years ago

Why have you used only those variables?

unintendedbear commented 7 years ago

No particular reason. If I run Ranker+InfoGainAttributeEval in Weka:

=== Attribute Selection on all input data ===

Search Method: Attribute ranking.

Attribute Evaluator (supervised, Class (nominal): 39 label): Information Gain Ranking Filter

Ranked attributes:
0.38366533  10 user_trust_value
0.38194309  18 device_trust_value
0.27674493   3 event_type
0.2016991    1 decision_cause
0.11060212  30 asset_location
0.07917374   5 username
0.0613105   13 event_detection
0.01640996   9 passwd_has_capital_letters
0.01499682  15 device_OS
0.01384919  23 device_is_rooted
0.01208606  14 device_type
0.01199317  21 device_screen_timeout
0.01186307  36 wifiEnabled
0.01118785  20 device_has_password
0.0092601    8 numbers_in_password
0.00820572  35 wifiEncryption
0.00779816  37 wifiConnected
0.00731      7 letters_in_password
0.00680391  12 user_role
0.00631848  22 device_has_accessibility
0.00254793  19 device_owned_by
0.00200217  16 device_has_antivirus
0.00148272   6 password_length
0.00142113   2 silent_mode
0.00132244  38 bluetoothConnected
0.00015059  11 activated_account
0.00003763  29 asset_confidential_level
0.0000047    4 event_level
0           34 mail_has_attachment
0           27 asset_name
0           33 mail_contains_bcc_allowed
0           24 app_name
0           25 app_vendor
0           28 asset_value
0           26 app_is_MUSES_aware
0           31 mail_recipient_allowed
0           32 mail_contains_cc_allowed
0           17 device_has_certificate
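For readers without Weka at hand, the score in this ranking is the information gain of each attribute with respect to the class, H(class) - H(class | attribute), in bits. The following is only a minimal Python sketch of that quantity for nominal attributes, not Weka's implementation: Weka also discretizes numeric attributes and handles missing values, which this ignores. The helper names, the toy rows, and the label values `GRANTED`/`DENIED` are hypothetical; only the attribute names are taken from the ranking.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """H(class) - H(class | attribute) for one nominal attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average of the class entropy inside each attribute value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_attributes(rows, attributes, label_key):
    """Return (attribute, gain) pairs sorted from highest to lowest gain."""
    labels = [row[label_key] for row in rows]
    scores = {
        name: information_gain([row[name] for row in rows], labels)
        for name in attributes
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: event_type separates the classes perfectly,
# silent_mode is constant and therefore carries no information.
rows = [
    {"event_type": "open_asset", "silent_mode": "false", "label": "GRANTED"},
    {"event_type": "open_asset", "silent_mode": "false", "label": "GRANTED"},
    {"event_type": "send_mail", "silent_mode": "false", "label": "DENIED"},
    {"event_type": "send_mail", "silent_mode": "false", "label": "DENIED"},
]
ranking = rank_attributes(rows, ["silent_mode", "event_type"], "label")
```

On this toy data `event_type` ranks first and `silent_mode` last, which mirrors how attributes with a score of 0 in the list above tell us nothing about the label.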

JJ commented 7 years ago

If you used a feature selection algorithm, say so in the paper. If you used some heuristic rule to extract only 18 variables, say so. There must be a reason; you can't just extract 18 variables randomly or on a whim.

unintendedbear commented 7 years ago

ok, will take care of it