aspremon / NaiveFeatureSelection

Code for NaiveFeatureSelection, i.e., feature selection in naive Bayes; see https://arxiv.org/abs/1905.09884
MIT License

How to add Laplace smoothing parameter to this code #5

Open Sandy4321 opened 4 years ago

Sandy4321 commented 4 years ago

How can I add a Laplace smoothing parameter to this code? It is important to have, since sometimes features have values with a small number of occurrences.

By the way, what happens for features with many moderately significant values? And for features with one meaningful value and several fully random values?

Which feature in this case will be considered important, and which not? Do you have synthetic data tests for the Bernoulli case? And for the multinomial case, with a clear probabilistic dependency between feature values and the label, and without one, to see whether your code correctly finds the predefined good/bad features? And maybe even good/bad feature values? It would be great to add such a performance example to the paper...
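For illustration, a synthetic test like the one asked about could be set up as follows. This is a hypothetical sketch, not code from the repository: it plants a few informative Bernoulli features whose frequency depends on the label, leaves the rest as noise, and checks that a simple class-conditional frequency gap puts the planted features on top.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 2000, 50, 5  # samples, features, planted informative features

y = rng.integers(0, 2, size=n)
X = (rng.random((n, p)) < 0.3).astype(int)  # noise features: P(x=1) = 0.3 in both classes
for j in range(k):
    # planted features: P(x=1 | y=1) = 0.7, P(x=1 | y=0) = 0.2
    X[:, j] = (rng.random(n) < np.where(y == 1, 0.7, 0.2)).astype(int)

# Score each feature by the gap between its class-conditional frequencies;
# any reasonable selector should rank the planted features first.
gap = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))
top = set(np.argsort(gap)[-k:])
print(sorted(top))
```

The same generator can be reused to benchmark the selector from this repository by replacing the frequency-gap score with its output.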

arminaskari commented 4 years ago

The Laplace smoothing parameter is the alpha parameter in the constructor in naivefeatureselection.py.
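For intuition on what such a smoothing parameter does (this is the generic Laplace-smoothing formula, not code from the repository), a zero count stops producing a zero probability, which would otherwise make log-likelihoods blow up:

```python
def smoothed_prob(count, total, alpha, n_categories=2):
    # Laplace-smoothed estimate: (count + alpha) / (total + alpha * n_categories)
    return (count + alpha) / (total + alpha * n_categories)

# A binary feature never observed in 10 documents of one class:
p0 = smoothed_prob(0, 10, alpha=0.0)  # 0.0 -> log(0) downstream
p1 = smoothed_prob(0, 10, alpha=1.0)  # 1/12, a small but nonzero probability
print(p0, p1)
```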

In terms of intuition for what the Laplace smoothing parameter should be in different situations, we are not sure; it will definitely need to be tuned for every specific application, and we recommend simply cross-validating over this parameter.
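Cross-validating over the smoothing parameter can be sketched with scikit-learn's GridSearchCV; the example below uses sklearn's BernoulliNB (whose alpha plays the same Laplace-smoothing role) rather than the repository's class, so it only illustrates the tuning pattern:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import BernoulliNB

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
Xb = (X > 0).astype(int)  # binarize so a Bernoulli model applies

# Search over a log-spaced grid of smoothing values with 5-fold CV.
grid = GridSearchCV(BernoulliNB(), {"alpha": [0.01, 0.1, 1.0, 10.0]}, cv=5)
grid.fit(Xb, y)
print(grid.best_params_)
```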

Sandy4321 commented 4 years ago

Referring to the NaiveFeatureSelection call from the example, `nfs = NaiveFeatureSelection(k=kv)`: could you share a code example using all the capabilities of NaiveFeatureSelection? It would be a great help for learning your code. Thank you very much in advance.
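Assuming NaiveFeatureSelection follows the scikit-learn transformer convention (fit/transform inside a Pipeline), usage would look like the pattern below; SelectKBest is used here as a runnable stand-in, since this sketch does not import the repository's class:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(200, 30))  # nonnegative count features
y = rng.integers(0, 2, size=200)

# Stand-in for: Pipeline([("nfs", NaiveFeatureSelection(k=10)), ("clf", MultinomialNB())])
pipe = Pipeline([("select", SelectKBest(chi2, k=10)), ("clf", MultinomialNB())])
pipe.fit(X, y)

# Indices of the k features the selector kept.
kept = pipe.named_steps["select"].get_support(indices=True)
print(len(kept))
```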

arminaskari commented 4 years ago

Please see the updates.

Sandy4321 commented 4 years ago

DemoBNFS.py is sample code for the sparse Bernoulli naive Bayes feature selection.

Sandy4321 commented 4 years ago

Great, thanks. By the way, what about generic Bayesian classification for categorical data, not just the specific case of multinomial naive Bayes for counts? Scikit-learn added such a classifier (CategoricalNB) in version 0.22, after many years of having only multinomial naive Bayes. Maybe you even need to note in the paper title that you consider only the multinomial case...
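To illustrate the distinction being raised: CategoricalNB models each feature as a categorical distribution per class, whereas MultinomialNB interprets feature values as counts. A small sketch (synthetic data, not from the repository) where the label depends on a category code, not on its magnitude:

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB, MultinomialNB

rng = np.random.default_rng(0)
# Integer category codes 0..3 with no ordinal or count meaning.
X = rng.integers(0, 4, size=(300, 5))
y = (X[:, 0] == 2).astype(int)  # label depends on which category, not how many

cat = CategoricalNB().fit(X, y)   # per-category conditional probabilities
mnb = MultinomialNB().fit(X, y)   # treats the codes as counts, a model mismatch here
print(cat.score(X, y), mnb.score(X, y))
```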