Open FAMILIAR-project opened 5 years ago
Another use case where I need the list: commit 27fb80b
(I wanted to compute the frequency of some options in the dataset, and found that some options have been removed. Is this due to the removal of options that have a single unique value?)
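For reference, a minimal sketch of the kind of check I have in mind, assuming each configuration is a plain dict mapping option names to values. The function names and the toy rows are hypothetical, not taken from the notebook or the dataset:

```python
from collections import Counter


def option_frequencies(rows, option):
    """Return the relative frequency of each value taken by `option`."""
    counts = Counter(row[option] for row in rows if option in row)
    total = sum(counts.values())
    if total == 0:
        # The option does not appear in any row (e.g. it was removed).
        return {}
    return {value: n / total for value, n in counts.items()}


def constant_options(rows):
    """Return the options that take a single value across all rows."""
    seen = {}
    for row in rows:
        for name, value in row.items():
            seen.setdefault(name, set()).add(value)
    return sorted(name for name, values in seen.items() if len(values) == 1)


# Hypothetical toy data, not real measurements.
rows = [
    {"CONFIG_A": "y", "CONFIG_B": "n", "CONFIG_C": "y"},
    {"CONFIG_A": "m", "CONFIG_B": "n", "CONFIG_C": "y"},
    {"CONFIG_A": "y", "CONFIG_B": "n", "CONFIG_C": "n"},
    {"CONFIG_A": "y", "CONFIG_B": "n", "CONFIG_C": "y"},
]

print(option_frequencies(rows, "CONFIG_A"))
print(constant_options(rows))
```

Comparing the output of `constant_options` on the raw data with the list of missing columns would show whether the removed options are exactly the constant ones.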
Added a notebook to explore this, along with the file listing all non-tristate options and their possible values, in commit 8eadd1c
@HugoJPMartin thanks! Can you push here or in https://github.com/TuxML/tuxml-datasets the scripts you used for encoding the data? I need them for encoding the data of 4.15 (see #13)
There are a few options (~100) that are neither boolean nor tristate (numeric or string values). We chose to remove them, which is reasonable given the number of features we already have. Yet we may have missed an opportunity.
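To make the list of removed options reproducible, something like the following could work. It is only a sketch under assumptions: it classifies options by the values observed in the data (`y`, `n`, `m`), not by the Kconfig type declarations, so a string option that happens to hold only `y` would be misclassified. `split_by_kind` and the sample rows are hypothetical, not the encoding scripts actually used:

```python
# Values a boolean or tristate Kconfig option can take.
TRISTATE_VALUES = {"y", "n", "m"}


def split_by_kind(rows):
    """Split options into tristate-like ones and all others.

    Returns a sorted list of tristate-like option names and a dict mapping
    every other option to the sorted values observed for it.
    """
    observed = {}
    for row in rows:
        for name, value in row.items():
            observed.setdefault(name, set()).add(value)
    tristate_like = sorted(
        name for name, values in observed.items() if values <= TRISTATE_VALUES
    )
    other = {
        name: sorted(values)
        for name, values in observed.items()
        if not values <= TRISTATE_VALUES
    }
    return tristate_like, other


# Hypothetical toy configurations, values kept as strings.
sample = [
    {"CONFIG_A": "y", "CONFIG_LOG_BUF_SHIFT": "17", "CONFIG_DEFAULT_HOSTNAME": "(none)"},
    {"CONFIG_A": "m", "CONFIG_LOG_BUF_SHIFT": "18", "CONFIG_DEFAULT_HOSTNAME": "(none)"},
]

tristate_like, other = split_by_kind(sample)
print(tristate_like)
print(other)
```

The `other` dict has the same shape as "all non-tristate options and their possible values", so it could be diffed against the file mentioned above.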
@HugoJPMartin can you give the precise list of the options that we removed in the first place?