azmfaridee / mothur

This is a GSoC 2012 fork of mothur. We are trying to implement a number of feature selection algorithms for microbial ecology data and incorporate them into mothur's main codebase.
https://github.com/mothur/mothur
GNU General Public License v3.0

Fix Output From Multi-Class Datasets #30

Open kdiverson opened 11 years ago

kdiverson commented 11 years ago

The output summary file doesn't divide up the features by class.

Example output from a multi-class design file:

OTU Rank
Otu0026 0.6
Otu0067 0.31
Otu0092 0.3
Otu0024 0.3
Otu0007 0.29
Otu0033 0.26
Otu0027 0.24
Otu0030 0.22
Otu0044 0.22
Otu0072 0.2
Otu0071 0.2
Otu0015 0.19
Otu0008 0.19
Otu0038 0.18
Otu0042 0.17
Otu0001 0.17
Otu0085 0.16
Otu0079 0.14
Otu0043 0.14
Otu0019 0.14
Otu0039 0.14
Otu0035 0.14
Otu0002 0.14
Otu0066 0.12
Otu0099 0.12
Otu0120 0.12
Otu0003 0.12
Otu0031 0.11
Otu0087 0.11
Otu0111 0.1
Otu0131 0.1
Otu0088 0.1
Otu0189 0.1
Otu0050 0.1

azmfaridee commented 11 years ago

@kdiverson I noticed that this issue is still open, and I've forgotten what you told me about it. Was this fixed? If not, can you give me a quick recap: what exactly is the issue, and what would you like the output to look like?

kdiverson commented 11 years ago

I think this was for showing feature importance per class. Example from R:

head(ent.rf$importance)
         Partition_1  Partition_2   Partition_3 MeanDecreaseAccuracy MeanDecreaseGini
Otu02439           0 0.000000e+00  0.000000e+00         0.000000e+00      0.000000000
Otu02449           0 0.000000e+00  0.000000e+00         0.000000e+00      0.005566667
Otu02509           0 2.666667e-05  0.000000e+00         8.771930e-06      0.035270498
Otu02510           0 0.000000e+00 -2.469136e-05        -9.345794e-06      0.113034105
Otu02512           0 0.000000e+00  0.000000e+00         0.000000e+00      0.002666667
Otu02513           0 0.000000e+00  0.000000e+00         0.000000e+00      0.020639278
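
For reference, a summary writer that splits importance by class, in the spirit of the R table above, might look roughly like the sketch below. This is not mothur's actual code: the `FeatureImportance` struct, the `formatSummary` function, and the tab-delimited layout are all assumptions made for illustration, and the sketch assumes the per-class importance values have already been computed elsewhere.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container (not a mothur type): one row per OTU,
// holding one importance value per class plus the overall rank.
struct FeatureImportance {
    std::string otu;
    std::vector<double> perClass;  // same order as the class names
    double overall;                // overall rank, as in the current summary file
};

// Build a tab-delimited summary with one column per class,
// followed by the overall rank column.
std::string formatSummary(const std::vector<std::string>& classNames,
                          const std::vector<FeatureImportance>& rows) {
    std::ostringstream out;

    // Header: OTU, then one column per class, then the overall rank.
    out << "OTU";
    for (const auto& name : classNames) out << '\t' << name;
    out << "\tRank\n";

    // Two decimal places, matching the precision of the existing output.
    out << std::fixed << std::setprecision(2);
    for (const auto& row : rows) {
        // Every row must supply exactly one value per class.
        assert(row.perClass.size() == classNames.size());
        out << row.otu;
        for (double value : row.perClass) out << '\t' << value;
        out << '\t' << row.overall << '\n';
    }
    return out.str();
}
```

Called with the class names from the design file and one `FeatureImportance` per OTU, this returns a header line of the form `OTU<TAB>class1<TAB>class2<TAB>...<TAB>Rank` followed by one line per OTU. Keeping the overall rank as the last column means the existing two-column output is still recoverable from the new file.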