cmaclell / concept_formation

Python implementations of TRESTLE, COBWEB/3, and COBWEB
MIT License

Are missing attributes handled correctly? #20

Closed cmaclell closed 7 years ago

cmaclell commented 8 years ago

I can see two possible ways to handle missing attributes:

1) Missing attributes are ignored, and only the non-missing attribute values are included in category utility calculations. However, the varying number of attributes will cause problems, so the expected correct guesses will need to be normalized by the number of attributes currently present.

2) "Missing" is treated as a special value of an attribute, and its probability mass is implicitly maintained.

We are currently going with option 2, but I'm not convinced that this is the correct strategy. Maybe we should evaluate both to determine which yields better performance. Alternatively, we could add a flag that specifies which strategy is used.
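To make the comparison concrete, here is a minimal sketch of how the two strategies could differ when computing expected correct guesses for a single concept. It is not the library's actual code: the function name `expected_correct_guesses`, the `av_counts` layout (attribute to value to count), and the `missing_as_value` flag are all hypothetical. For option 1 it assumes probabilities are taken relative to the instances where the attribute is present, and for option 2 it assumes the squared "missing" mass is added to the sum.

```python
from typing import Dict

# Hypothetical layout: attribute name -> {value -> number of instances with that value}
AVCounts = Dict[str, Dict[str, int]]


def expected_correct_guesses(av_counts: AVCounts, count: int,
                             missing_as_value: bool = True) -> float:
    """Sketch of both strategies; `count` is the number of instances in the concept."""
    if count == 0 or not av_counts:
        return 0.0

    total = 0.0
    attrs_present = 0
    for val_counts in av_counts.values():
        present = sum(val_counts.values())  # instances that have this attribute
        if missing_as_value:
            # Option 2 (assumed form): probabilities are relative to all instances,
            # and the leftover mass is the implicit "missing" value.
            for n in val_counts.values():
                total += (n / count) ** 2
            total += ((count - present) / count) ** 2
        else:
            # Option 1 (assumed form): ignore instances where the attribute is missing.
            if present == 0:
                continue
            attrs_present += 1
            for n in val_counts.values():
                total += (n / present) ** 2

    if missing_as_value:
        return total
    # Normalize by the number of attributes currently present.
    return total / attrs_present if attrs_present else 0.0


# Four instances; "shape" is missing from three of them.
example = {"color": {"red": 2, "blue": 2}, "shape": {"round": 1}}
option_2 = expected_correct_guesses(example, 4, missing_as_value=True)
option_1 = expected_correct_guesses(example, 4, missing_as_value=False)
```

In this example the two options rate the sparsely observed `shape` attribute very differently, which is the kind of divergence an evaluation of both strategies would need to measure.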