KevinCoble / AIToolbox

A toolbox of AI modules written in Swift: Graphs/Trees, Support Vector Machines, Neural Networks, PCA, K-Means, Genetic Algorithms
Apache License 2.0

Can the input be a discrete value? #6

Closed heuism closed 7 years ago

heuism commented 7 years ago

I haven't tested this yet, but I would like to ask whether the input can be a discrete value like 'Hot', 'Cold', or 'Windy' instead of 1, 2, 3 or a continuous value.

Thanks a lot for your help

KevinCoble commented 7 years ago

The code couldn't know in advance what enumeration you wanted to use. Luckily, Swift provides enumerations with 'raw values' that can be used to add clarity to your code while still leaving the library with general-purpose integers. Define your enumeration like this:

enum POSTag : Int {
    case Noun = 0
    case Verb
    case Adjective
    . . .
}

and pass an enumeration case to the algorithm through its .rawValue property, e.g. POSTag.Noun.rawValue. To translate a returned value back, use POSTag(rawValue: 7) [with your own enumeration in place of 'POSTag']. Note that this initializer returns an optional: it is nil if the raw value does not match any case of the enumeration.
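
A minimal sketch of that round trip, using only the Swift standard library. The three cases are illustrative stand-ins for the elided list above, and no AIToolbox call is made here:

```swift
// Illustrative enumeration; the case list is an assumption, not the full tag set.
enum POSTag: Int {
    case Noun = 0   // explicit raw value 0
    case Verb       // implicit raw value 1
    case Adjective  // implicit raw value 2
}

// Enumeration case -> plain integer that can be handed to a library.
let label = POSTag.Verb.rawValue

// Integer coming back from a library -> enumeration case (an optional).
let decoded = POSTag(rawValue: label)

// A raw value with no matching case produces nil instead of crashing.
let unknown = POSTag(rawValue: 7)
```

Because the initializer is failable, a returned value is best unwrapped with `if let` or `guard let` before it is used.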

heuism commented 7 years ago

I would like to restate this in case I missed your point.

So it would be like

enum FeaturesVal : Int {
case Hot = 2.0
case Cool = 3.0
case Windy = 4.0
.
.
.
case Notplay = 0.0
case Play = 1.0
}

and then, wherever I want to use a value, I can write for example:

trainData.addDataPoint(input: [Hot.rawValue], output: [Play.rawValue])

Please correct me if I am wrong. So this library supports only doubles, and you CANNOT pass values straight in like

trainData.addDataPoint(input: ["Hot", "Humid"], output: ["Notplay"])?

Thanks for your time.

KevinCoble commented 7 years ago

Yes, but you need to use integer raw values if you are doing classification. You have declared the raw type as Int (enum FeaturesVal : Int), but you are using floating-point values in the cases. Change those to

case Hot=2 case Cool=3 etc.

No, you cannot enter things like 'trainData.addDataPoint(input: ["Hot", "Humid"], output: ["Notplay"])', but with the enumeration additions you can do something close: trainData.addDataPoint(input: [Feature.Hot.rawValue, Feature.Humid.rawValue], output: Result.NotPlay.rawValue)
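
As a rough sketch of how those values could be prepared, assuming the data set wants a vector of doubles as input and an integer class label as output. The `Feature` and `Result` enumerations and the raw value chosen for `Humid` are hypothetical, and the actual `addDataPoint` call is left out because its exact signature is not shown in this thread:

```swift
// Hypothetical enumerations; the names follow the reply above.
enum Feature: Int {
    case Hot = 2
    case Cool = 3
    case Windy = 4
    case Humid = 5   // assumed value, not taken from the thread
}

// Note: this module-level type shadows Swift's standard Result inside this file.
enum Result: Int {
    case NotPlay = 0
    case Play = 1
}

// Convert the integer raw values to doubles for the input vector.
let input: [Double] = [Feature.Hot, Feature.Humid].map { Double($0.rawValue) }

// The class label stays an integer.
let output: Int = Result.NotPlay.rawValue
```

Keeping features and results in separate enumerations avoids accidentally passing a class label where a feature value is expected.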

It would also be possible to write your own DataSource class, using the MLClassificationDataSet protocol. That class could have constructors that take "Hot", "Humid", and "NotPlay" as entries (but still return double vectors and integer labels when used by the classifier routine). The details of that, however, are a topic that would be hard to explain briefly.
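
The string-to-number translation such a class would need might look like the following sketch. `WeatherSamples`, its lookup tables, and the numeric codes are all hypothetical, and the type deliberately does not conform to MLClassificationDataSet, whose requirements are not covered in this thread:

```swift
// Hypothetical helper: accepts names, stores double vectors and integer labels.
struct WeatherSamples {
    // Assumed lookup tables from names to numeric codes.
    private let featureCodes: [String: Double] = ["Hot": 2, "Cool": 3, "Windy": 4, "Humid": 5]
    private let labelCodes: [String: Int] = ["NotPlay": 0, "Play": 1]

    private(set) var inputs: [[Double]] = []
    private(set) var labels: [Int] = []

    // Returns false and stores nothing if any name is unknown.
    mutating func add(input names: [String], output label: String) -> Bool {
        var vector: [Double] = []
        for name in names {
            guard let code = featureCodes[name] else { return false }
            vector.append(code)
        }
        guard let labelCode = labelCodes[label] else { return false }
        inputs.append(vector)
        labels.append(labelCode)
        return true
    }
}

var samples = WeatherSamples()
let accepted = samples.add(input: ["Hot", "Humid"], output: "NotPlay")
let rejected = samples.add(input: ["Snowy"], output: "Play")
```

Rejecting unknown names up front keeps misspelled strings from silently turning into wrong numeric codes.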

KevinCoble commented 7 years ago

Closing this issue at this time. (I believe heuism got his example working.)