punkrockpolly / babel-brain

Takes two input files (one English, one Spanish) and uses letter weighting to guess the language of the input file

fix featureNormalize #12

Closed: punkrockpolly closed this issue 10 years ago

punkrockpolly commented 10 years ago
# Refactor to fix featureNormalize - DOES NOT WORK
def featureNormalize(feature_dict):
    feature_vector = dict_to_vector(feature_dict)
    # returns a normalized version of X where the mean value of
    # each feature is 0 and the standard deviation is 1
    X_norm = feature_vector
    num_features = feature_vector.length()
    mu = np.zeros(1, num_features)
    sigma = np.zeros(1, num_features)

    for i in num_features:
        mu[i] = np.mean(X(:, i))
        X_norm[:, i] = X(:, i) - mu(i)
        sigma[i] = np.std(X(:, i))
        X_norm[:, i] = X_norm(:, i) / sigma(i)

    normalization_dict = {}
    normalization_dict['X_norm'] = X_norm
    normalization_dict['mu'] = mu
    normalization_dict['sigma'] = sigma

    return normalization_dict
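
A few things stop the snippet above from running as Python: `X(:, i)` and `mu(i)` are MATLAB-style indexing, `X` is never defined, `feature_vector.length()` is not a NumPy method, `np.zeros(1, num_features)` passes the second argument as a dtype rather than as part of the shape, and `for i in num_features` tries to iterate over an integer. Below is a minimal sketch of one way the same normalization could be written with NumPy. It is not the code from the referenced commit. The name `feature_normalize_sketch` is hypothetical, and because `dict_to_vector` is not shown in this issue, the sketch assumes the input is already a 2-D array with one row per sample and one column per feature. Leaving a zero-variance column unscaled is also an assumption, made only to avoid dividing by zero.

```python
import numpy as np


def feature_normalize_sketch(X):
    """Return X with each column shifted to mean 0 and scaled to std 1.

    Hypothetical helper: assumes X is array-like with shape
    (num_samples, num_features).
    """
    X = np.asarray(X, dtype=float)

    mu = X.mean(axis=0)     # per-feature mean
    sigma = X.std(axis=0)   # per-feature population standard deviation

    # Assumption: a constant feature (sigma == 0) is centered but not scaled.
    safe_sigma = np.where(sigma == 0, 1.0, sigma)

    X_norm = (X - mu) / safe_sigma

    # Same keys as the dictionary built in the original snippet.
    return {'X_norm': X_norm, 'mu': mu, 'sigma': sigma}
```

Returning `mu` and `sigma` alongside `X_norm` matches the original snippet, so the same statistics can be reused to normalize new input later.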
punkrockpolly commented 10 years ago

Implemented via commit 685010680bbcd9f3d128f919740334f4d247a95a.