Glad it helps your understanding. How do you load the dataset? Also, do you know the error to expect from random guessing?
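For reference, with three balanced classes (as in iris) a random or constant guess is wrong about two thirds of the time, which matches the 66.7% figure mentioned in this thread. A quick hypothetical sanity check, not part of the project:

```python
# Estimate the error of random guessing on a balanced 3-class problem.
# Expected error is about 66.7%, i.e. the same as the figure reported here.
import numpy as np

num_classes = 3
num_samples = 100000
targets = np.random.randint(num_classes, size=num_samples)
guesses = np.random.randint(num_classes, size=num_samples)
error = np.mean(guesses != targets)
print('Random guessing error:', round(100 * error, 2), '%')  # roughly 66.67
```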
It's a surprise to me that you answered my question. Below is the code I use to run your program. If you could point out the bug, it would be a great honour for me. Thanks again!
```python
import numpy as np

# Network, Layer, Identity, Relu, Softmax, Matrices, Backprop, SquaredError,
# GradientDecent and Example are imported from this project.


def read_data(data_file):
    example_list = []
    for line in open(data_file, 'r'):
        line_list = line.split(',')
        vector_str_list = line_list[0:4]   # first four columns: input features
        target_str_list = line_list[4:7]   # last three columns: one-hot target
        vector_list = [float(x) for x in vector_str_list]
        target_list = [float(x) for x in target_str_list]
        data = np.array(vector_list)
        target = np.array(target_list)
        print("data : ", data)
        print("target : ", target)
        example = Example(vector_list, target_list)
        example_list.append(example)
    print("len of example_list:", len(example_list))
    return example_list


num_inputs = 4
num_outputs = 3

print("****init Network")
network = Network([
    Layer(num_inputs, Identity),
    Layer(12, Relu),
    Layer(12, Relu),
    Layer(12, Relu),
    Layer(num_outputs, Softmax),
])

print("****init weights")
weight_scale = 0.01
weights = Matrices(network.shapes)
weights.flat = np.random.normal(0, weight_scale, len(weights.flat))

print("****define backprop and decent")
backprop = Backprop(network, cost=SquaredError())
decent = GradientDecent()
print("weights:", weights)

print("****get data")
data_file = '../../dataset/train.txt'
examples = read_data(data_file)

for itera_numb in range(100):
    for example in examples:
        gradient = backprop(weights, example)
        weights = decent(weights, gradient, learning_rate=0.01)
    print("weights:", weights)
    error = 0
    for example in examples:
        prediction = network.feed(weights, example.data)
        if np.argmax(prediction) != np.argmax(example.target):
            # print("example.data:", example.data)
            # print("prediction:", prediction)
            # print("example.target:", example.target)
            # print("**************************")
            error += 1 / len(examples)
    print('Testing error', round(100 * error, 2), '%')
```
And the train.txt: http://pastebin.com/aV6ePCuY
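Judging from the parsing code above, each line of train.txt presumably holds four comma-separated feature values followed by a three-value one-hot target; for illustration only (values are hypothetical), a few lines might look like:

```
5.1,3.5,1.4,0.2,1,0,0
4.9,3.0,1.4,0.2,1,0,0
7.0,3.2,4.7,1.4,0,1,0
```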
Hi, I edited your post. Please format code blocks as described here next time.
The problem with your example seems to lie in the network parameters rather than the code. I got it down to 3% error with a single hidden ReLU layer of 20 neurons and initial weight scale of 0.05.
As a rule of thumb, the weight scale should be about 1 / hidden_layer_size
so that the overall activation doesn't change too much between layers. Also, 3 hidden layers is a lot for such a simple dataset.
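A minimal sketch of that suggested configuration, assuming the same imports and variables (num_inputs, num_outputs) as in the script above:

```python
# One hidden ReLU layer of 20 neurons and an initial weight scale of
# roughly 1 / hidden_layer_size, i.e. 0.05 here.
network = Network([
    Layer(num_inputs, Identity),
    Layer(20, Relu),
    Layer(num_outputs, Softmax),
])
weights = Matrices(network.shapes)
weights.flat = np.random.normal(0, 0.05, len(weights.flat))
```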
By the way, it's good to validate against another part of the dataset since a larger model could just memorize the individual training examples. E.g., use 75% of the examples for training and the rest for evaluation.
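One possible way to do that split on the example list from above (a sketch, not part of the project):

```python
# Hypothetical 75/25 split of the parsed examples into training and
# evaluation sets; `examples` is the list returned by read_data above.
import random

random.shuffle(examples)
split = int(0.75 * len(examples))
train_examples = examples[:split]
test_examples = examples[split:]
```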
Thanks for your help!
Thanks for your great work on this project; I am using your code as a learning case for neural networks. However, with the iris data the classification error is always 66.7%. Is there an example using such a simple dataset?