Open miguel2488 opened 5 years ago
Hi,
In the source code of test_minibatch, it goes like this:
if arguments is None or isinstance(arguments, (dict, list)) and len(arguments) == 0:
    if len(op_arguments) > 0:
        raise ValueError('function expects %i arguments' %
                         len(op_arguments))
    return {}
For the error to be raised, len(arguments) == 0 must hold. arguments is basically the data variable you fed into test_minibatch, so you should double-check that your data variable is correct.
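For illustration, here is a minimal helper (my own sketch, not part of CNTK) that reproduces the check above: it assumes `data` is the dict returned by `reader.next_minibatch(...)`, and an empty dict here is exactly the situation that triggers the "function expects %i arguments" ValueError inside test_minibatch.

```python
# Hypothetical sanity-check helper; `data` is assumed to be the dict
# returned by MinibatchSource.next_minibatch(...).
def check_minibatch(data):
    if not data:
        # An empty dict is what makes test_minibatch raise the ValueError.
        raise RuntimeError("next_minibatch returned an empty dict "
                           "(reader exhausted or misconfigured)")
    for stream, mb in data.items():
        # CNTK MinibatchData objects expose num_samples; fall back if absent.
        print(stream, "->", getattr(mb, "num_samples", "?"), "samples")
    return data
```

Calling this right after every `next_minibatch` would show immediately which iteration starts handing empty batches to the trainer.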
Hi @delzac,
thank you again for your response. This is how I have configured my data variable for training and testing:
for i in range(0, int(num_minibatches_to_train)):
    # Read a mini batch from the training data file
    data = train_reader.next_minibatch(minibatch_size, input_map=input_map)
    trainer.train_minibatch(data)
    print_training_progress(trainer, i, training_progress_output_freq, verbose=1)
and this is the testing part:
test_input_map = {
    y : test_reader.streams.labels,
    x : test_reader.streams.features
}

test_minibatch_size = 512
num_samples = 10000
num_minibatches_to_test = num_samples // test_minibatch_size
test_result = 0.0

for i in range(num_minibatches_to_test):
    # We are loading test data in batches specified by test_minibatch_size
    # Each data point in the minibatch is a MNIST digit image of 784 dimensions
    # with one pixel per dimension that we will encode / decode with the
    # trained model.
    data = test_reader.next_minibatch(test_minibatch_size, input_map=test_input_map)
    eval_error = trainer.test_minibatch(data)
    test_result = test_result + eval_error

# Average of evaluation errors of all test minibatches
print("Average test error: {0:.2f}%".format(test_result*100 / num_minibatches_to_test))
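As an aside, one way to make a test loop like the one above robust against a reader that runs dry (with max_sweeps=1, extra next_minibatch calls return an empty dict) is to loop until the reader is exhausted instead of iterating a precomputed count. This is a hedged sketch; `reader_next` and `test_fn` stand in for `test_reader.next_minibatch` and `trainer.test_minibatch`, and the sample-count weighting is my own addition:

```python
# Sketch of a drain-style test loop: stop on the first empty minibatch
# instead of trusting num_samples // minibatch_size iterations.
def evaluate(reader_next, test_fn, minibatch_size=512):
    total_error, total_samples = 0.0, 0
    while True:
        data = reader_next(minibatch_size)
        if not data:                      # empty dict: sweep finished
            break
        # Weight each batch's error by its actual sample count, since the
        # last batch of a sweep is usually smaller than minibatch_size.
        n = min(mb.num_samples for mb in data.values())
        total_error += test_fn(data) * n
        total_samples += n
    return total_error / max(total_samples, 1)
```

This also avoids the silent truncation from `num_samples // test_minibatch_size`, which floors: 10000 // 512 is 19 batches covering only 9728 of 10000 samples.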
As you can see, both are very similar. How is it possible that the first works but not the second?
Have a look at my test data:
`|labels 0 0 0 1 0 0 0 0 0 0 0 |features 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0 30.0`
Did you check that the output of test_reader.next_minibatch() is correct?
This is what I'm getting when I try to check that output:
data = create_reader(test_file, False, input_dim, num_output_classes)
out[34]: <cntk.io.MinibatchSource; proxy of <Swig Object of type 'CNTK::MinibatchSourcePtr *' at 0x000000001E55F420> >
Same result when I check it on train_reader.
If you do data.as_sequences() and print it out, does everything still work?
Hi @delzac,
this is what I'm getting when executing data.as_sequences():
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-24-41715e7a32cc> in <module>
----> 1 data.as_sequences()
C:\ProgramData\Anaconda3\lib\site-packages\cntk\cntk_py.py in <lambda>(self, name)
3155 __setattr__ = lambda self, name, value: _swig_setattr(self, MinibatchSource, name, value)
3156 __swig_getmethods__ = {}
-> 3157 __getattr__ = lambda self, name: _swig_getattr(self, MinibatchSource, name)
3158
3159 def __init__(self, *args, **kwargs):
C:\ProgramData\Anaconda3\lib\site-packages\cntk\cntk_py.py in _swig_getattr(self, class_type, name)
81 if method:
82 return method(self)
---> 83 raise AttributeError("'%s' object has no attribute '%s'" % (class_type.__name__, name))
84
85
AttributeError: 'MinibatchSource' object has no attribute 'as_sequences'
With data = C.io.MinibatchSource(...).next_minibatch(), data is currently of type dict; check that it's correct. d['x'].as_sequences() should give you access to those values.
The point here is this: the original error essentially says len(arguments) == 0, which means whatever you fed into test_minibatch is empty. So you really have to debug thoroughly.
Thank you for your answer. As you said, the MinibatchSource.next_minibatch() object is of type dict. Querying it with d['x'].as_sequences() as you suggested didn't work for me; instead I have to use d['x'] to index by the key I want to extract. But the data is all there, it is not an empty object. I don't know what's happening. I tried following the logistic regression and multi-layer perceptron tutorials, and everything goes fine until the testing part. I just can't test my models.
Here's what the test_reader yields using the next_minibatch method:
read_test(test_file, input_dim, num_output_classes, 512)
out[124]: {features([15104]): MinibatchData(data=Value([297 x 1 x 15104], GPU), samples=297, seqs=297),
labels([11]): MinibatchData(data=Value([297 x 1 x 11], GPU), samples=297, seqs=297)}
Can you try switching your test_reader with your train_reader? Does it make a difference? If you use train_minibatch instead of test_minibatch, is there a difference? As far as I can tell, I can't see any error in your code.
Anyway, immediately after data = test_reader.next_minibatch(), can you run print(len(data))?
Hi @delzac,
I used this modified function to create the test_reader.next_minibatch() object:
def test_reader(path, is_training, input_dim, num_label_classes):
    labelStream = C.io.StreamDef(field='labels', shape=num_label_classes, is_sparse=False)
    featureStream = C.io.StreamDef(field='features', shape=input_dim, is_sparse=False)
    deserializer = C.io.CTFDeserializer(path, C.io.StreamDefs(labels = labelStream, features = featureStream))
    data = C.io.MinibatchSource(deserializer, randomize = False, max_sweeps = 1).next_minibatch(minibatch_size_in_samples = 512)
    return data
Then when I do this:
data = test_reader(test_file, False, input_dim, num_output_classes)
print(type(data), len(data))
I get this:
dict, 2
I also tried replacing test_reader with train_reader in my code, as you suggested, like this:
# Test the model
test_input_map = {
    y : train_reader.streams.labels,
    x : train_reader.streams.features
}

# Test data for trained model
test_minibatch_size = 512
num_samples = 10000
num_minibatches_to_test = num_samples // test_minibatch_size
test_result = 0.0
path = 'data/out/test.txt'

for i in range(int(num_minibatches_to_test)):
    # We are loading test data in batches specified by test_minibatch_size
    # Each data point in the minibatch is a MNIST digit image of 784 dimensions
    # with one pixel per dimension that we will encode / decode with the
    # trained model.
    test_data = train_reader.next_minibatch(test_minibatch_size, input_map=test_input_map)
    eval_error = trainer.train_minibatch(test_data)
    test_result = test_result + eval_error

# Average of evaluation errors of all test minibatches
print("Average test error: {0:.2f}%".format(test_result*100 / num_minibatches_to_test))
And this is what I got:
Minibatch: 4400, Loss: 0.6998, Error: 23.05%
Minibatch: 4500, Loss: 0.6963, Error: 20.70%
Minibatch: 4600, Loss: 0.7128, Error: 20.31%
Training took 877.8 sec
Average test error: 100.00%
I'm totally lost with this. It seems that the test reader is not working properly with trainer.test_minibatch() and test_reader.next_minibatch(), yet the test file seems to be created correctly by my create_reader function: if I just look at the objects created, they are MinibatchSource objects containing all the information about features and labels. I just don't know where the problem is.
create_reader(test_file, input_dim, num_output_classes, num_output_classes)
out[23]: <cntk.io.MinibatchSource; proxy of <Swig Object of type 'CNTK::MinibatchSourcePtr *' at 0x0000027CB82122A0> >
Just realised that if I do this:
create_reader(test_file, input_dim, num_output_classes, num_output_classes).next_minibatch(512)
I'm getting this:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-72-7b834874f2d6> in <module>
----> 1 create_reader(test_file, input_dim, num_output_classes, num_output_classes).next_minibatch(512)
C:\ProgramData\Anaconda3\lib\site-packages\cntk\internal\swig_helper.py in wrapper(*args, **kwds)
67 @wraps(f)
68 def wrapper(*args, **kwds):
---> 69 result = f(*args, **kwds)
70 map_if_possible(result)
71 return result
C:\ProgramData\Anaconda3\lib\site-packages\cntk\io\__init__.py in next_minibatch(self, minibatch_size_in_samples, input_map, device, num_data_partitions, partition_index)
330 minibatch_size_in_samples,
331 num_data_partitions,
--> 332 partition_index, device)
333
334 if not mb:
C:\ProgramData\Anaconda3\lib\site-packages\cntk\cntk_py.py in get_next_minibatch(self, *args)
3179
3180 def get_next_minibatch(self, *args):
-> 3181 return _cntk_py.MinibatchSource_get_next_minibatch(self, *args)
3182 MinibatchSource_swigregister = _cntk_py.MinibatchSource_swigregister
3183 MinibatchSource_swigregister(MinibatchSource)
RuntimeError: Reached the maximum number of allowed errors while reading the input file (data/out/test.txt).
[CALL STACK]
> Microsoft::MSR::CNTK::IDataReader:: InitProposals
- Microsoft::MSR::CNTK::IDataReader:: InitProposals (x6)
- CreateCompositeDataReader (x5)
- Microsoft::MSR::CNTK::TracingGPUMemoryAllocator:: operator=
- std::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase>::enable_shared_from_this<Microsoft::MSR::CNTK::MatrixBase> (x2)
- Microsoft::MSR::CNTK::TracingGPUMemoryAllocator:: operator=
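For what it's worth, a RuntimeError like the one above ("Reached the maximum number of allowed errors while reading the input file") can occur when lines in the CTF file don't match the widths declared in the StreamDefs. A small stand-alone checker (my own sketch, not a CNTK utility) can count the values per field on each line; the default `expected` widths here are just the shapes printed earlier in this thread (features 15104, labels 11) and should be replaced with whatever input_dim and num_output_classes you actually use:

```python
# Hypothetical CTF-format sanity check: report lines whose field widths
# disagree with the declared StreamDef shapes.
def validate_ctf(path, expected={"labels": 11, "features": 15104}):
    bad = []
    with open(path) as f:
        for lineno, line in enumerate(f, 1):
            # Each CTF field looks like "|name v1 v2 ...": split on '|'
            # and drop anything before the first field marker.
            for chunk in line.strip().split("|")[1:]:
                name, *values = chunk.split()
                if name in expected and len(values) != expected[name]:
                    bad.append((lineno, name, len(values)))
    return bad
```

Running this over data/out/test.txt would pinpoint exactly which lines (if any) the deserializer is choking on.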
You cannot keep increasing the minibatch size; it is limited by your graphics card's VRAM. Anyway, I can't help you much further here: it's just a matter of debugging, and you know your own code better than I do.
Based on what I see here, your train_reader works with trainer.train_minibatch; does it work with trainer.test_minibatch? Can you also test whether test_reader works with trainer.train_minibatch and trainer.test_minibatch? You can debug from there.
No, it doesn't. It has been set to trainer.test_minibatch since the very beginning in the testing part; it has to be like that to test the model. What do you mean by keeping the minibatch size increasing? The next_minibatch argument needs a fixed minibatch_size value; I can't leave it empty.
If you keep increasing the minibatch size, your GPU will not have enough memory to do the computation.
Yeah, I know that. I'm using the values from CNTK's MNIST tutorial, which I previously ran with no problems. Anyway, you've already done a lot for me here, and I don't want to keep bothering you with this. I will continue checking things and see if I can fix it. Thank you very much for all your help :)
I had a similar problem with a similar script. Turned out the last batch was empty, so the test loop had one iteration too many.
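That failure mode is easy to see with a little arithmetic. The numbers below reuse the thread's minibatch size; the actual file size of 5000 is a made-up value purely for illustration:

```python
import math

minibatch_size = 512
planned_iters = 10000 // minibatch_size   # the script plans 19 iterations

# Suppose (hypothetically) the test file actually holds fewer samples
# than the script assumes:
actual_samples_in_file = 5000
available_batches = math.ceil(actual_samples_in_file / minibatch_size)

# Every planned iteration beyond the available batches gets an empty
# dict from next_minibatch (with max_sweeps=1), which is exactly what
# makes test_minibatch raise "function expects N arguments".
extra_calls = planned_iters - available_batches
```

With these numbers, 19 iterations are planned but only 10 batches exist, so the last 9 calls hand an empty minibatch to the trainer.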
Hi,
this error will not stop showing up, making it impossible to test the model. There are no problems with the evaluation part, but it's impossible to make the test part run. I'm using code very similar to the MNIST tutorial, CNN with MNIST Dataset.
Here's my train_test function:
And this is my do_train_test function:
And now, this is the error it yields after the model is trained successfully:
Please, how can I prevent this from happening? What am I doing wrong?
Thanks in advance