mila-iqia / fuel

A data pipeline framework for machine learning
MIT License

Potential bug with DataStream.default_stream #330

Closed: yiulau closed this issue 8 years ago

yiulau commented 8 years ago

The iteration_scheme attribute is lost when a stream is created with DataStream.default_stream instead of the plain DataStream constructor. Example:

data_stream = DataStream(mnist, iteration_scheme=SequentialScheme(mnist.num_examples, batch_size=256))
data_stream.__dict__

{'_fresh_state': True,
 'axis_labels': {u'features': (u'batch', u'channel', u'height', u'width'),
  u'targets': (u'batch', u'index')},
 'data_state': None,
 'dataset': <fuel.datasets.mnist.MNIST at 0x7fbe5c0ed810>,
 'iteration_scheme': <fuel.schemes.SequentialScheme at 0x7fbe5c1caad0>}

With default_stream, the same arguments give:

data_stream = DataStream.default_stream(mnist, iteration_scheme=SequentialScheme(mnist.num_examples, batch_size=256))
data_stream.__dict__

{'_produces_examples': False,
 'axis_labels': {u'features': (u'batch', u'channel', u'height', u'width'),
  u'targets': (u'batch', u'index')},
 'data_stream': <fuel.transformers.ScaleAndShift at 0x7fbe5c0bb110>,
 'dtype': 'float32',
 'iteration_scheme': None,
 'which_sources': ('features',)}

The fuel version is 0.2.0.

This is a problem when using blocks.extensions.ProgressBar, which queries the num_batches, num_examples and batch_size attributes from the IterationScheme (quoting from the blocks documentation).
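
Judging from the two dumps above, the object returned by default_stream appears to be a transformer that keeps the stream it wraps in its data_stream attribute, while its own iteration_scheme is None. A possible workaround is to walk that chain until a scheme turns up. The sketch below uses stand-in classes instead of fuel itself, so FakeStream and find_iteration_scheme are hypothetical names, and it assumes the innermost stream still holds the scheme that was passed in:

```python
class FakeStream(object):
    """Hypothetical stand-in for a stream or transformer; not part of fuel."""

    def __init__(self, iteration_scheme=None, data_stream=None):
        self.iteration_scheme = iteration_scheme
        self.data_stream = data_stream


def find_iteration_scheme(stream):
    """Follow the wrapped `data_stream` chain; return the first non-None scheme."""
    while stream is not None:
        scheme = getattr(stream, 'iteration_scheme', None)
        if scheme is not None:
            return scheme
        # Objects without a `data_stream` attribute end the walk.
        stream = getattr(stream, 'data_stream', None)
    return None


scheme = object()  # placeholder for a SequentialScheme instance
base = FakeStream(iteration_scheme=scheme)
# Two wrapping layers, mimicking transformers stacked on top of the base stream.
wrapped = FakeStream(data_stream=FakeStream(data_stream=base))

found = find_iteration_scheme(wrapped)
```

Here wrapped.iteration_scheme is None, as in the second dump, but found is the scheme attached to the innermost stream. Whether ProgressBar could be pointed at a scheme recovered this way is a separate question that this sketch does not answer.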

bartvm commented 8 years ago

Duplicate of https://github.com/mila-udem/fuel/issues/10 (also discussed in https://github.com/mila-udem/blocks/pull/549 and https://github.com/mila-udem/blocks/issues/948)