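Hello! I've found a performance issue in tensorlayer/examples: batch() should be called before map(). With that order the mapped function runs once per batch instead of once per element, which reduces per-call overhead and could make your program more efficient. Here is the TensorFlow documentation to support it.
A detailed description is listed below: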
examples/quantized_net/tutorial_binarynet_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/quantized_net/tutorial_binarynet_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/quantized_net/tutorial_dorefanet_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/quantized_net/tutorial_dorefanet_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/quantized_net/tutorial_quanconv_cifar10.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/quantized_net/tutorial_quanconv_cifar10.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/quantized_net/tutorial_ternaryweight_cifar10_tfrecord.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/quantized_net/tutorial_ternaryweight_cifar10_tfrecord.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/data_process/tutorial_fast_affine_transform.py: dataset = dataset.batch(batch_size)(here) should be called before dataset = dataset.map(_map_fn, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/data_process/tutorial_tf_dataset_voc.py: ds = ds.batch(batch_size)(here) should be called before ds = ds.map(_map_fn, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/basic_tutorials/tutorial_cifar10_cnn_static.py: train_ds = train_ds.batch(batch_size)(here) should be called before train_ds = train_ds.map(_map_fn_train, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/basic_tutorials/tutorial_cifar10_cnn_static.py: test_ds = test_ds.batch(batch_size)(here) should be called before test_ds = test_ds.map(_map_fn_test, num_parallel_calls=multiprocessing.cpu_count())(here).
examples/deprecated_tutorials/tutorial_imagenet_inceptionV3_distributed.py: dataset = dataset.batch(batch_size)(here) should be called before dataset = dataset.map(_map_fn, num_parallel_calls=max_cpus)(here).
Besides, you need to check whether the function passed to map() (e.g., _map_fn in dataset.map()) is affected by the change, so that the modified code still works properly. For example, if _map_fn takes input of shape (x, y, z) before the fix, it will receive input of shape (batch_size, x, y, z) afterwards.
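To make the shape change concrete, here is a minimal sketch of the reordering. It is not taken from the examples themselves: the random tensors stand in for CIFAR-10 images, and this _map_fn is a hypothetical function that only rescales pixels, chosen because that operation works unchanged with or without a leading batch axis.

```python
import multiprocessing

import tensorflow as tf

# Hypothetical stand-in data: 8 random "images" of shape (32, 32, 3).
images = tf.random.uniform((8, 32, 32, 3), maxval=255.0)
labels = tf.range(8, dtype=tf.int64)
batch_size = 4


def _map_fn(img, label):
    # Element-wise rescaling; valid for (32, 32, 3) and (batch_size, 32, 32, 3).
    img = tf.cast(img, tf.float32) / 255.0
    return img, label


# Current order: _map_fn is called once per element, then results are batched.
map_then_batch = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .map(_map_fn, num_parallel_calls=multiprocessing.cpu_count())
    .batch(batch_size)
)

# Proposed order: elements are batched first, so _map_fn is called once per batch.
batch_then_map = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .batch(batch_size)
    .map(_map_fn, num_parallel_calls=multiprocessing.cpu_count())
)

for imgs, lbls in batch_then_map:
    # _map_fn received a whole batch here, not a single image.
    print(imgs.shape, lbls.shape)
```

For a function like this one, both orders yield the same batches. A _map_fn that uses per-image operations (for example, a random crop) would likely need to be rewritten for batched input, which is the check described above.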
Looking forward to your reply. Btw, I would be very glad to create a PR to fix it if you are too busy.