Hello! I've found a performance issue in dataset.py: `dataset = dataset.batch(batch_size)` (here) should be called before `dataset = dataset.map(_parse, num_parallel_calls=AUTOTUNE)` (here), which would make your program more efficient. The TensorFlow tf.data performance guide ("Vectorizing mapping") recommends this ordering.
Besides, you need to check whether the function `_parse` (here) called in `dataset.map()` is affected, to make the changed code work properly. For example, if `_parse` expected input of shape (x, y, z) before the fix, it will now receive data of shape (batch_size, x, y, z).
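To illustrate the shape change described above, here is a minimal NumPy sketch (TensorFlow is not required for the illustration; the function names, `BATCH_SIZE`, and the sample shape are hypothetical stand-ins for whatever your `_parse` actually uses). It shows that a per-sample `_parse` must be made batch-aware once `map()` runs after `batch()`:

```python
import numpy as np

BATCH_SIZE = 4
SAMPLE_SHAPE = (8, 8, 3)  # hypothetical (x, y, z)

def _parse_per_sample(record):
    # Before the fix: called once per sample, input shape (x, y, z).
    assert record.shape == SAMPLE_SHAPE
    return record / 255.0

def _parse_batched(records):
    # After the fix: map() runs after batch(), so the input carries an
    # extra leading batch dimension: (batch_size, x, y, z).
    assert records.shape == (BATCH_SIZE,) + SAMPLE_SHAPE
    return records / 255.0  # elementwise ops vectorize over the batch for free

rng = np.random.default_rng(0)
samples = rng.integers(0, 256, size=(BATCH_SIZE,) + SAMPLE_SHAPE).astype(np.float32)

# Old order: parse each sample individually, then stack into a batch.
old = np.stack([_parse_per_sample(s) for s in samples])
# New order: batch first, then one vectorized parse call over the whole batch.
new = _parse_batched(samples)

assert old.shape == new.shape == (BATCH_SIZE,) + SAMPLE_SHAPE
assert np.allclose(old, new)
```

Because this `_parse` only uses elementwise operations, it works unchanged on the batched input; a `_parse` that indexes or reshapes per-sample dimensions would need to be rewritten to account for the leading batch axis.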
Looking forward to your reply. By the way, I'd be glad to open a PR to fix it if you are too busy.