DigitalSlideArchive / HistomicsStream

A whole-slide image reader for TensorFlow
Apache License 2.0

Evaluate performance #102

Open Leengit opened 1 year ago

Leengit commented 1 year ago

In particular, are we leveraging the graph execution optimizations (e.g., parallelization, memory management, GPU usage) of TensorFlow and PyTorch, or do we need to do more to get them?
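For illustration, on the TensorFlow side the main levers in the input pipeline are the tf.data options for parallelism and prefetching. A minimal sketch, assuming a placeholder tile reader (not the actual histomics_stream reader):

```python
import tensorflow as tf

# Placeholder tile reader; the real histomics_stream reader is more involved.
def read_tile(tile_index):
    return tf.zeros([256, 256, 3], dtype=tf.float32)

tile_indices = tf.data.Dataset.range(1000)

dataset = (
    tile_indices
    .map(read_tile, num_parallel_calls=tf.data.AUTOTUNE)  # parallel tile reads
    .batch(32)
    .prefetch(tf.data.AUTOTUNE)  # overlap host-side reading with device compute
)
```

Whether the existing pipeline already sets these options is part of what would need to be evaluated.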

Leengit commented 1 year ago

@cooperlab says: look at the TF multi-worker strategy tutorial: https://www.tensorflow.org/tutorials/distribute/multi_worker_with_keras. We can help with this. Key questions are:

Leengit commented 1 year ago

TensorFlow does autosharding, so we shouldn't have to explicitly shard the tf.data.Dataset. We could add convenience functions that take care of details such as global_batch_size = num_workers * batch_size_per_worker; a sketch follows below.
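As a rough sketch of what such a convenience function might look like (the helper name and defaults are hypothetical, and I use strategy.num_replicas_in_sync as the replica count rather than a worker count):

```python
import tensorflow as tf

def distribute_dataset(dataset, strategy, batch_size_per_worker=32):
    """Hypothetical helper: batch by the global batch size and let
    TensorFlow's autosharding split the data across workers."""
    global_batch_size = batch_size_per_worker * strategy.num_replicas_in_sync
    dataset = dataset.batch(global_batch_size).prefetch(tf.data.AUTOTUNE)

    # Autosharding is on by default; making the policy explicit documents the intent.
    options = tf.data.Options()
    options.experimental_distribute.auto_shard_policy = (
        tf.data.experimental.AutoShardPolicy.AUTO
    )
    return dataset.with_options(options)
```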

If the user has already created a model and we want to convert or wrap it so that it behaves as if it had been created within a with strategy.scope(): Python block for some distribution strategy, can we do that after the fact? It might work to write the model to disk and then read it back in within a strategy scope block; I have asked on Stack Overflow about other possibilities.
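A minimal sketch of that save-and-reload idea, assuming a Keras model and a tf.distribute strategy (the function name and path are hypothetical, and whether this is sufficient in all cases is exactly the open question):

```python
import tensorflow as tf

def wrap_model_in_strategy(model, strategy, path="/tmp/model_for_strategy"):
    """Hypothetical after-the-fact wrapper: serialize an already-built model,
    then reload it inside the strategy's scope so that its variables are
    recreated as distributed variables."""
    model.save(path)
    with strategy.scope():
        distributed_model = tf.keras.models.load_model(path)
    return distributed_model
```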