Describe the bug
The current cluster configuration has been well tested on an NVIDIA V100 GPU and on a typical segmentation workflow. However, depending on the model and the hardware used in future clusters, there are a few settings that may need to be tweaked.
tf-serving
MAX_BATCH_SIZE: The maximum number of batches that tf-serving will process in a given duty cycle. If the job is using very large input tensors, this batch size may need to be decreased.
MAX_ENQUEUED_BATCHES: The number of batches that will sit in the work queue waiting to be processed. If the requests have a very large payload, tf-serving may be evicted due to memory issues, and this parameter should be decreased. See the sketch below for how these two settings fit together.
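For reference, here is a minimal sketch of how these two values might be templated into TensorFlow Serving's batching parameters file (passed via --enable_batching --batching_parameters_file). The env-var names, defaults, output path, and the num_batch_threads / batch_timeout_micros values are assumptions for illustration, not the cluster's actual defaults:

```python
import os

# Assumed env-var names and fallback values; the real values come from the cluster config.
max_batch_size = int(os.getenv("MAX_BATCH_SIZE", "64"))
max_enqueued_batches = int(os.getenv("MAX_ENQUEUED_BATCHES", "128"))

# TensorFlow Serving reads these as a text-format BatchingParameters proto.
batching_config = (
    "max_batch_size { value: %d }\n"
    "max_enqueued_batches { value: %d }\n"
    "batch_timeout_micros { value: 0 }\n"  # assumed: do not wait to fill a batch
    "num_batch_threads { value: 4 }\n"     # assumed thread count
) % (max_batch_size, max_enqueued_batches)

# Assumed output path; point --batching_parameters_file at this file when starting tf-serving.
with open("batching_config.txt", "w") as f:
    f.write(batching_config)
```

Lowering max_batch_size shrinks the largest tensor tf-serving will assemble per duty cycle, while lowering max_enqueued_batches caps how much queued work (and memory) can pile up behind it.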
redis-consumer
TF_MAX_BATCH_SIZE: The number of batches to send to the model server. This value MUST be less than or equal to MAX_BATCH_SIZE above and may need to be altered for future workflows.
GRPC_TIMEOUT: The length of time to wait for a gRPC inference request. If a model's inference time is quite slow, this may need to be adjusted to prevent timeouts. A sketch of how both consumer settings are used follows below.
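A minimal sketch of how TF_MAX_BATCH_SIZE and GRPC_TIMEOUT might be used when the consumer issues gRPC Predict requests. The host, model name, and input tensor name are assumptions for illustration; the real logic lives in redis-consumer:

```python
import os

import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Assumed env-var names and fallback values.
TF_MAX_BATCH_SIZE = int(os.getenv("TF_MAX_BATCH_SIZE", "32"))  # must be <= MAX_BATCH_SIZE
GRPC_TIMEOUT = float(os.getenv("GRPC_TIMEOUT", "30"))          # seconds


def predict(images, host="tf-serving:8500", model_name="segmentation"):
    """Send `images` (an N x H x W x C array) to tf-serving in TF_MAX_BATCH_SIZE chunks."""
    channel = grpc.insecure_channel(host)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    results = []
    for start in range(0, len(images), TF_MAX_BATCH_SIZE):
        batch = images[start:start + TF_MAX_BATCH_SIZE]
        request = predict_pb2.PredictRequest()
        request.model_spec.name = model_name
        request.inputs["image"].CopyFrom(tf.make_tensor_proto(batch))  # assumed input name
        # GRPC_TIMEOUT bounds each inference call; slow models need a larger
        # value here to avoid DEADLINE_EXCEEDED errors.
        response = stub.Predict(request, timeout=GRPC_TIMEOUT)
        results.append(response)
    return results
```

Keeping TF_MAX_BATCH_SIZE at or below MAX_BATCH_SIZE ensures the consumer never sends a request larger than what tf-serving's batcher is configured to accept.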
Additional context
For more notes on the interplay between these settings and the hardware itself, please review this related issue.