Closed · danieldk closed this issue 5 years ago
Disable mixed precision in the tagger (as opposed to the trainer), since we usually run prediction on CPUs anyway.
Sounds like the least invasive solution. It may be interesting to test whether there are differences between MP and no MP when tagging on a GPU.
Alternatively, we could write an inference graph that strips all training-related parts, which would also help to decouple train-binary-graph compatibility from tag-binary-graph compatibility (https://github.com/stickeritis/sticker/issues/144#issuecomment-545363205).
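To illustrate what stripping the training parts amounts to: only the nodes reachable from the inference outputs need to survive, so the optimizer, gradient, and loss ops fall away automatically. This is a toy sketch in plain Python (the function and node names are hypothetical, not from sticker; in TensorFlow the equivalent operation is `tf.compat.v1.graph_util.extract_sub_graph`):

```python
def extract_inference_subgraph(deps, outputs):
    """Keep only the nodes reachable from the inference outputs.

    `deps` maps each node name to the list of node names it consumes.
    Training-only nodes (optimizer, gradients, loss) are dropped because
    no inference output depends on them.
    """
    keep, stack = set(), list(outputs)
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        stack.extend(deps.get(node, []))
    return keep


# Toy graph: "train_op" and "loss" are training-only.
deps = {
    "logits": ["dense"],
    "dense": ["embeddings"],
    "loss": ["logits", "labels"],
    "train_op": ["loss"],
}
print(extract_inference_subgraph(deps, ["logits"]))
# → {'logits', 'dense', 'embeddings'}
```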
We now always enable auto mixed precision when the graph's metadata has this option set. However, with CPU prediction, enabling auto mixed precision makes prediction very slow, even though TensorFlow reports that it does not actually enable mixed precision (it is not supported on CPUs).
From a cursory look at htop, the processes appear to become single-threaded.
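The least invasive fix could be a small decision helper that only honors the graph's mixed-precision flag when a GPU is actually available. This is a hypothetical sketch (the function and parameter names are assumptions, not sticker's API):

```python
def should_enable_mixed_precision(graph_requests_amp: bool, gpu_available: bool) -> bool:
    """Decide whether to enable TensorFlow's auto mixed precision rewrite.

    Auto mixed precision is only supported on GPUs; leaving the graph
    rewrite enabled on CPU-only hosts has been observed to make
    prediction very slow (the process appears to become single-threaded),
    so the graph's metadata flag is only honored when a GPU is present.
    """
    return graph_requests_amp and gpu_available


# Graph asks for AMP, but we are predicting on CPU: keep it off.
print(should_enable_mixed_precision(True, False))
# → False
```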
Possible solutions:
@twuebi any opinions?