fastmachinelearning / SonicCMS

Services for Optimized Network Inference on Coprocessors (for CMS)

Initialize local CPU server if remote server not available #12

Closed by kpedro88 3 years ago

kpedro88 commented 4 years ago

Initializing a local CPU server when the requested remote server is not available can serve as a fallback option. Open questions:

  1. At what stage should the availability check and the local server initialization take place? (Python config, C++ module initialization, ...)
  2. How can the number of threads used by the local server be controlled? See https://github.com/triton-inference-server/server/issues/2018

If a local CPU server is used, SonicTriton clients should be forced to use Sync mode to prevent contention between CMSSW and the local server; otherwise both would try to use the same CPU threads.
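A rough sketch of the fallback decision, assuming the check is done at the Python level with a plain TCP connection test. The function names, the returned dictionary keys, and the `requested_mode` parameter are hypothetical and not part of SonicTriton or CMSSW; port 8001 is assumed here as the local server's gRPC port. Actually launching the local server is left out.

```python
import socket


def server_is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def choose_server(remote_host, remote_port, requested_mode="Async",
                  fallback_port=8001, timeout=1.0):
    """Pick the remote server if it answers, else describe a local CPU fallback.

    All names here are hypothetical; this only models the decision,
    it does not start any server.
    """
    if server_is_reachable(remote_host, remote_port, timeout):
        return {
            "host": remote_host,
            "port": remote_port,
            "local_fallback": False,
            "mode": requested_mode,  # keep whatever the client asked for
        }
    return {
        "host": "localhost",
        "port": fallback_port,
        "local_fallback": True,
        # Force Sync so the client and the local server do not
        # compete for the same threads.
        "mode": "Sync",
    }
```

A TCP connection test only shows that something is listening on the port; it does not confirm that the server is healthy or that the requested model is loaded, so a real implementation would likely need a stronger check.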

Assigned to: @kpedro88

kpedro88 commented 3 years ago

See https://github.com/cms-sw/cmssw/compare/master...kpedro88:TritonService for a functioning draft version of TritonService with automatic fallback support. Notes:

kpedro88 commented 3 years ago

draft PR now submitted: https://github.com/cms-sw/cmssw/pull/32576

kpedro88 commented 3 years ago

minor followup: https://github.com/cms-sw/cmssw/pull/32861