kleveross / klever-model-registry

Cloud Native Machine Learning Model Registry
https://kleveross.github.io/klever-model-registry/api/
Apache License 2.0

[feature request] Estimate resource requirement for the model. #26

Open gaocegege opened 4 years ago

gaocegege commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST?:

/kind feature

What happened:

Model servers such as TFServing have a mechanism for estimating a model's resource requirements:

Status EstimateResourceFromPath(const string& path, FileProbingEnv* env,
                                ResourceAllocation* estimate) {
  if (env == nullptr) {
    return errors::Internal("FileProbingEnv not set");
  }

  std::vector<string> descendants;
  TF_RETURN_IF_ERROR(GetAllDescendants(path, env, &descendants));
  uint64 total_file_size = 0;
  for (const string& descendant : descendants) {
    if (!(env->IsDirectory(descendant).ok())) {
      uint64 file_size;
      TF_RETURN_IF_ERROR(env->GetFileSize(descendant, &file_size));
      total_file_size += file_size;
    }
  }
  const uint64 ram_requirement =
      total_file_size * kResourceEstimateRAMMultiplier +
      kResourceEstimateRAMPadBytes;

  ResourceAllocation::Entry* ram_entry = estimate->add_resource_quantities();
  Resource* ram_resource = ram_entry->mutable_resource();
  ram_resource->set_device(device_types::kMain);
  ram_resource->set_kind(resource_kinds::kRamBytes);
  ram_entry->set_quantity(ram_requirement);

  return Status::OK();
}

We can refer to this design to support resource requirement recommendations for users.

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

gaocegege commented 4 years ago

kResourceEstimateRAMMultiplier = 1.2

Thus TFServing simply estimates the RAM requirement as the total size of the model files * 1.2, plus kResourceEstimateRAMPadBytes.
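
For reference, here is a minimal standalone sketch of the same heuristic, written with C++17 `std::filesystem` instead of TFServing's `FileProbingEnv`. It is an illustration, not TFServing's code: the names `EstimateRamBytes`, `TotalFileSize`, `kRamMultiplier`, and `kRamPadBytes` are hypothetical. The 1.2 multiplier comes from the comment above, and the pad value of 0 is a placeholder assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Multiplier quoted in this thread (kResourceEstimateRAMMultiplier = 1.2).
constexpr double kRamMultiplier = 1.2;
// Placeholder assumption: the pad value is not stated in this thread.
constexpr std::uint64_t kRamPadBytes = 0;

// Applies the heuristic: total_file_size * multiplier + pad_bytes.
// The scaled size is truncated to whole bytes before the pad is added.
std::uint64_t EstimateRamBytes(std::uint64_t total_file_size,
                               double multiplier = kRamMultiplier,
                               std::uint64_t pad_bytes = kRamPadBytes) {
  assert(multiplier >= 0.0);
  const double scaled = static_cast<double>(total_file_size) * multiplier;
  return static_cast<std::uint64_t>(scaled) + pad_bytes;
}

// Sums the sizes of all regular files under `root`, recursively.
// Returns false and leaves the error in `ec` if the tree cannot be read.
bool TotalFileSize(const fs::path& root, std::uint64_t* total,
                   std::error_code& ec) {
  *total = 0;
  fs::recursive_directory_iterator it(root, ec);
  const fs::recursive_directory_iterator end;
  while (!ec && it != end) {
    const bool is_file = it->is_regular_file(ec);
    if (ec) break;
    if (is_file) {
      const std::uintmax_t size = it->file_size(ec);
      if (ec) break;
      *total += size;
    }
    it.increment(ec);
  }
  return !ec;
}
```

A caller would first run `TotalFileSize(model_dir, &total, ec)` and, on success, pass `total` to `EstimateRamBytes`. With the values above, a 1000-byte model directory yields an estimate of 1200 bytes. The registry could surface that number as a suggested memory request when a model is deployed.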