bazelbuild / bazel

a fast, scalable, multi-language and extensible build system
https://bazel.build
Apache License 2.0
23.21k stars 4.06k forks source link

`resource_set` should allow specify the resource dict directly. #15187

Open cloudhan opened 2 years ago

cloudhan commented 2 years ago

Description of the problem / feature request:

Feature Request for directly specify the resoure requirement via resource_set

Feature requests: what underlying problem are you trying to solve with this feature?

resource_set support for actions.run and actions.run_shell is added in https://github.com/bazelbuild/bazel/commit/d7f0724b6b91b6c57039a1634ff00ccebd872714

However, per the comment https://github.com/bazelbuild/bazel/issues/6477#issuecomment-1065340904 https://github.com/bazelbuild/bazel/issues/6477#issuecomment-1077831781 and https://github.com/bazelbuild/bazel/issues/6477#issuecomment-1083277142. The API is not flexible enough.

Firstly, the callbale only receive a input_size on calling-back. For example, rules_foreign_cc can invoke underlying make command, with the -j option. The -j option for make command is not dependent on input_size in any way. The rule author of rules_foreign_cc can just use a magic number, say -j8 here. Or get the cpu count info from somewhere else. But the info cannot be determined from the input_size in any means. And there are many more similar cases that the parallelism cannot be simply infered from input_size

Secondly, the API receive a callable, which is weird in Bazel, since Bazel is not designed around high order function at first.

Thirdly, the callable can only be top-level def-ed function, which is very limited.

So it is suggested to simply accept a ResourceSetInfo or dict at starlark api actions.run and actions.run_shell

ctx.actions.run(
    outputs = ...,
    executable = ...,
    resource_set = ResourceSetInfo(
        cpu = 8,
        memory = 512,
        ...,
    ),
)

And leave the responsibility of inferencing the number and constructing the ResourceSetInfo to the rule authors to maximize the rule authoring flexibility.

meisterT commented 2 years ago

cc @wilwell

meisterT commented 2 years ago

Also cc @brandjon because of potential Starlark API changes

fmeum commented 1 year ago

@meisterT @wilwell @brandjon Have you been able to think more about this request? rules_go would benefit from having resource_set stabilized in either form, so I would be willing to invest time in whatever is needed to move this forward.

fmeum commented 1 year ago

For most actions I have worked on more closely, especially JVM/Go, resource requirements are mostly influenced by direct dependencies (known at analysis time) rather than the size of the inputs depset. Large SDKs could also lead to overestimated resource consumptions when basing the computation on the size of inputs alone.

Passing in a plain dictionary, one would lose the ability to factor the host OS into the computation. In case this feature is already being used at Google, do you see noticeable differences between platforms?

Realyral commented 1 year ago

Copy

On Fri, Dec 2, 2022, 11:46 AM Fabian Meumertzheim @.***> wrote:

For most actions I have worked on more closely, especially JVM/Go, resource requirements are mostly influenced by direct dependencies (known at analysis time) rather than the size of the inputs depset. Large SDKs could also lead to overestimated resource consumptions when basing the computation on the size of inputs alone.

Passing in a plain dictionary, one would lose the ability to factor the host OS into the computation. In case this feature is already being used at Google, do you see noticeable differences between platforms?

— Reply to this email directly, view it on GitHub https://github.com/bazelbuild/bazel/issues/15187#issuecomment-1335521698, or unsubscribe https://github.com/notifications/unsubscribe-auth/A4INSZT6GRA4PHRCLLDJST3WLIRWPANCNFSM5SU3XC3Q . You are receiving this because you are subscribed to this thread.Message ID: @.***>

eric-skydio commented 1 year ago

The workaround I'm currently using is as follows, which is a terrible ugly hack. Please fix this.

# This is a terrible hack to get around the fact that we can't set resources dynamically until
# https://github.com/bazelbuild/bazel/issues/15187 is fixed. I hate this, but it's the best I can think of.
def _1_cpu(os, num_inputs):
    return {"cpu": 1}

def _2_cpu(os, num_inputs):
    return {"cpu": 2}

def _4_cpu(os, num_inputs):
    return {"cpu": 4}

def _8_cpu(os, num_inputs):
    return {"cpu": 8}

def _get_resource_set(jobs):
    """Returns a resource set for the given number of jobs, underestimating if needed."""
    if jobs < 2:
        return _1_cpu
    elif jobs < 4:
        return _2_cpu
    elif jobs < 8:
        return _4_cpu
    else:
        return _8_cpu
github-actions[bot] commented 4 months ago

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

cloudhan commented 4 months ago

not stale

dgoldstein0 commented 1 month ago

why not both? support functions or some sort of direct dict or object