Open 7pandeys opened 8 months ago
Hi @7pandeys, thanks for raising the issue.
The Resources configuration section on the page you've linked has exactly the information about using GPUs. Running kedro vertexai init also creates the initial vertexai.yml, which contains an example configuration for nodes with GPUs on Vertex AI.
We're open to improvements on that part - what do you propose?
Config generated by kedro vertexai init
@marrrcin thanks for the response. Is there a specific parameter or syntax that lets us specify the machine type or CPU type in the vertexai.yml configuration? If not, what would be the recommended approach to achieve this?
Related links https://cloud.google.com/compute/docs/cpu-platforms https://cloud.google.com/compute/docs/machine-resource
Follow this guide, our plugin is fully compatible with this approach: https://cloud.google.com/vertex-ai/docs/pipelines/machine-types
I don't understand your questions. You can configure machine types as you want in vertexai.yml - the configuration in the plugin exposes the configuration available in native Vertex AI. That means that whatever you define in the vertexai.yml configuration file will be used by the plugin to set the appropriate CPU/memory/GPU resources + node selectors on the Vertex AI side. You don't have to use KFP directly.
resources:
  # For nodes that require more RAM you can increase the "memory"
  data_import_step:
    memory: 4Gi
  # Training nodes can utilize more than one CPU if the algorithm
  # supports it
  model_training:
    cpu: 8
    memory: 8Gi
    gpu: 1
  # Default settings for the nodes
  __default__:
    cpu: 1000m
    memory: 2048Mi
node_selectors:
  model_training:
    cloud.google.com/gke-accelerator: NVIDIA_TESLA_T4
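To make the intent of the configuration above concrete, here is a minimal Python sketch of how per-node settings could be resolved. It is not the plugin's code: the helper name effective_settings is hypothetical, and the key-by-key overlay of a node's entry on top of __default__ is an assumption about the intended semantics, so check the plugin's config.py for the real behavior.

```python
from typing import Dict

# Plain-dict mirror of the "resources" and "node_selectors" sections above.
RESOURCES: Dict[str, Dict[str, object]] = {
    "data_import_step": {"memory": "4Gi"},
    "model_training": {"cpu": 8, "memory": "8Gi", "gpu": 1},
    "__default__": {"cpu": "1000m", "memory": "2048Mi"},
}
NODE_SELECTORS: Dict[str, Dict[str, str]] = {
    "model_training": {"cloud.google.com/gke-accelerator": "NVIDIA_TESLA_T4"},
}


def effective_settings(node_name: str) -> dict:
    """Hypothetical helper: overlay a node's entry on top of __default__.

    Assumption: node-specific keys override the defaults key by key, and
    nodes without their own entry fall back to __default__ entirely.
    """
    merged = dict(RESOURCES.get("__default__", {}))
    merged.update(RESOURCES.get(node_name, {}))
    return {
        "resources": merged,
        "node_selectors": dict(NODE_SELECTORS.get(node_name, {})),
    }


if __name__ == "__main__":
    for name in ("data_import_step", "model_training", "some_other_node"):
        print(name, effective_settings(name))
```

Under these assumptions, model_training ends up with 8 CPUs, 8Gi of memory, one GPU and the T4 accelerator selector, while a node with no entry of its own gets only the __default__ values and no selector.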
I suggest you try to configure our plugin first, then see whether it works for you and whether it matches your requirements on that part.
Problem: I'm encountering difficulty in defining the CPU and GPU machine types with respect to nodes and pipelines in vertexai.yml within the Kedro-Vertex framework.
Expected Behavior: I expect to be able to specify the CPU and GPU machine types for nodes and pipelines in vertexai.yml to effectively utilize CPU and GPU resources as needed.
Current Behavior: I've searched through the documentation and codebase but haven't found clear instructions on how to achieve this. This makes it challenging to optimize the resource utilization for my specific workflow.
Steps to Reproduce:
Additional Information:
Environment:
Suggested Solution: It would be helpful to provide more detailed documentation or examples on how to define CPU and GPU machine types for nodes and pipelines in vertexai.yml. Alternatively, if this feature is not yet supported, it would be great to know the current status and any workarounds.
Related links https://github.com/getindata/kedro-vertexai/blob/develop/kedro_vertexai/config.py https://kedro-vertexai.readthedocs.io/en/0.9.1/source/02_installation/02_configuration.html
Notes: vertexai.yml is generated by the command kedro vertexai init
This issue aims to improve resource management and clarity within Kedro-Vertex, making it easier for users to define CPU and GPU machine types for their nodes and pipelines. Your attention to this matter is greatly appreciated.