googleapis / google-cloud-java

Google Cloud Client Library for Java
https://cloud.google.com/java/docs/reference
Apache License 2.0

Batch - Support setting the working directory #10670

Closed loicmathieu closed 1 week ago

loicmathieu commented 3 months ago

Is your feature request related to a problem? Please describe. When you create a Batch job, there is no way to set the working directory; the documentation says to cd WORKING_DIRECTORY. It would be convenient, especially when working with volumes, to be able to set the working directory to the volume mount path.

Describe the solution you'd like We should be able to set the working directory of the task runnable, or at least of the container when the runnable is a container.

runnableBuilder.setWorkingDirectory(MOUNT_PATH);

// OR

containerBuilder.setWorkingDirectory(MOUNT_PATH);

Describe alternatives you've considered Using cd as the first command, as the documentation explains.
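
The cd alternative mentioned above could be sketched roughly as follows. This is only an illustration, not part of the Batch SDK: the class WorkdirCommand, its methods, and the mount path are hypothetical, and it assumes the container image ships a POSIX /bin/sh, which arbitrary user-supplied images may not.

```java
import java.util.List;

/** Sketch: emulate a working directory by prefixing the container command with cd. */
final class WorkdirCommand {

    private WorkdirCommand() {}

    /** Quotes a value for POSIX sh by wrapping it in single quotes. */
    static String shellQuote(String value) {
        // A single quote inside the value is closed, escaped, and reopened.
        return "'" + value.replace("'", "'\\''") + "'";
    }

    /**
     * Returns arguments for an sh entrypoint that change into workDir
     * before running the user's script.
     */
    static List<String> inWorkingDirectory(String workDir, String script) {
        return List.of("-c", "cd " + shellQuote(workDir) + " && " + script);
    }

    public static void main(String[] args) {
        // Hypothetical mount path; a real job would use its own volume mount path.
        String mountPath = "/mnt/disks/job-data";
        List<String> commands = inWorkingDirectory(mountPath, "ls -la");
        // Assumption: with google-cloud-batch these arguments would be passed to the
        // container runnable builder, with /bin/sh as the entrypoint.
        System.out.println(commands);
    }
}
```

The script itself is passed through unquoted on purpose, since it is meant to be interpreted by the shell; only the directory is quoted.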

suztomo commented 3 months ago

What Maven artifact do you use?

the documentation says

Would you share URL of the document?

loicmathieu commented 3 months ago

What Maven artifact do you use?

We're using com.google.cloud:google-cloud-batch.

The documentation I use is here: https://cloud.google.com/batch/docs/create-run-job-storage#console It doesn't explain how to change the current working directory in Java; it only gives a general remark: image

suztomo commented 3 months ago

Checking

suztomo commented 3 months ago

That "cd" command is just there to demonstrate how to create a file on the persistent disk. In fact, in the gcloud example, the full path to the file is specified without the "cd" command:

image

Therefore, you shouldn't be blocked by the lack of a working directory setting.

loicmathieu commented 3 months ago

Yes, I realize that. What I'm asking for is the ability to set the working directory via the Java SDK.

I only refer to the cd command in the docs to explain what I want to avoid: using cd or a direct reference to the working directory.

We mount a volume in the Batch job. This volume is created from a bucket in which we create a random folder for each job, and we use this random folder as the mount path. Because we upload files to the bucket before launching the job, it would be more convenient to set the job's working directory to the mount path.
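
The per-job folder setup described above might look roughly like this. It is a sketch under assumptions: the class JobWorkspace, the bucket name, and the /mnt/disks/ prefix are hypothetical and are not taken from our actual runner or from the Batch SDK.

```java
import java.util.UUID;

/** Sketch: derive a per-job bucket folder and the matching mount path. */
final class JobWorkspace {
    /** "<bucket>/<folder>": the bucket location the volume is created from. */
    final String remotePath;
    /** Where the folder is mounted inside the job; also the desired working directory. */
    final String mountPath;

    JobWorkspace(String bucket, String folder) {
        this.remotePath = bucket + "/" + folder;
        // Assumed mount prefix; any absolute path chosen by the runner would do.
        this.mountPath = "/mnt/disks/" + folder;
    }

    /** Creates a workspace with a random folder name, one per job. */
    static JobWorkspace createRandom(String bucket) {
        return new JobWorkspace(bucket, UUID.randomUUID().toString());
    }

    public static void main(String[] args) {
        // Hypothetical bucket name.
        JobWorkspace workspace = createRandom("my-bucket");
        System.out.println(workspace.remotePath + " -> " + workspace.mountPath);
    }
}
```

Since the mount path is only known at job creation time, it cannot be baked into an image ahead of time, which is why a setting on the job itself would help.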

We have a generic task runner that can run arbitrary scripts (shell, Python, Julia, ...) on many different platforms, including Google Cloud, so we need to offer a consistent way of working across all environments. Users of our task runner don't know anything about Google Batch or any other supported runner.

suztomo commented 3 months ago

a generic task runner

I'd like to know more about the abstraction. Is that Docker containers (e.g., https://cloud.google.com/batch/docs/samples/batch-create-container-job)?

loicmathieu commented 3 months ago

Yes, we use container jobs.

The abstractions our users work with are very high level: they describe a task in YAML and choose a runner (Docker, k8s, Google Batch, AWS Batch, ...), and we create the needed resources on the target runner.

suztomo commented 3 months ago

If you use Docker containers, doesn't WORKDIR (https://docs.docker.com/reference/dockerfile/#workdir) solve your problem?

loicmathieu commented 3 months ago

I don't understand. I'm using the Batch Java SDK, not the Docker SDK. There is no way to set the working directory via the Batch Java SDK, which is why I opened this issue.

suztomo commented 3 months ago

(Re-reading your comments) Am I right that this randomness is the reason why the Dockerfile's WORKDIR does not work for your case?

We mount a volume in the Batch job. This volume is created from a bucket in which we create a random folder for each job, and we use this random folder as the mount path. Because we upload files to the bucket before launching the job, it would be more convenient to set the job's working directory to the mount path.

loicmathieu commented 3 months ago

We run arbitrary Docker images that the user can supply, so we cannot rely on the Dockerfile's WORKDIR. Even if we could, we create a random directory (we should be able to overcome this, but the first reason would still stand).

loicmathieu commented 3 months ago

For the record, Cloud Run allows setting a working directory from the SDK when creating a container.

suztomo commented 3 months ago

Can I get the URL you read for Cloud Run's case?

loicmathieu commented 3 months ago

https://github.com/googleapis/google-cloud-java/blob/02d2b5eab0a9cef7f4db703d6c4a1d7577868108/java-run/proto-google-cloud-run-v2/src/main/java/com/google/cloud/run/v2/Container.java#L3627

suztomo commented 3 months ago

Thank you for the reference. I explored the API definition but didn't find a similar property in the Batch API (https://cloud.google.com/batch/docs/reference/rest/v1/projects.locations.jobs#Job). Unfortunately, this problem cannot be solved by the Java SDK (this repository). Would you send the feature request for setting the working directory through https://cloud.google.com/batch/docs/get-started#get-support?

To provide any feedback or feature requests for Batch, ... For all other feedback about Batch, select "Product feedback."

loicmathieu commented 3 months ago

Done: https://issuetracker.google.com/issues/336164416

mpeddada1 commented 1 week ago

Thank you for the link! Closing this issue in this repo since it is dependent on https://issuetracker.google.com/issues/336164416.