DataBiosphere / dsub

Open-source command-line tool to run batch computing tasks and workflows on backend services such as Google Cloud.
Apache License 2.0

Multi-region support google-batch #280

Open rivershah opened 11 months ago

rivershah commented 11 months ago

google-cls-v2 has powerful multi-region support through wildcard matching or lists of regions. google-batch appears to lack this feature.

Multi-region support is a much-loved and heavily used feature of google-cls-v2. Can you please verify that google-batch really does not have it? If it does not, would it be possible to work with the Google Batch developers to introduce this feature before Cloud Life Sciences is removed? Thank you.
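To make the wildcard matching mentioned above concrete, here is a minimal Python sketch of expanding patterns such as `us-*` into a list of zones. It is not dsub's actual implementation: the `KNOWN_ZONES` sample and the `expand_zone_patterns` helper are hypothetical names introduced for illustration, and the standard-library `fnmatch` is used as a stand-in for whatever matching the provider really does.

```python
from fnmatch import fnmatch

# Hypothetical, hard-coded sample of zone names, for illustration only.
KNOWN_ZONES = [
    "us-central1-a",
    "us-central1-b",
    "us-east1-b",
    "us-west1-a",
    "europe-west1-b",
]


def expand_zone_patterns(patterns, known_zones=KNOWN_ZONES):
    """Return the known zones that match any shell-style pattern.

    The result keeps the order of `known_zones` and contains no duplicates.
    """
    return [
        zone
        for zone in known_zones
        if any(fnmatch(zone, pattern) for pattern in patterns)
    ]


# A single wildcard covers every US zone in the sample list.
print(expand_zone_patterns(["us-*"]))
# Patterns and exact names can be mixed in one request.
print(expand_zone_patterns(["us-central1-*", "europe-west1-b"]))
```

The point of the feature is visible here: one short pattern stands in for locations spread over several regions, so the submitter does not have to enumerate them.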

mbookman commented 11 months ago

Thanks for the input @rivershah.

A cursory look indicates that Batch supports this via the LocationPolicy:

https://cloud.google.com/batch/docs/reference/rest/v1alpha/projects.locations.jobs#locationpolicy

We'll test it out and look to wire it up if it works as expected.

rivershah commented 11 months ago

@mbookman Thanks for looking. My understanding of the docs is that the Batch API will raise an error if multiple regions are specified:

Only one region or multiple zones in one region is supported now

It appears that multi-region support may not have been ported over to Batch. I await the results of your experiments, as I can't get my jobs to schedule when I specify multiple regions: the VMs enter an error state.

For context, multi-region support is so useful because it greatly simplifies job submission for hard-to-find resources such as high-memory nodes and GPU accelerators. A large parallel GPU dsub TSV submission will typically find resources across geographically widely separated regions.
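The constraint quoted above can be sketched as a client-side check. This is an illustration only, assuming the `regions/<region>` and `zones/<zone>` location format described in the Batch LocationPolicy reference; `region_of` and `build_location_policy` are hypothetical helpers, not part of dsub or of any Google client library, and the real API's error behavior is not reproduced here.

```python
def region_of(location):
    """Return the region for a 'regions/<region>' or 'zones/<zone>' string."""
    kind, _, name = location.partition("/")
    if kind == "regions" and name:
        return name
    if kind == "zones" and name:
        # A zone name is its region plus a suffix, e.g. us-central1-a.
        return name.rsplit("-", 1)[0]
    raise ValueError(f"Unrecognized location: {location!r}")


def build_location_policy(locations):
    """Build a LocationPolicy-shaped dict, rejecting multi-region input.

    Assumption: this mirrors the documented rule that only one region, or
    multiple zones in one region, is supported.
    """
    regions = {region_of(location) for location in locations}
    if len(regions) != 1:
        raise ValueError(
            f"Expected locations in exactly one region, got: {sorted(regions)}"
        )
    return {"allowedLocations": list(locations)}


# Multiple zones in a single region are accepted.
print(build_location_policy(["zones/us-central1-a", "zones/us-central1-c"]))

# Locations spanning two regions are rejected by this sketch.
try:
    build_location_policy(["regions/us-central1", "zones/us-west1-a"])
except ValueError as error:
    print(error)
```

Under this rule, a request equivalent to a US-wide wildcard cannot be expressed in a single policy, which is the regression being described.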

anngregory commented 11 months ago

Does dsub work with google batch?

rivershah commented 10 months ago

@mbookman Happy new year! I am still pretty sure that Batch, as implemented on Google's side, does not support submitting a job to US-wide regions, which google-cls-v2 does allow. This would be a major feature regression. I don't think this is a dsub limitation.

Can you please verify whether what I am saying is correct? If so, we will need help determining whether this feature can be implemented in Batch.

rivershah commented 7 months ago

@mbookman @wnojopra As google-cls-v2 is headed for removal soon, I am requesting that we look at this feature regression. Thank you.

mbookman commented 7 months ago

Hey @rivershah !

Sorry about the delay in following up. We did check in with the Batch team regarding this. The lack of the multi-region support is presently intentional in the sense that it was not considered to have high utility. It would be great if we could get more input from you on your use case and where you see it giving value.

One of the key drivers of this feature not being added to Batch is the 2022 change in Cloud pricing, under which moving data from multi-region buckets to regional buckets began to incur Data Transfer Out charges (formerly known as egress charges).

https://cloud.google.com/storage/pricing-announce#network

Reading data in a Cloud Storage bucket located in a multi-region from a Google Cloud service located in a region on the same continent will no longer be free; instead, such moves will be priced the same as general data moves between different locations on the same continent.

                        Northern America
Northern America   $0.02/GB

Prior to those pricing changes, reading data in a US multi-region bucket from any US region was free. So the Cloud view is that people will generally want to use regional buckets and regional VMs.
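To put the quoted rate in perspective, here is a small worked calculation in Python. Only the $0.02/GB figure comes from the pricing text above; the task count and per-task data volume are hypothetical numbers chosen for illustration, and `transfer_cost_usd` is a helper invented for this sketch.

```python
RATE_USD_PER_GB = 0.02  # Rate quoted above for Northern America.


def transfer_cost_usd(gigabytes, rate_usd_per_gb=RATE_USD_PER_GB):
    """Return the data transfer charge in USD for reading `gigabytes` of data."""
    if gigabytes < 0:
        raise ValueError("gigabytes must be non-negative")
    return gigabytes * rate_usd_per_gb


# Hypothetical workload: 200 tasks, each reading 50 GB from a multi-region bucket.
tasks = 200
gb_per_task = 50
print(f"${transfer_cost_usd(tasks * gb_per_task):,.2f}")
```

For this made-up workload, 10,000 GB at $0.02/GB comes to $200 per run, a charge that would not have applied before the pricing change.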

So can you share your use case where this pricing change has not impacted you and where you'd get high value from multiple regions?

Thanks.

rivershah commented 3 months ago

Hi @mbookman,

Apologies for the delay. The multi-region feature is crucial for several reasons: