
Tutorial for Getting started on Google Cloud with the Genomics Pipelines API does not work #5594

Open freeseek opened 4 years ago

freeseek commented 4 years ago

I have tried the tutorial to run Cromwell on Google Cloud. I did not get very far.

I have followed the long set of instructions: I have logged in with my <google-user-id>, set my own <google-project-id>, and created my own bucket. I have generated my service account key with the command:

gcloud iam service-accounts keys create sa.json --iam-account "$EMAIL"

Then I ran hello.wdl with the command:

export GOOGLE_APPLICATION_CREDENTIALS=sa.json
java -Dconfig.file=google.conf -jar cromwell-52.jar run hello.wdl -i hello.inputs

But I get the following error:

[2020-07-27 18:34:00,37] [error] PipelinesApiAsyncBackendJobExecutionActor [3d2d7a27wf_hello.hello:NA:1]: Error attempting to Execute
cromwell.engine.io.IoAttempts$EnhancedCromwellIoException: [Attempted 1 time(s)] - StorageException: xxx@xxx.iam.gserviceaccount.com does not have serviceusage.services.use access to the Google Cloud project.
Caused by: com.google.cloud.storage.StorageException: xxx@xxx.iam.gserviceaccount.com does not have serviceusage.services.use access to the Google Cloud project.
    at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:227)
    at com.google.cloud.storage.spi.v1.HttpStorageRpc.create(HttpStorageRpc.java:308)
    at com.google.cloud.storage.StorageImpl$3.call(StorageImpl.java:213)
    at com.google.cloud.storage.StorageImpl$3.call(StorageImpl.java:210)
    at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
    at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
    at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
    at com.google.cloud.storage.StorageImpl.internalCreate(StorageImpl.java:209)
    at com.google.cloud.storage.StorageImpl.create(StorageImpl.java:171)
    at cromwell.filesystems.gcs.GcsPath.request$1(GcsPathBuilder.scala:196)
    at cromwell.filesystems.gcs.GcsPath.$anonfun$writeContent$2(GcsPathBuilder.scala:203)
    at cromwell.filesystems.gcs.GcsPath.$anonfun$writeContent$2$adapted(GcsPathBuilder.scala:203)
    at cromwell.filesystems.gcs.GcsEnhancedRequest$.$anonfun$recoverFromProjectNotProvided$3(GcsEnhancedRequest.scala:18)
    at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:87)
    at cats.effect.internals.IORunLoop$RestartCallback.signal(IORunLoop.scala:355)
    at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:376)
    at cats.effect.internals.IORunLoop$RestartCallback.apply(IORunLoop.scala:316)
    at cats.effect.internals.IOShift$Tick.run(IOShift.scala:36)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
POST https://storage.googleapis.com/upload/storage/v1/b/xxx/o?projection=full&userProject=xxx&uploadType=multipart
{
  "code" : 403,
  "errors" : [ {
    "domain" : "global",
    "message" : "xxx@xxx.iam.gserviceaccount.com does not have serviceusage.services.use access to the Google Cloud project.",
    "reason" : "forbidden"
  } ],
  "message" : "xxx@xxx.iam.gserviceaccount.com does not have serviceusage.services.use access to the Google Cloud project."
}
    at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
    at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:555)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:475)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:592)
    at com.google.cloud.storage.spi.v1.HttpStorageRpc.create(HttpStorageRpc.java:305)
    ... 22 common frames omitted
[2020-07-27 18:34:01,11] [info] WorkflowManagerActor Workflow 3d2d7a27-7c37-42c7-8c96-de7efef896e3 failed (during ExecutingWorkflowState): cromwell.engine.io.IoAttempts$EnhancedCromwellIoException: [Attempted 1 time(s)] - StorageException: xxx@xxx.iam.gserviceaccount.com does not have serviceusage.services.use access to the Google Cloud project.

I have no idea what serviceusage.services.use is or how to enable it. The tutorial is also very weirdly written. It seems like there used to be a tutorial about JES/PAPIv1 that was then updated with a notice about PAPIv2. It is completely unclear whether the user is supposed to use JES/PAPIv1 or PAPIv2.

I do not understand why it has to be so complicated. Can Cromwell provide some useful message about what to do to set the required permissions?
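
Edit: from what I can gather (I am not sure about this), serviceusage.services.use is an IAM permission rather than something one "activates", and it is contained in the predefined role roles/serviceusage.serviceUsageConsumer, which an admin can bind to the service account with:

gcloud projects add-iam-policy-binding <google-project-id> --member serviceAccount:"$EMAIL" --role roles/serviceusage.serviceUsageConsumer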

freeseek commented 4 years ago

My admin has added the serviceusage.services.use permission to my service account, whatever that means (I have no idea). Now I get this error:

[2020-07-27 19:13:48,68] [error] PipelinesApiAsyncBackendJobExecutionActor [bf8fa2c2wf_hello.hello:NA:1]: Error attempting to Execute
java.io.IOException: Scopes not configured for service account. Scoped should be specified by calling createScoped or passing scopes to constructor.
    at com.google.auth.oauth2.ServiceAccountCredentials.refreshAccessToken(ServiceAccountCredentials.java:402)
    at com.google.auth.oauth2.OAuth2Credentials.refresh(OAuth2Credentials.java:157)
    at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:145)
    at com.google.auth.oauth2.ServiceAccountCredentials.getRequestMetadata(ServiceAccountCredentials.java:603)
    at com.google.auth.http.HttpCredentialsAdapter.initialize(HttpCredentialsAdapter.java:91)
    at com.google.api.client.http.HttpRequestFactory.buildRequest(HttpRequestFactory.java:88)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:423)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:399)
    at cromwell.backend.google.pipelines.v1alpha2.GenomicsFactory$$anon$1.runRequest(GenomicsFactory.scala:85)
    at cromwell.backend.google.pipelines.common.api.clients.PipelinesApiRunCreationClient.runPipeline(PipelinesApiRunCreationClient.scala:53)
    at cromwell.backend.google.pipelines.common.api.clients.PipelinesApiRunCreationClient.runPipeline$(PipelinesApiRunCreationClient.scala:48)
    at cromwell.backend.google.pipelines.common.PipelinesApiAsyncBackendJobExecutionActor.runPipeline(PipelinesApiAsyncBackendJobExecutionActor.scala:92)
    at cromwell.backend.google.pipelines.common.PipelinesApiAsyncBackendJobExecutionActor.$anonfun$createNewJob$19(PipelinesApiAsyncBackendJobExecutionActor.scala:572)
    at scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:307)
    at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:41)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
    at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
    at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

And instead of terminating immediately, I keep getting the same error multiple times.

aednichols commented 4 years ago

Hi, it looks like https://github.com/broadinstitute/cromwell/issues/3690 may be helpful. Thank you for your interest in Cromwell.

freeseek commented 4 years ago

I still do not understand.

aednichols commented 4 years ago

Hi @freeseek,

(replying here because this is the currently open issue)

The message

400 Bucket is requester pays bucket but no user project provided.

should be addressed by specifying your project in the config as described here.
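
That is, an engine-level stanza along these lines (substitute your own billing project):

engine {
  filesystems {
    gcs {
      auth = "application-default"
      project = "<google-billing-project-id>"
    }
  }
}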

aednichols commented 4 years ago

As for

PAPI error code 7. Required 'compute.zones.list' permission for 'projects/xxx'

it sounds like you need to access the Google Cloud Console and enable this permission for your project (Cromwell cannot perform this step for you automatically).

freeseek commented 4 years ago

The problem is that when I started to use Cromwell I was under the impression that I would be shielded from having to learn all the complexities of a given backend and that Cromwell would take care of them. I have zero familiarity with permissions in Google Cloud. The admin that set up my account also had no idea what compute.zones.list means. I don't see this in the tutorial. The following command does not work:

$ gcloud projects add-iam-policy-binding xxx --member serviceAccount:xxx@xxx.gserviceaccount.com --role roles/compute.zones.list
ERROR: Policy modification failed. For a binding with condition, run "gcloud alpha iam policies lint-condition" to identify issues in condition.
ERROR: (gcloud.projects.add-iam-policy-binding) INVALID_ARGUMENT: Role roles/compute.zones.list is not supported for this resource.

I have no idea what I should do. Why can't Cromwell simply provide the command line needed to change the permission?
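
Edit: from what I can piece together, compute.zones.list is a permission, not a role, which is why the command above fails; gcloud can only bind roles. If I understand correctly, one would instead bind a role that contains the permission, for instance roles/compute.viewer:

gcloud projects add-iam-policy-binding xxx --member serviceAccount:xxx@xxx.gserviceaccount.com --role roles/compute.viewer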

As for Requester Pays, following the documentation I have set the project field in the gcs filesystem configuration (it is completely unclear which one is meant, as according to the tutorial there are two, but I have included project in both ...). The configuration file is as follows:

include required(classpath("application"))

google {
  application-name = "cromwell"
  auths = [
    {
      name = "application-default"
      scheme = "application_default"
    }
  ]
}

engine {
  filesystems {
    gcs {
      auth = "application-default"
      project = "xxx"
    }
  }
}

backend {
  default = "JES"
  providers {
    JES {
      actor-factory = "cromwell.backend.impl.jes.JesBackendLifecycleActorFactory"
      config {
        // Google project
        project = "xxx"

        // Base bucket for workflow executions
        root = "gs://xxx/cromwell-execution"

        // Polling for completion backs-off gradually for slower-running jobs.
        // This is the maximum polling interval (in seconds):
        maximum-polling-interval = 600

        // Optional Dockerhub Credentials. Can be used to access private docker images.
        dockerhub {
          // account = ""
          // token = ""
        }

        genomics {
          // A reference to an auth defined in the `google` stanza at the top.  This auth is used to create
          // Pipelines and manipulate auth JSONs.
          auth = "application-default"
          // Endpoint for APIs, no reason to change this unless directed by Google.
          endpoint-url = "https://genomics.googleapis.com/"
          // This allows you to use an alternative service account to launch jobs, by default uses default service account
          compute-service-account = "default"

          // Pipelines v2 only: specify the number of times localization and delocalization operations should be attempted
          // There is no logic to determine if the error was transient or not, everything is retried upon failure
          // Defaults to 3
          localization-attempts = 3
        }

        filesystems {
          gcs {
            // A reference to a potentially different auth for manipulating files via engine functions.
            auth = "application-default"
            project = "xxx"
          }
        }
      }
    }
  }
}

I then run with the command:

java -Dconfig.file=google.conf -jar cromwell-52.jar run hello.wdl -i hello.inputs

And I get the error:

[2020-07-28 16:01:35,86] [info] WorkflowManagerActor Workflow 28f84555-6e06-41be-891b-84de0f35ee74 failed (during ExecutingWorkflowState): java.lang.Exception: Task wf_hello.hello:NA:1 failed. The job was stopped before the command finished. PAPI error code 10. 15: Gsutil failed: failed to upload logs for "gs://xxx/cromwell-execution/wf_hello/28f84555-6e06-41be-891b-84de0f35ee74/call-hello/": cp failed: gsutil -h Content-type:text/plain -q -m cp /var/log/google-genomics/*.log gs://xxx/cromwell-execution/wf_hello/28f84555-6e06-41be-891b-84de0f35ee74/call-hello/, command failed: BadRequestException: 400 Bucket is requester pays bucket but no user project provided.

Is this because Pipelines API version 1 does not support buckets with requester pays? If so, why can't Cromwell just say so? Notice that the tutorial does not say that requester pays does not work with Pipelines API version 1; it says instead that more information about Requester Pays can be found at [Requester Pays](https://cloud.google.com/storage/docs/requester-pays).

In any case, I have removed the Requester Pays option from the bucket, as I have pretty much given up on that. I was then able to run the hello.wdl workflow fine using the configuration file above. I then tried to run the mutect2.wdl workflow and encountered a new issue when trying to localize a file from a bucket that I can read without problems using my Google account. The error contained the following:

command failed: AccessDeniedException: 403 xxx@xxx.gserviceaccount.com does not have storage.objects.list access to the Google Cloud Storage bucket.

I have tried to fix that as follows:

$ gcloud projects add-iam-policy-binding xxx --member serviceAccount:xxx@xxx.gserviceaccount.com --role roles/storage.objects.list
ERROR: Policy modification failed. For a binding with condition, run "gcloud alpha iam policies lint-condition" to identify issues in condition.
ERROR: (gcloud.projects.add-iam-policy-binding) INVALID_ARGUMENT: Role roles/storage.objects.list is not supported for this resource.

No luck.
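
If this is the same permission-versus-role confusion as before, storage.objects.list is a permission contained in roles/storage.objectViewer; and since the bucket lives in another project, I suspect the role would have to be granted on the bucket itself rather than on my project, perhaps with something like:

gsutil iam ch serviceAccount:xxx@xxx.gserviceaccount.com:roles/storage.objectViewer gs://xxx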

aednichols commented 4 years ago

Is this because Pipelines API version 1 does not support buckets with requester pays? If so, why cannot Cromwell just say so?

Granted it's not in the error message itself, but the page I linked states:

Pipelines API version 1 does not support buckets with requester pays, so while Cromwell itself might be able to access bucket with RP, jobs running on Pipelines API V1 with file inputs and / or outputs will not work.

Pipelines API v1 is deprecated by Google and documentation for it is not maintained; new projects should always use v2


As for the gcloud issue, I've never done this particular operation personally, but I suspect you may have luck looking at the GCP docs or Stack Overflow.

You could opt for Terra, which is basically a fully managed version of Cromwell (it configures Cromwell and all of this project setup for you).

Hope this helps.

freeseek commented 4 years ago

However, the tutorial, right in the section describing the configuration file for PAPIv1, neither states this simple fact about Requester Pays not working with PAPIv1 nor links to the useful page you mentioned.

I have now switched to the PAPIv2.conf configuration file, which does not contain this important piece of configuration:

engine {
  filesystems {
    gcs {
      auth = "application-default"
      project = "<google-billing-project-id>"
    }
  }
}

This was in the google.conf PAPIv1 configuration file. I guess somehow it did not make it into the PAPIv2 configuration file, and users reading the tutorial have to guess that on their own. Now the Requester Pays issue is gone, as I get lines like this in the logs instead:

2020/07/28 21:30:48 rm -f $HOME/.config/gcloud/gce && gsutil  -h "Content-Type: text/plain; charset=UTF-8" cp /google/logs/output gs://xxx/Mutect2/74c8be5e-f988-49b0-a51d-c87f2ac7cb60/call-TumorCramToBam/TumorCramToBam.log failed
BadRequestException: 400 Bucket is requester pays bucket but no user project provided.
2020/07/28 21:30:48 Retrying with user project
Copying file:///google/logs/output [Content-Type=text/plain; charset=UTF-8]...

At least that's fully clarified.

However, I still get the error:

2020/07/28 21:30:43 Localizing input gs://fc-118c254f-010a-4ee6-b149-6f0bb5abaa77/GeneticNeuroscience_McCarroll_CIRM_GRU_Exome_9qCN-LOH_PDO-21129/RP-1875/Exome/CW60141_P13_MT_1-19-18/v1/CW60141_P13_MT_1-19-18.cram -> /cromwell_root/fc-118c254f-010a-4ee6-b149-6f0bb5abaa77/GeneticNeuroscience_McCarroll_CIRM_GRU_Exome_9qCN-LOH_PDO-21129/RP-1875/Exome/CW60141_P13_MT_1-19-18/v1/CW60141_P13_MT_1-19-18.cram
Error attempting to localize file with command: 'mkdir -p '/cromwell_root/fc-118c254f-010a-4ee6-b149-6f0bb5abaa77/GeneticNeuroscience_McCarroll_CIRM_GRU_Exome_9qCN-LOH_PDO-21129/RP-1875/Exome/CW60141_P13_MT_1-19-18/v1/' && rm -f /root/.config/gcloud/gce && gsutil -o 'GSUtil:parallel_thread_count=1' -o 'GSUtil:sliced_object_download_max_components=1' cp 'gs://fc-118c254f-010a-4ee6-b149-6f0bb5abaa77/GeneticNeuroscience_McCarroll_CIRM_GRU_Exome_9qCN-LOH_PDO-21129/RP-1875/Exome/CW60141_P13_MT_1-19-18/v1/CW60141_P13_MT_1-19-18.cram' '/cromwell_root/fc-118c254f-010a-4ee6-b149-6f0bb5abaa77/GeneticNeuroscience_McCarroll_CIRM_GRU_Exome_9qCN-LOH_PDO-21129/RP-1875/Exome/CW60141_P13_MT_1-19-18/v1/''
AccessDeniedException: 403 xxx@xxx.gserviceaccount.com does not have storage.objects.list access to the Google Cloud Storage bucket.

I am starting to guess that this is a settings issue with the bucket, not with my service account. My best guess is that, counter-intuitive as it seems, I have access to this bucket with my personal account but not with my service account. Oh my, this is so complicated ...

As for Terra, I have used it quite a bit over the last week but, and I am not alone in saying this, Terra is not a good environment for developing new WDLs. For example, just uploading a new WDL for testing takes many steps. Maybe once you have a polished WDL it is great for users with less technical expertise. But I want to use Cromwell to develop and test new WDLs.

freeseek commented 4 years ago

I will also add that the storage.objects.list issue was mentioned before (#1960), but no solution was provided.

aednichols commented 4 years ago

the tutorial, right in the section describing the configuration file for PAPIv1, neither states this simple fact about Requester Pays not working with PAPIv1

We should probably remove the PAPIv1 tutorial entirely; it has carried the deprecation warning for over a year now. It lives here and we will gladly merge improvement PRs.

My best guess is that, albeit extremely counter-intuitive, I have access to this bucket with my personal account but I do not have access to this bucket with my service account. Oh my, this is so complicated ...

That does seem like a probable explanation, though I don't know the particulars of how you set up your SA. Cloud architecture is a large beast and Cromwell targets a very specific cross section of it (running workflows). A particular account having access to input data would need to be configured as a prerequisite. Since I see you are at Broad, perhaps BITS can help with it.

storage.objects.list issue

I recommend trying to recreate the scenario locally with gsutil cp and the desired service account & file. Your turnaround time will be much faster than running the workflow.

It is certainly possible that Cromwell has a bug that causes GCS to incorrectly deny access, but we generally would like to see the same file/account combination working correctly outside of Cromwell before we will accept it as a bug report.

aednichols commented 4 years ago

@freeseek I am signing off for the day, if you have any further questions please tag my colleague @mcovarr who is currently on issue triage.

freeseek commented 4 years ago

@aednichols thank you. If I find an explanation I will report it on both posts.

freeseek commented 4 years ago

Here is a way to reproduce with gsutil cp, as @aednichols suggested (in my case Cromwell runs with service account 30148356615-compute@developer.gserviceaccount.com):

$ gcloud config set account giulio@broadinstitute.org
Updated property [core/account].
$ gcloud auth list
                  Credentialed Accounts
ACTIVE  ACCOUNT
        30148356615-compute@developer.gserviceaccount.com
        giulio.genovese@gmail.com
*       giulio@broadinstitute.org

To set the active account, run:
    $ gcloud config set account `ACCOUNT`

$ gsutil cp gs://fc-118c254f-010a-4ee6-b149-6f0bb5abaa77/GeneticNeuroscience_McCarroll_CIRM_GRU_Exome_9qCN-LOH_PDO-21129/RP-1875/Exome/CW60141_P13_MT_1-19-18/v1/CW60141_P13_MT_1-19-18.cram.crai /tmp/

Copying gs://fc-118c254f-010a-4ee6-b149-6f0bb5abaa77/GeneticNeuroscience_McCarroll_CIRM_GRU_Exome_9qCN-LOH_PDO-21129/RP-1875/Exome/CW60141_P13_MT_1-19-18/v1/CW60141_P13_MT_1-19-18.cram.crai...
/ [1 files][143.2 KiB/143.2 KiB]
Operation completed over 1 objects/143.2 KiB.
$ gcloud config set account 30148356615-compute@developer.gserviceaccount.com
Updated property [core/account].
$ gcloud auth list
                  Credentialed Accounts
ACTIVE  ACCOUNT
*       30148356615-compute@developer.gserviceaccount.com
        giulio.genovese@gmail.com
        giulio@broadinstitute.org

To set the active account, run:
    $ gcloud config set account `ACCOUNT`

$ gsutil cp gs://fc-118c254f-010a-4ee6-b149-6f0bb5abaa77/GeneticNeuroscience_McCarroll_CIRM_GRU_Exome_9qCN-LOH_PDO-21129/RP-1875/Exome/CW60141_P13_MT_1-19-18/v1/CW60141_P13_MT_1-19-18.cram.crai /tmp/
AccessDeniedException: 403 30148356615-compute@developer.gserviceaccount.com does not have storage.objects.list access to the Google Cloud Storage bucket.

So in this case the more appropriate questions would be:

1) How do I give my service account 30148356615-compute@developer.gserviceaccount.com the same permissions as my personal account giulio@broadinstitute.org?

2) How do I get Cromwell to run with my personal account giulio@broadinstitute.org instead of my service account?
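
For question 2, one thing I may try (untested) is pointing the application-default credentials at my personal account instead of the service account key:

unset GOOGLE_APPLICATION_CREDENTIALS
gcloud auth application-default login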

freeseek commented 4 years ago

Since this post was about the tutorial being quite broken, here are a few points:

1) The following code in the permissions section:

#Create a new service account called "MyServiceAccount", and from the output of the command, take the email address that was generated
EMAIL=$(gcloud beta iam service-accounts create MyServiceAccount --description "to run cromwell"  --display-name "cromwell service account" --format json | jq '.email' | sed -e 's/\"//g')

does not work. It errors out with:

ERROR: (gcloud.beta.iam.service-accounts.create) argument NAME: Bad value [MyServiceAccount]: Service account name must be between 6 and 30 characters (inclusive), must begin with a lowercase letter, and consist of lowercase alphanumeric characters that can be separated by hyphens.

I believe MyServiceAccount needs to be changed to my-service-account (similarly to how it is used here for scheme = "service_account").
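
For example, this variant (also replacing the sed workaround with jq -r) produces a name that satisfies the constraints in the error message:

EMAIL=$(gcloud beta iam service-accounts create my-service-account --description "to run cromwell" --display-name "cromwell service account" --format json | jq -r '.email')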

2) The following code in the permissions section:

# add all the roles to the service account
for i in storage.objectCreator storage.objectViewer genomics.pipelinesRunner genomics.admin iam.serviceAccountUser storage.objects.create
do
    gcloud projects add-iam-policy-binding MY-GOOGLE-PROJECT --member serviceAccount:"$EMAIL" --role roles/$i
done

does not work. When trying to add the role storage.objects.create it errors out with:

ERROR: Policy modification failed. For a binding with condition, run "gcloud alpha iam policies lint-condition" to identify issues in condition.
ERROR: (gcloud.projects.add-iam-policy-binding) INVALID_ARGUMENT: Role roles/storage.objects.create is not supported for this resource.

and there is clearly a role missing, as the roles storage.objectCreator, storage.objectViewer, genomics.pipelinesRunner, genomics.admin, and iam.serviceAccountUser (corresponding to Storage Object Creator, Storage Object Viewer, Genomics Pipelines Runner, Genomics Admin, and Service Account User) are not sufficient to create files inside Google buckets.

3) The permissions section guides the user into creating a new service account under the current project. This would need to be selected in the configuration file with an authorization with scheme = "service_account", but instead both the PAPIv2 configuration file and the PAPIv1 configuration file use an authorization with scheme = "application_default".
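
For reference, a minimal auth stanza along those lines (assuming the key was saved as sa.json; the path is a placeholder) would presumably look like this, with auth = "service-account" then referenced from the genomics and filesystems sections:

google {
  application-name = "cromwell"
  auths = [
    {
      name = "service-account"
      scheme = "service_account"
      json-file = "/path/to/sa.json"
    }
  ]
}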

I find it very hard to believe that any new user could go through the tutorial and successfully set up a Cromwell server.

On a slightly different note, some of my issues would be resolved if I could run jobs using my user account rather than a service account associated with my project. In the Google backends section of the docs there is a lonely mention of scheme = "user_account" but no further explanation. According to the source code it should be defined as:

{
  name = "user-account"
  scheme = "user_account"
  user = "me"
  secrets-file = "/very/secret/file.txt"
  data-store-dir = "/where/the/data/at"
}

But I was not able to get it to work.

freeseek commented 4 years ago

As now explained in issue #4304, it seems like the following three roles are required to run Cromwell with the PAPIv2 backend:

  1. Cloud Life Sciences Workflows Runner (lifesciences.workflowsRunner)
  2. Service Account User (iam.serviceAccountUser)
  3. Storage Object Admin (storage.objectAdmin)

This is instead of the roles storage.objectCreator, storage.objectViewer, genomics.pipelinesRunner, genomics.admin, iam.serviceAccountUser, and storage.objects.create given in the tutorial.
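
In other words, a working version of the tutorial's role loop would presumably be:

for i in lifesciences.workflowsRunner iam.serviceAccountUser storage.objectAdmin
do
    gcloud projects add-iam-policy-binding MY-GOOGLE-PROJECT --member serviceAccount:"$EMAIL" --role roles/$i
done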