DataBiosphere / dsub

Open-source command-line tool to run batch computing tasks and workflows on backend services such as Google Cloud.

Using Hail with dsub #265

Open · buutrg opened this issue 1 year ago

buutrg commented 1 year ago

Hi all, I am trying to use Hail via dsub to extract a subset of variants on the All of Us server. I think this is the most relevant image I can use: https://github.com/DataBiosphere/terra-docker/tree/master/terra-jupyter-hail
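For context, the kind of dsub invocation involved looks roughly like this (the image tag, project, bucket paths, and script name are placeholders, not the exact values I'm using):

```sh
# Sketch of running a Hail script through dsub with the terra-jupyter-hail image.
# Image tag, project, buckets, and script name are placeholders.
dsub \
  --provider google-cls-v2 \
  --project my-project \
  --regions us-central1 \
  --image us.gcr.io/broad-dsp-gcr-public/terra-jupyter-hail:latest \
  --logging gs://my-bucket/logs/ \
  --output OUT=gs://my-bucket/subset/variants.tsv \
  --script extract_variants.py \
  --wait
```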

But it results in an error that pyspark is not found. I tried to install pyspark from https://dlcdn.apache.org/spark/spark-3.1.3/spark-3.1.3-bin-hadoop3.tgz. Now it says `No FileSystem for scheme "gs"`.

Do you have any idea how to use Hail via dsub? Your help is really appreciated!

wnojopra commented 1 year ago

It sounds like you're running dsub on the AoU platform. From this AoU support article: "Within the Researcher Workbench, internet access is restricted from batch VMs. With the exception of Google APIs, VMs are unable to send or receive network traffic, including files, APIs, or packages/code." This isn't specifically a dsub issue; please reach out to AoU support for help installing pyspark in that image.
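As a side note (a sketch, not tested on AoU): the `No FileSystem for scheme "gs"` error from a plain Apache Spark download usually means the Hadoop GCS connector isn't on Spark's classpath. Since the connector jar is published in a public GCS bucket and GCS is reachable from batch VMs, something like the following may help once pyspark itself is installed (the `SPARK_HOME` path below is an assumption about where you unpacked Spark):

```sh
# Sketch: add the Hadoop GCS connector to a manually unpacked Spark so that
# gs:// paths resolve. SPARK_HOME and the connector jar version are assumptions.
export SPARK_HOME=/opt/spark-3.1.3-bin-hadoop3
gsutil cp gs://hadoop-lib/gcs/gcs-connector-hadoop3-latest.jar "${SPARK_HOME}/jars/"
cat >> "${SPARK_HOME}/conf/spark-defaults.conf" <<'EOF'
spark.hadoop.fs.gs.impl com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem
spark.hadoop.fs.AbstractFileSystem.gs.impl com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS
EOF
```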