atgu / hgdp_tgp

error while using PCA and Ancestry Analyses #7

Closed (SiddhiJani closed this 1 year ago)

SiddhiJani commented 1 year ago

When I run mt = hl.read_matrix_table(post_qc_path), I get the error below:

Traceback (most recent call last):
  File "", line 1, in
  File "", line 2, in read_matrix_table
  File "/home/cbr/.local/lib/python3.8/site-packages/hail/typecheck/check.py", line 577, in wrapper
    return __original_func(*args_, **kwargs_)
  File "/home/cbr/.local/lib/python3.8/site-packages/hail/methods/impex.py", line 2473, in read_matrix_table
    for rg_config in Env.backend().load_references_from_dataset(path):
  File "/home/cbr/.local/lib/python3.8/site-packages/hail/backend/spark_backend.py", line 324, in load_references_from_dataset
    return json.loads(self.hail_package().variant.ReferenceGenome.fromHailDataset(self.fs._jfs, path))
  File "/home/cbr/.local/lib/python3.8/site-packages/py4j/java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "/home/cbr/.local/lib/python3.8/site-packages/hail/backend/py4j_backend.py", line 31, in deco
    raise fatal_error_from_java_error_triplet(deepest, full, error_id) from None
hail.utils.java.FatalError: UnsupportedFileSystemException: No FileSystem for scheme "gs"

Java stack trace:
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "gs"
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3281)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3301)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
    at is.hail.io.fs.HadoopFS.fileStatus(HadoopFS.scala:173)
    at is.hail.io.fs.FS.isDir(FS.scala:396)
    at is.hail.io.fs.FS.isDir$(FS.scala:394)
    at is.hail.io.fs.HadoopFS.isDir(HadoopFS.scala:72)
    at is.hail.expr.ir.RelationalSpec$.readMetadata(AbstractMatrixTableSpec.scala:31)
    at is.hail.expr.ir.RelationalSpec$.readReferences(AbstractMatrixTableSpec.scala:74)
    at is.hail.variant.ReferenceGenome$.fromHailDataset(ReferenceGenome.scala:581)
    at is.hail.variant.ReferenceGenome.fromHailDataset(ReferenceGenome.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:750)

Hail version: 0.2.108-fc03e9d5dc08
Error summary: UnsupportedFileSystemException: No FileSystem for scheme "gs"

I have installed the latest Hail but still get this error. Kindly help.

z-koenig commented 1 year ago

Do you have the GCS connector installed? A missing connector can cause this error, because without it Hadoop has no filesystem registered for gs:// paths. This page from the Hail docs on reading data from Google Cloud also shows how to download and install it.
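One rough way to check whether a connector jar is already in place is to look in the jars directory of the pyspark installation that Hail uses. The sketch below is only a diagnostic aid: the helper names are hypothetical, and it assumes the connector jar has a file name starting with "gcs-connector" and sits in pyspark's jars/ directory, which may not match every setup (for example, a custom SPARK_HOME).

```python
import glob
import os


def find_gcs_connector_jars(jars_dir):
    """Return sorted paths of jars in jars_dir whose name starts with 'gcs-connector'.

    Assumption: the connector jar follows this naming convention.
    """
    pattern = os.path.join(jars_dir, "gcs-connector*.jar")
    return sorted(glob.glob(pattern))


def pyspark_jars_dir():
    """Return the jars/ directory of the installed pyspark package, or None."""
    try:
        import pyspark
    except ImportError:
        return None
    return os.path.join(os.path.dirname(pyspark.__file__), "jars")


if __name__ == "__main__":
    jars_dir = pyspark_jars_dir()
    if jars_dir is None:
        print("pyspark is not importable in this environment")
    else:
        found = find_gcs_connector_jars(jars_dir)
        if found:
            print("Found connector jar(s):", found)
        else:
            print("No gcs-connector jar found in", jars_dir)
```

If nothing is found, that is consistent with the "No FileSystem for scheme "gs"" error above, though a jar in the right place does not by itself prove that Spark is configured to use it.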

Additionally, our tutorial datasets are not yet in a public bucket, so they are not available at this time. We are in the process of moving them from our private bucket to the public one.

We will be updating the readme with additional information on how to access our tutorial datasets directly once our datasets have been moved over.

If you have any additional questions, please let me know!