broadinstitute / gnomad_methods

Hail helper functions for the gnomAD project and Translational Genomics Group
https://gnomad.broadinstitute.org
MIT License

Exomes hail table not in public release? #670

Closed allisoncheney closed 9 months ago

allisoncheney commented 10 months ago

Hello, I was following the code to import the latest exomes release as written here

# Load the v4.0 exomes public release HT
from gnomad.resources.grch38.gnomad import public_release
ht = public_release("exomes").ht()

But was unable to.

---------------------------------------------------------------------------
DataException                             Traceback (most recent call last)
Cell In[63], line 2
      1 from gnomad.resources.grch38.gnomad import public_release
----> 2 ht = public_release("exomes").ht()
      3 ht.describe()

File /opt/conda/lib/python3.10/site-packages/gnomad/resources/grch38/gnomad.py:272, in public_release(data_type)
    265 """
    266 Retrieve publicly released versioned table resource.
    267 
    268 :param data_type: One of "exomes" or "genomes"
    269 :return: Release Table
    270 """
    271 if data_type not in DATA_TYPES:
--> 272     raise DataException(
    273         f"{data_type} not in {DATA_TYPES}, please select a data type from"
    274         f" {DATA_TYPES}"
    275     )
    277 if data_type == "exomes":
    278     current_release = CURRENT_EXOME_RELEASE

DataException: exomes not in ['genomes'], please select a data type from ['genomes']

I'm a bit confused by that error message. Is the exomes Hail Table not available? The genomes table did seem to work. I am using the Hail environment on Terra: hail 0.2.126, python 3.10.12

mike-w-wilson commented 10 months ago

Hi @allisoncheney,

Thank you for the details on this error. Could you try pulling the most recent version of the repo, or updating to the most recent version of the gnomad package, to see if this resolves your issue? "exomes" is in the DATA_TYPES variable, so this should work.
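
For anyone hitting the same error, a quick way to see which gnomad package version the environment has is sketched below. It uses only the Python standard library; the helper name installed_version is made up for illustration, and it assumes the package was installed under the distribution name "gnomad".

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "gnomad"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Hypothetical helper; "gnomad" is assumed to be the installed distribution name.
print("gnomad version:", installed_version())
```

If this reports an older version, upgrading with pip (for example, pip install --upgrade gnomad) and restarting the notebook kernel should pick up the newer resource definitions.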

allisoncheney commented 10 months ago

Thank you! I didn't realize I was using an older version of gnomad. Updating allowed me to get the v4 exomes table. But I realized that

from gnomad.resources.grch38.gnomad import public_release
genomes = public_release("genomes").ht()

gives me v3 genomes, not v4 (at least, it has annotations that match v3, like "primate_ai"). Is there a way to select v4?

klaricch commented 10 months ago

@allisoncheney, can you pull the most recent version of the repo again? We just put in a fix yesterday and public_release("genomes") now defaults to loading the v4 genomes.
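
For code that should not depend on whichever release is the default, a version can be requested explicitly. The sketch below is only an illustration: it assumes the object returned by public_release exposes a versions dict and a default_version attribute, as the versioned resources in gnomad.resources.resource_utils appear to. The helper select_release, the stand-in object, and the version keys "3.1.2" and "4.0" are placeholders, not confirmed values.

```python
from types import SimpleNamespace


def select_release(versioned, version=None):
    """Return the resource for `version`, or the default one when version is None."""
    key = versioned.default_version if version is None else version
    if key not in versioned.versions:
        raise KeyError(
            f"{key!r} not available; choose from {sorted(versioned.versions)}"
        )
    return versioned.versions[key]


# Stand-in for a versioned resource so the sketch runs without gnomad or Hail.
# The version keys and values here are placeholders.
demo = SimpleNamespace(
    default_version="3.1.2",
    versions={"3.1.2": "v3 table", "4.0": "v4 table"},
)
print(select_release(demo))
print(select_release(demo, "4.0"))

# With gnomad installed (assumed attribute names and version key, not run here):
# from gnomad.resources.grch38.gnomad import public_release
# ht = select_release(public_release("genomes"), "4.0").ht()
```

Printing the keys of versions on the real resource first is the safest way to see which release strings are actually available.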

ch-kr commented 9 months ago

Hi @allisoncheney -- thanks for flagging that the default genomes release version was not v4. As Kristen said above, we have fixed this in the most recent version of the code, so I am going to close this ticket; please open another ticket if you still encounter an issue.