KalinNonchev / gnomAD_DB

This package scales the huge gnomAD files to a SQLite database, which is easy and fast to query. It extracts from a gnomAD vcf the minor allele frequency for each variant.
MIT License

Output from get_info_from_str #15

Closed ichauchcc closed 2 years ago

ichauchcc commented 2 years ago

Hi, I am trying to run the function on my own data (db_38.get_info_from_str("12:132673597:C>T", "AC")), but the output is an empty Series: Series([], Name: AC, dtype: object). When I run it on the example data, it works. Any suggestions on how I can fix this? Thanks so much!

KalinNonchev commented 2 years ago

Hello @ichauchcc,

Thank you for your question. Unfortunately, I was not able to reproduce your error.

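from gnomad_db.database import gnomAD_DB
# folder that contains the downloaded database file
database_location = "test_dir"
db = gnomAD_DB(database_location, genome="Grch38")
# query the AC field for the variant chrom:pos:ref>alt
db.get_info_from_str("12:132673597:C>T", "AC")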

results in 65, as in https://gnomad.broadinstitute.org/variant/12-132673597-C-T?dataset=gnomad_r3

1) Could you check whether you have successfully downloaded the GRCh38 database from https://zenodo.org/record/5758663/files/gnomad_db_v3.1.1.sqlite3.gz?download=1? You can try this:

from gnomad_db.database import gnomAD_DB
download_link = "https://zenodo.org/record/5758663/files/gnomad_db_v3.1.1.sqlite3.gz?download=1"
output_dir = "test_dir" # database_location
gnomAD_DB.download_and_unzip(download_link, output_dir)
2) Make sure that you are passing the correct folder as database_location, which contains the gnomad_db.sqlite3 file
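
The second check can be sketched with the standard library alone. This is a minimal sketch, not part of gnomad_db: the helper name check_database_dir is hypothetical, and it only assumes that the folder should contain a file named gnomad_db.sqlite3 as described above. It does not assume anything about the tables the package creates; it simply lists whatever tables it finds.

```python
import sqlite3
from pathlib import Path


def check_database_dir(database_location, db_filename="gnomad_db.sqlite3"):
    """Hypothetical helper: return the table names found in the SQLite file,
    or None if the file is not present in database_location."""
    db_path = Path(database_location) / db_filename
    if not db_path.is_file():
        return None
    # Open read-only so that a wrong path never creates a new, empty database
    uri = db_path.resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # "test_dir" is the database_location used in the examples above
    print(check_database_dir("test_dir"))
```

A result of None means the file is missing from the folder, so the download or the path is the likely problem. An empty list means the file exists but holds no tables, which may point to an incomplete download.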

Let me know if this works.

Best,

ichauchcc commented 2 years ago

Hey Kalin, thank you for your quick response. Following your comment, I reinstalled the version 38 database and it worked. Amazing work!

KalinNonchev commented 2 years ago

Thank you for your feedback!