konradjk / loftee

MIT License

GRCh38 branch does not work in the no root permission OS #111

Closed: Nick-Tan-debug closed this issue 6 days ago

Nick-Tan-debug commented 6 days ago

Hi friends @konradjk @oleraj,

LOFTEE is great work! Thanks for your team's contribution.

I am trying to run LOFTEE on GRCh38 vcf.gz files and have run into a problem. I tried the suggestions in #51 and #58, but they do not work for me. I am on an HPC cluster without root permission, so there are many things I cannot do, such as installing Bio::DB::BigFile from the kent source tree: https://useast.ensembl.org/info/docs/tools/vep/script/vep_download.html#bigfile

Instead, I installed VEP and the BigFile module with bioconda:

    mamba install -c bioconda perl-dbd-sqlite
    mamba install -c bioconda perl-bio-bigfile
    mamba install -c bioconda samtools

I downloaded the GRCh38 release of LOFTEE: https://github.com/konradjk/loftee/releases/tag/v1.0.4_GRCh38

The related data files come from https://personal.broadinstitute.org/konradk/loftee_data/GRCh38/

The plugin part of my command is as follows:

    --plugin LoF,loftee_path:/vep/cache/loftee-1.0.4/,conservation_file:/vep/cache/loftee-1.0.4/loftee.sql,human_ancestor_fa:/vep/cache/loftee-1.0.4/human_ancestor.fa.gz,gerp_bigwig:/vep/cache/loftee-1.0.4/gerp_conservation_scores.homo_sapiens.GRCh38.bw

The warning messages in summary.html are as follows:

    WARNING: Plugin 'LoF' went wrong: Can't call method "execute" on an undefined value at /conda_env/vep113/share/ensembl-vep-113.2-0/LoF.pm line 566, <$fh> line 7343.
    WARNING: Plugin 'LoF' went wrong: Can't call method "execute" on an undefined value at /conda_env/vep113/share/ensembl-vep-113.2-0/gerp_dist.pl line 130, <$fh> line 7343.

The messages on the command line:

    DBD::SQLite::db prepare failed: no such table: gerp_exons at /conda_env/vep113/share/ensembl-vep-113.2-0/gerp_dist.pl line 129, <$fh> line 7343.
    DBD::SQLite::db prepare failed: no such table: phylocsf_data at /conda_env/vep113/share/ensembl-vep-113.2-0/LoF.pm line 565, <$fh> line 7343.

I checked the tables in loftee.sql using sqlite3. The table it contains is named phylocsf_summary, not gerp_exons or phylocsf_data. Is there some difference between GRCh37 and GRCh38 that I have missed? The gerp_exons and phylocsf_data tables are in phylocsf_gerp.sql from the GRCh37 branch. In other words, can I use phylocsf_gerp.sql instead of loftee.sql for the GRCh38 pipeline? https://personal.broadinstitute.org/konradk/loftee_data/GRCh37/

How should I solve this problem? Thanks for everyone's help!
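For anyone who wants to repeat the table check without the sqlite3 command-line tool, here is a minimal sketch using Python's standard sqlite3 module. The helper names list_tables and missing_tables are hypothetical and not part of LOFTEE; the table names are the ones from the error messages above.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the set of table names found in an SQLite database file."""
    # Open read-only so that a mistyped path raises an error instead of
    # silently creating a new, empty database file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    return {name for (name,) in rows}


def missing_tables(db_path, expected):
    """Return the expected table names that are absent from the database."""
    return sorted(set(expected) - list_tables(db_path))
```

Calling missing_tables("/vep/cache/loftee-1.0.4/loftee.sql", ["gerp_exons", "phylocsf_data"]) would list whichever of the two tables the file lacks. A non-empty result suggests that the plugin code and the database file come from different LOFTEE branches.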

Best, Nick

Nick-Tan-debug commented 6 days ago

I finally found the problem. ensembl-vep does not update LoF.pm in its share folder, so I had been using the GRCh37 version of LoF.pm with the GRCh38 resources. Thanks for everyone's help! See #84: https://github.com/konradjk/loftee/issues/84#issuecomment-1025776382
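In case it helps others, one way to spot this kind of mismatch is to compare the LoF.pm that VEP actually loads with the one from the downloaded release. The following is only a sketch: the function names are hypothetical, and it assumes the release archive was unpacked to /vep/cache/loftee-1.0.4/ with LoF.pm at its top level.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """Return True if both files have byte-identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)
```

For example, same_file_contents("/conda_env/vep113/share/ensembl-vep-113.2-0/LoF.pm", "/vep/cache/loftee-1.0.4/LoF.pm") returning False would indicate that the share folder holds a different copy of the plugin than the GRCh38 release. In that case, pointing VEP at the release directory with --dir_plugins, or replacing the plugin files in the share folder, may be worth trying.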