gagneurlab / MMSplice_MTSplice

Tissue-specific variant effect predictions on splicing
MIT License

vep plugin error #1

Closed yipukangda closed 5 years ago

yipukangda commented 5 years ago

Description

I ran MMSplice through the VEP plugin script but got a "Can't locate DBD/mysql.pm" error. The error message is below:

install_driver(mysql) failed: Can't locate DBD/mysql.pm in @INC (you may need to install the DBD::mysql module) (@INC contains: /home/bio/perl5/lib/perl5/ .) at (eval 37) line 3. Perhaps the DBD::mysql perl module hasn't been fully installed, or perhaps the capitalisation of 'mysql' isn't right. Available drivers: DBM, ExampleP, File, Gofer, Mem, Proxy, Sponge. at /home/bio/bioapps/ensembl-vep/Bio/EnsEMBL/Registry.pm line 1769.

I tried to install the DBD::mysql module with cpan, but it failed with this error message:

fatal error: xlocale.h: No such file or directory

What I Did

./vep -i vcf_file.vcf --plugin MMSplice --vcf --force --assembly GRCh37 --cache --port 3337

cpan DBD::mysql

s6juncheng commented 5 years ago

Hi @yipukangda, thanks for trying MMSplice. First of all, MMSplice is developed and tested in Python 3. I'm not sure whether this caused the Perl error, but I will investigate. Have you tried using MMSplice from Python directly?

If you prefer a command-line tool, you can also use MMSplice from Kipoi. See the code example here: http://kipoi.org/models/MMSplice/deltaLogitPSI/. Installation instructions for Kipoi are here: http://kipoi.org/docs/#installation.

MuhammedHasan commented 5 years ago

Hi @yipukangda, thanks for using our VEP plugin. Can you test your VEP installation with the following command?

./vep -i examples/danio_rerio_GRCz10.vcf -o out.txt --database

You can find the VEP test file in test.vcf.

If this command fails, you probably have an environment-related VEP installation issue on Ubuntu 18. Please check a similar question on Stack Overflow.

Also, the VEP documentation recommends cpanm rather than cpan, so you could try installing VEP's dependencies with the following commands:

sudo apt-get install mysql-server
cpanm DBI
cpanm DBD::mysql

yipukangda commented 5 years ago

@s6juncheng @MuhammedHasan Thanks for the help. I tried as suggested, but it still does not work. I suspect something in my environment is incompatible, so I will try Docker next.

yipukangda commented 5 years ago

@s6juncheng @MuhammedHasan I built MMSplice into a Docker container, and it works now.

In addition, there seems to be a bug in the VEP plugin. I think it should be changed from

$self->{api_pid} = open3(my $python_stdin, my $python_stdout,  my $python_stderr, "mmsplice run");

to

$self->{api_pid} = open3(my $python_stdin, my $python_stdout,  my $python_stderr, "mmsplice run-api");

MuhammedHasan commented 5 years ago

Great that it worked, and thanks for reporting the bug. We recently updated both the Python package and the VEP plugin, and the new versions are not fully compatible with each other at the moment. We are working on it and will likely have both updated today; we will let you know then. We also plan to release a Docker container on Docker Hub soon.

s6juncheng commented 5 years ago

Hi @yipukangda, we released a new version of mmsplice on PyPI, on which the current Perl plugin in this repo is built. You can update the mmsplice Python package with pip install -U mmsplice and then use the latest Perl plugin. Let us know how it works, thanks.

yipukangda commented 5 years ago

@s6juncheng Thanks, I will try it soon.

yipukangda commented 5 years ago

Hi @s6juncheng, VEP now works with mmsplice without errors, excellent! By the way, how should the scores (mmsplice_alt_acceptor, mmsplice_alt_acceptor_intron, etc.) be interpreted? Are they Z-scores like the SPIDEX prediction scores? Thanks.

s6juncheng commented 5 years ago

Hi @yipukangda. The primary score one should look at is mmsplice_delta_logit_psi. This is a prediction of logit(Psi_alt) - logit(Psi_ref). If you have the reference Psi (Psi of the reference sequence), you can get the predicted Psi of the alternative sequence as: Psi_alt = sigmoid(logit(Psi_ref) + mmsplice_delta_logit_psi).
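
The conversion above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `logit`, `sigmoid`, and `predict_psi_alt`, and the clipping constant `eps`, are assumptions made for this example and are not part of the mmsplice package.

```python
import math


def logit(p, eps=1e-5):
    """Log-odds of p, clipped away from 0 and 1 to avoid infinities.

    The clipping value eps is an assumption for this sketch.
    """
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def sigmoid(x):
    """Inverse of logit: maps a log-odds value back to a probability."""
    return 1 / (1 + math.exp(-x))


def predict_psi_alt(psi_ref, delta_logit_psi):
    """Psi_alt = sigmoid(logit(Psi_ref) + mmsplice_delta_logit_psi)."""
    return sigmoid(logit(psi_ref) + delta_logit_psi)


# A score of 0 leaves the reference Psi unchanged.
print(predict_psi_alt(0.5, 0.0))
# A negative score lowers the predicted Psi relative to the reference.
print(predict_psi_alt(0.9, -1.5))
```

Because the score is added on the logit scale, the same score shifts Psi less when the reference Psi is already close to 0 or 1.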

If you don't have a rough estimate of the reference Psi, the interpretation depends on the use case. If you want to find variants with a large effect on splicing, a cutoff at abs(mmsplice_delta_logit_psi)=1.5 is a good start. Larger absolute values indicate larger predicted effects.
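
A minimal sketch of the cutoff filter described above, assuming the scores have already been parsed into Python dictionaries. The record layout, the `variant` key, and the example values are hypothetical; only the `mmsplice_delta_logit_psi` field name and the 1.5 starting cutoff come from this thread.

```python
def large_effect_variants(records, cutoff=1.5):
    """Keep records whose |mmsplice_delta_logit_psi| is at least `cutoff`,
    sorted by decreasing absolute score."""
    hits = [r for r in records if abs(r["mmsplice_delta_logit_psi"]) >= cutoff]
    return sorted(
        hits,
        key=lambda r: abs(r["mmsplice_delta_logit_psi"]),
        reverse=True,
    )


# Made-up example records, for illustration only.
records = [
    {"variant": "var_a", "mmsplice_delta_logit_psi": -2.1},
    {"variant": "var_b", "mmsplice_delta_logit_psi": 0.3},
    {"variant": "var_c", "mmsplice_delta_logit_psi": 1.6},
]

for r in large_effect_variants(records):
    print(r["variant"], r["mmsplice_delta_logit_psi"])
```

Filtering on the absolute value keeps both variants predicted to decrease inclusion (negative scores) and those predicted to increase it (positive scores).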

yipukangda commented 5 years ago

Hi @s6juncheng, thanks so much. Psi_ref needs to be computed separately, right? I will try the 1.5 cutoff first.

s6juncheng commented 5 years ago

Hi @yipukangda, exactly, you would need to compute Psi_ref additionally based on split reads.
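
As a rough illustration of estimating Psi_ref from split reads, the sketch below uses one common convention: the fraction of junction-spanning reads that support exon inclusion. The function name and the way reads are counted are assumptions for this example; the thread does not specify which estimator the authors use.

```python
def psi_from_split_reads(inclusion_reads, exclusion_reads):
    """Estimate Psi as inclusion / (inclusion + exclusion).

    inclusion_reads: split reads supporting inclusion of the exon
    exclusion_reads: split reads that skip the exon
    Returns None when there is no coverage, since Psi is then undefined.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None
    return inclusion_reads / total


# 30 inclusion reads and 10 exclusion reads give Psi = 0.75.
print(psi_from_split_reads(30, 10))
```

Estimates from low read counts are noisy, so a minimum coverage threshold is usually worth applying before using the value as Psi_ref.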

yipukangda commented 5 years ago

@s6juncheng OK, no more questions, thanks for the help.

yipukangda commented 5 years ago

@s6juncheng Hi, I calculated scores for all possible SNVs in all RefSeq genes (within 15 bp upstream and downstream of each exon-intron boundary) using the mmsplice VEP plugin, but only 203 lines match the abs(mmsplice_delta_logit_psi)>1.5 criterion, and the summary statistics look odd, too:

count    2.670806e+07
mean    -4.484914e-02
std      2.225780e-01
min     -1.773251e+00
25%     -1.446264e-01
50%     -5.017973e-02
75%      5.274901e-02
max      9.973626e-01
Name: mmsplice_delta_logit_psi

This shows no result larger than 1 and very few less than -1.5. I tested a few hundred canonical splicing variants tagged DM in the HGMD database, and their scores show no significant difference from the overall score distribution. I do not know what to do next and would appreciate your help.

s6juncheng commented 5 years ago

Hi @yipukangda, I will test with some examples as you described. Will let you know soon.

s6juncheng commented 5 years ago

Hi @yipukangda. I'm not able to reproduce the issue as you described. Can you please send me a very small test variant file? chengju@in.tum.de

yipukangda commented 5 years ago

@s6juncheng Hi, an email has been sent.

s6juncheng commented 5 years ago

This happened because --port was not given in the user command, so the VEP plugin was not querying predictions from the Python package. We will release a corresponding update, thanks. See #5

MuhammedHasan commented 5 years ago

@yipukangda please see https://github.com/Ensembl/ensembl-vep/issues/337
