gagneurlab / MMSplice_MTSplice

Tissue-specific variant effect predictions on splicing
MIT License

error running example data - Input sequence acceptor intron length cannot be longer than the input sequence #40

Closed rhalperin closed 2 years ago

rhalperin commented 3 years ago

I am trying to run the example (as shown in the "Example Code" section of the README). I get an error on the prediction step:

>>> predictions = predict_all_table(model, dl, pathogenicity=True, splicing_efficiency=True)
0it [00:00, ?it/s][W::vcf_parse] INFO 'ALLELEID' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNDISDB' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNDN' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNHGVS' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNREVSTAT' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNSIG' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNVC' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNVCSO' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'GENEINFO' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'MC' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'ORIGIN' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'RS' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNVI' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'AF_EXAC' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNSIGCONF' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'AF_ESP' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'AF_TGP' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNDISDBINCL' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNDNINCL' is not defined in the header, assuming Type=String
[W::vcf_parse] INFO 'CLNSIGINCL' is not defined in the header, assuming Type=String
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rhalperin/.local/lib/python3.6/site-packages/mmsplice/mmsplice.py", line 277, in predict_all_table
    natural_scale=natural_scale, ref_psi_version=ref_psi_version)
  File "/home/rhalperin/.local/lib/python3.6/site-packages/mmsplice/mmsplice.py", line 225, in predict_on_dataloader
    natural_scale=natural_scale, ref_psi_version=ref_psi_version)
  File "/packages/python/3.6.0/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 255, in concat
    sort=sort,
  File "/packages/python/3.6.0/lib/python3.6/site-packages/pandas/core/reshape/concat.py", line 301, in __init__
    objs = list(objs)
  File "/home/rhalperin/.local/lib/python3.6/site-packages/mmsplice/mmsplice.py", line 151, in _predict_on_dataloader
    for batch in dt_iter:
  File "/home/rhalperin/.local/lib/python3.6/site-packages/tqdm/std.py", line 1130, in __iter__
    for obj in iterable:
  File "/home/rhalperin/.local/lib/python3.6/site-packages/mmsplice/exon_dataloader.py", line 278, in batch_iter
    for batch in super().batch_iter(batch_size, **kwargs):
  File "/home/rhalperin/.local/lib/python3.6/site-packages/kipoi_utils/data_utils.py", line 66, in batch_gen
    for x in iterable:
  File "/home/rhalperin/.local/lib/python3.6/site-packages/mmsplice/vcf_dataloader.py", line 142, in __next__
    return self._next(exon, variant, overhang)
  File "/home/rhalperin/.local/lib/python3.6/site-packages/mmsplice/exon_dataloader.py", line 257, in _next
    inputs['seq'] = self.spliter.split(inputs['seq'], overhang, exon)
  File "/home/rhalperin/.local/lib/python3.6/site-packages/mmsplice/exon_dataloader.py", line 109, in split
    assert intronl_len <= len(seq), "Input sequence acceptor intron" \
AssertionError: Input sequence acceptor intron length cannot be longer than the input sequence
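
For context, the assertion at the bottom of the traceback compares the left (acceptor-side) intron overhang against the length of the sequence that was fetched. The following is only a minimal sketch of that kind of check, not the actual MMSplice code: `split_sequence` and its slicing logic are hypothetical, and only the `overhang` pair and the assertion message are taken from the traceback.

```python
def split_sequence(seq, overhang):
    """Hypothetical helper: cut seq into (left intron, exon, right intron).

    overhang is a (left_intron_len, right_intron_len) pair, mirroring the
    `overhang` argument visible in the traceback. This is an illustration,
    not the library's implementation.
    """
    intronl_len, intronr_len = overhang

    # Same condition and message as the failing assertion in the traceback.
    assert intronl_len <= len(seq), (
        "Input sequence acceptor intron length cannot be longer "
        "than the input sequence"
    )
    # Assumed companion check for the right-hand (donor-side) overhang.
    assert intronr_len <= len(seq) - intronl_len, (
        "Right intron overhang does not fit in the remaining sequence"
    )

    exon_end = len(seq) - intronr_len
    return seq[:intronl_len], seq[intronl_len:exon_end], seq[exon_end:]


if __name__ == "__main__":
    # A sequence shorter than the requested left overhang triggers the error.
    try:
        split_sequence("ACGT", (10, 0))
    except AssertionError as err:
        print("AssertionError:", err)
```

In this sketch the error fires whenever the fetched sequence is shorter than the requested overhang, for example when the sequence comes back empty or truncated.
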

MuhammedHasan commented 3 years ago

Dear @rhalperin,

Sorry for the late reply.

I cannot reproduce the error. Can you please share your dependencies with pip freeze?

Probably the version of one of the dependencies is wrong.
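
If a full `pip freeze` is too noisy, the versions of just the packages that appear in the traceback can be printed directly. This is a minimal sketch using the standard-library `importlib.metadata` module, which needs Python 3.8 or newer (on older interpreters the `importlib_metadata` backport offers the same functions). The `installed_versions` helper and the choice of packages are assumptions made for illustration.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the traceback; adjust the list as needed.
    relevant = ["mmsplice", "kipoi", "kipoi-utils", "kipoiseq",
                "pandas", "tqdm", "pyfaidx", "pyranges"]
    for name, version in installed_versions(relevant).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```
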

rhalperin commented 3 years ago

Here you go:

bash-4.2$ pip freeze
absl-py==0.9.0
aiohttp==3.6.2
aiosqlite3==0.3.0
alabaster==0.7.10
appdirs==1.4.3
argcomplete==1.12.0
argh==0.26.2
arrow==0.15.7
asn1crypto==0.24.0
astor==0.7.1
astropy==3.0.5
astunparse==1.6.3
async-timeout==3.0.1
attrs==19.3.0
Babel==2.5.3
bcrypt==3.1.4
binaryornot==0.4.4
biopython==1.74
bleach==1.5.0
boto3==1.6.2
botocore==1.9.2
cachetools==4.1.1
certifi==2018.8.24
cffi==1.11.5
chardet==3.0.4
click==7.1.2
colorlog==4.1.0
concise==0.6.9
ConfigArgParse==0.13.0
confuse==1.0.0
cookiecutter==1.7.2
cryptography==2.3.1
cutadapt==1.15
cycler==0.10.0
Cython==0.28.4
datacache==1.1.5
datrie==0.7.1
decorator==4.3.0
deepTools==3.0.0
deprecation==2.1.0
descartes==1.1.0
docutils==0.14
entrypoints==0.2.2
filelock==3.0.12
future==0.16.0
gast==0.3.3
gffutils==0.10.1
google-auth==1.19.2
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.16.0
gtfparse==1.2.0
h5py==2.10.0
html5lib==0.9999999
idna==2.6
idna-ssl==1.1.0
imagesize==1.0.0
importlib-metadata==1.7.0
infi.systray==0.1.12
intervaltree==3.0.2
ipykernel==4.5.2
ipython==5.2.2
ipython-genutils==0.1.0
ipywidgets==5.2.2
isovar==0.9.0
jeepney==0.4.1
jetstream==1.5
Jinja2==2.10
jinja2-time==0.2.0
jmespath==0.9.3
jsonschema==2.6.0
jupyter==1.0.0
jupyter-client==5.0.0
jupyter-console==5.1.0
jupyter-core==4.3.0
Keras==2.2.4
Keras-Applications==1.0.6
Keras-Preprocessing==1.1.2
keyring==19.2.0
kipoi==0.6.29
kipoi-conda==0.2.2
kipoi-utils==0.3.8
kipoiseq==0.4.1
kiwisolver==1.1.0
logger==1.4
majiq==1.1.3a0
Markdown==3.0.1
MarkupSafe==1.0
matplotlib==3.1.1
memoized-property==1.0.3
mhcflurry==1.0.0
mhcnames==0.4.8
mhctools==1.6.17
mistune==0.7.3
mmsplice==2.0.0
mock==2.0.0
mpmath==0.19
multidict==4.7.6
natsort==5.0.3
nbconvert==5.1.1
nbformat==4.3.0
ncls==0.0.53
networkx==2.2
nose==1.3.7
notebook==4.4.1
numpy==1.17.1
numpydoc==0.7.0
oauthlib==3.1.0
ont-albacore==2.3.1
ont-fast5-api==0.4.1
opt-einsum==3.3.0
oyaml==0.9
packaging==16.8
pandas==0.25.3
pandocfilters==1.4.1
paramiko==2.4.2
patsy==0.5.1
pbr==3.0.1
pdfkit==0.6.1
pexpect==4.2.1
pickleshare==0.7.4
plotly==2.4.1
poyo==0.5.0
progressbar33==2.4
prompt-toolkit==1.0.13
protobuf==3.12.2
psutil==5.4.3
ptyprocess==0.5.1
py2bit==0.3.0
py4j==0.10.7
pyasn1==0.4.4
pyasn1-modules==0.2.8
pybedtools==0.7.10
pyBigWig==0.3.10
pycparser==2.19
pyensembl==1.8.0
pyfaidx==0.5.9
Pygments==2.2.0
PyGraph==0.2.1
pyliftover==0.4
pymongo==3.6.1
PyNaCl==1.3.0
pypandoc==1.4
pyparsing==2.2.0
pyranges==0.0.79
pyrle==0.0.31
pysam==0.15.3
pyspark==2.3.1
python-dateutil==2.6.1
python-slugify==4.0.1
pytz==2018.3
PyVCF==0.6.8
PyYAML==5.3.1
pyzmq==16.0.2
qtconsole==4.2.1
quicksect==0.1.0
ratelimiter==1.2.0.post0
related==0.7.2
requests==2.24.0
requests-oauthlib==1.3.0
requests-toolbelt==0.9.1
roman==3.1
rpy2==2.9.0
rsa==4.6
s3transfer==0.1.13
scikit-learn==0.19.2
scipy==1.4.1
SecretStorage==3.1.1
sercol==0.1.4
serializable==0.1.1
Shapely==1.7.0
shellinford==0.4.0
simplegeneric==0.8.1
simplejson==3.16.0
six==1.15.0
snakemake==5.2.1
snowballstemmer==1.2.1
sorted-nearest==0.0.31
sortedcontainers==2.1.0
Sphinx==1.7.1
sphinxcontrib-websupport==1.0.1
spladder==2.4.2
SQLAlchemy==1.2.4
statsmodels==0.10.1
stevedore==1.23.0
sympy==1.0
tabulate==0.8.7
tensorboard==1.13.1
tensorboard-plugin-wit==1.7.0
tensorflow==1.13.1
tensorflow-estimator==1.13.0
tensorflow-gpu==1.13.1
termcolor==1.1.0
terminado==0.6
testpath==0.3
text-unidecode==1.3
tinydb==4.1.1
tinytimer==0.0.0
tornado==4.4.2
tqdm==4.48.0
traitlets==4.3.2
twobitreader==3.1.7
typechecks==0.1.0
typing-extensions==3.7.4.2
ulid-py==0.0.7
urllib3==1.22
varcode==0.7.0
vaxrank==1.0.0
virtualenv==15.1.0
virtualenv-clone==0.2.6
virtualenvwrapper==4.7.2
wcwidth==0.1.7
websockets==8.0.2
Werkzeug==0.14.1
whatshap==0.14.1
widgetsnbextension==1.2.6
wrapt==1.12.1
xlrd==1.1.0
XlsxWriter==1.1.2
xopen==0.3.2
xvfbwrapper==0.2.9
yarl==1.4.2
zipp==3.1.0

s6juncheng commented 3 years ago

Hi @rhalperin, my apologies for the late response. I can't locate what is wrong. Could you try updating MMSplice to the latest version via pip? If the same error still appears, I can test with an updated pip freeze file from you to investigate in more detail.
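
To see what actually changed after an upgrade, two `pip freeze` outputs can be compared package by package. The snippet below is a rough sketch that only handles plain `name==version` pins. `parse_freeze` and `diff_freeze` are hypothetical helper names, and the "after" versions in the usage example are made-up placeholders, not real release numbers.

```python
def parse_freeze(text):
    """Parse `name==version` tokens from pip freeze output into a dict."""
    pins = {}
    for token in text.split():
        if "==" in token:
            name, version = token.split("==", 1)
            # Compare names case-insensitively, as pip does.
            pins[name.lower()] = version
    return pins


def diff_freeze(old_text, new_text):
    """Return {package: (old_version, new_version)} for every difference."""
    old, new = parse_freeze(old_text), parse_freeze(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


if __name__ == "__main__":
    before = "mmsplice==2.0.0 numpy==1.17.1 pandas==0.25.3"
    # Placeholder versions for illustration only.
    after = "mmsplice==9.9.9\nnumpy==1.17.1\npandas==0.25.3"
    for name, (old, new) in diff_freeze(before, after).items():
        print(f"{name}: {old} -> {new}")
```

Splitting on whitespace means the parser accepts both a one-package-per-line file and output that was pasted as a single line.
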