mortazavilab / swan_vis

A Python library to visualize and analyze long-read transcriptomes
https://freese.gitbook.io/swan/
MIT License
57 stars, 11 forks

ES IR error #6

Closed ashokpatowary closed 4 years ago

ashokpatowary commented 4 years ago

Hi @fairliereese

I am getting the following error when looking for exon skipping and intron retention events. Other modules are working fine. Can you please look into it?

Thanks

es_genes = sg.find_es_genes()
print(es_genes[:5])

~/.local/lib/python3.7/site-packages/swan_vis/swangraph.py in find_es_genes(self)
   1092                         sub_G = self.G.subgraph(sub_nodes)
   1093                         sub_edges = list(sub_G.edges())
-> 1094                         sub_edges = self.edge_df.loc[sub_edges]
   1095                         sub_edges = sub_edges.loc[sub_edges.edge_type == 'exon']
   1096 

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1422 
   1423             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1424             return self._getitem_axis(maybe_callable, axis=axis)
   1425 
   1426     def _is_scalar_access(self, key: Tuple):

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1837                     raise ValueError("Cannot index with multidimensional key")
   1838 
-> 1839                 return self._getitem_iterable(key, axis=axis)
   1840 
   1841             # nested tuple slicing

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
   1131         else:
   1132             # A collection of keys
-> 1133             keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
   1134             return self.obj._reindex_with_indexers(
   1135                 {axis: [keyarr, indexer]}, copy=True, allow_dups=True

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
   1090 
   1091         self._validate_read_indexer(
-> 1092             keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
   1093         )
   1094         return keyarr, indexer

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1175                 raise KeyError(
   1176                     "None of [{key}] are in the [{axis}]".format(
-> 1177                         key=key, axis=self.obj._get_axis_name(axis)
   1178                     )
   1179                 )

KeyError: "None of [Index([(519693, 519694)], dtype='object', name='edge_id')] are in the [index]"

ir_genes = sg.find_ir_genes()
print(ir_genes[:5])

~/.local/lib/python3.7/site-packages/swan_vis/swangraph.py in find_ir_genes(self)
   1019                         sub_G = self.G.subgraph(sub_nodes)
   1020                         sub_edges = list(sub_G.edges())
-> 1021                         sub_edges = self.edge_df.loc[sub_edges]
   1022                         sub_edges = sub_edges.loc[sub_edges.edge_type == 'intron']
   1023 

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1422 
   1423             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1424             return self._getitem_axis(maybe_callable, axis=axis)
   1425 
   1426     def _is_scalar_access(self, key: Tuple):

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1837                     raise ValueError("Cannot index with multidimensional key")
   1838 
-> 1839                 return self._getitem_iterable(key, axis=axis)
   1840 
   1841             # nested tuple slicing

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_iterable(self, key, axis)
   1131         else:
   1132             # A collection of keys
-> 1133             keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
   1134             return self.obj._reindex_with_indexers(
   1135                 {axis: [keyarr, indexer]}, copy=True, allow_dups=True

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
   1090 
   1091         self._validate_read_indexer(
-> 1092             keyarr, indexer, o._get_axis_number(axis), raise_missing=raise_missing
   1093         )
   1094         return keyarr, indexer

~/.local/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
   1175                 raise KeyError(
   1176                     "None of [{key}] are in the [{axis}]".format(
-> 1177                         key=key, axis=self.obj._get_axis_name(axis)
   1178                     )
   1179                 )

KeyError: "None of [Index([(521455, 521457)], dtype='object', name='edge_id')] are in the [index]"
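
For readers following along: both tracebacks end at the same lookup, `self.edge_df.loc[sub_edges]`, and the KeyError reports an edge tuple that is not in the `edge_id` index. One way to picture that failure is a graph containing an edge that has no matching row in the edge table. The sketch below uses plain Python with made-up data, not swan_vis or pandas; `graph_edges`, `edge_table`, `missing_edges`, and `lookup_known` are hypothetical names for illustration only.

```python
# Hypothetical stand-ins for the real structures (NOT swan_vis internals):
# a list of graph edges and an edge table keyed by (source, target) tuples.
graph_edges = [(1, 2), (2, 3), (3, 5)]
edge_table = {
    (1, 2): "exon",
    (2, 3): "intron",
}


def missing_edges(edges, table):
    """Return the edges that have no entry in the table, in original order."""
    return [edge for edge in edges if edge not in table]


def lookup_known(edges, table):
    """Look up only the edges that are present, skipping the missing ones."""
    return {edge: table[edge] for edge in edges if edge in table}


missing = missing_edges(graph_edges, edge_table)
known = lookup_known(graph_edges, edge_table)

print("edges missing from the table:", missing)
print("edges found in the table:", known)
```

Listing the missing keys before doing the lookup shows which edges are out of sync. It does not explain why they are missing, which is what still needs to be tracked down here.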

fairliereese commented 4 years ago

I'm assuming this is with the data you've already shared with me? I'll try running it on my end.

ashokpatowary commented 4 years ago

@fairliereese

Yes, I am seeing it with that dataset as well.

Thanks

fairliereese commented 4 years ago

It looks like this is a pandas version issue. What version of pandas are you running? You can find out by running the following in Python.

import pandas as pd
print(pd.__version__)

ashokpatowary commented 4 years ago

It's 0.25.2

ashokpatowary commented 4 years ago

@fairliereese which version does it work on? I can install that specific version.

fairliereese commented 4 years ago

I am still trying to figure it out. I just got it to work on version 1.0.5 but I'm having trouble recreating the error that I got the first time around.

ashokpatowary commented 4 years ago

Thanks @fairliereese

Below is the list of pandas dependency versions I have installed, in case it helps.


INSTALLED VERSIONS
------------------
commit           : None
python           : 3.7.0.final.0
python-bits      : 64
OS               : Linux
OS-release       : 2.6.32-754.23.1.el6.x86_64
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 0.25.2
numpy            : 1.17.3
pytz             : 2019.3
dateutil         : 2.8.0
pip              : 19.1.1
setuptools       : 41.4.0
Cython           : 0.29.7
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.3.3
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 2.10.3
IPython          : 7.8.0
pandas_datareader: None
bs4              : None
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.3.3
matplotlib       : 3.1.1
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : None
pytables         : None
s3fs             : None
scipy            : 1.4.1
sqlalchemy       : None
tables           : None
xarray           : None
xlrd             : None
xlwt             : None
xlsxwriter       : None

fairliereese commented 4 years ago

Looks like my first intuition was wrong. I missed a case in implementing support for GTFs with entries in the order that yours are... I'll see what I can do. Thanks for being a great beta tester!

fairliereese commented 4 years ago

Okay, I think everything should be in working order for you now. I was able to run everything with your data!

Unfortunately, after installing the new version, you will have to re-add all your datasets to the SwanGraph. I hope this works!

ashokpatowary commented 4 years ago

Thanks @fairliereese for your quick response. I will re-add the datasets and start fresh.

Thanks again.