zqfang / GSEApy

Gene Set Enrichment Analysis in Python
http://gseapy.rtfd.io/
BSD 3-Clause "New" or "Revised" License

dotplot fails for GSEA results with fewer than 15 items #244

Closed li1311139481 closed 7 months ago

li1311139481 commented 9 months ago

Setup

I am reporting a problem with the following GSEApy version, Python version, and operating system. The output of

import sys; print(sys.version)
import platform; print(platform.python_implementation()); print(platform.platform())
import gseapy; print(gseapy.__version__)

is:

3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
CPython
Linux-3.10.0-514.21.1.el7.x86_64-x86_64-with-glibc2.17
1.1.1

After running GSEA:

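import gseapy as gp
from gseapy import dotplot

# rnk (ranked gene list) and gene_sets are defined earlier
pre_res_2 = gp.prerank(rnk=rnk, gene_sets=gene_sets, threads=8,
                       permutation_num=1000, no_plot=True, max_size=1000, min_size=1)
res2 = pre_res_2.res2d
res2.head()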

I get the output: [image]

Then I want to plot a dotplot:

res_plot = res2.sort_values(by=["FDR q-val", 'FWER p-val', 'NES'],
                            ascending=[True, True, True], inplace=False).head(15)
print(res_plot.shape)
dotplot(res_plot,
        x="NES",
        column="FDR q-val",
        y_order=list(res_plot.sort_values(by=['NES'], ascending=[True])["Term"]))

I get what I want: [image]

But when I use

res_plot = res2.sort_values(by=["FDR q-val", 'FWER p-val', 'NES'],
                            ascending=[True, True, True], inplace=False).head(10)
print(res_plot.shape)
dotplot(res_plot,
        x="NES",
        column="FDR q-val",
        y_order=list(res_plot.sort_values(by=['NES'], ascending=[True])["Term"]))

I get an error like this:

---------------------------------------------------------------------------
IntCastingNaNError                        Traceback (most recent call last)
Cell In[176], line 4
      1 res_plot = res2.sort_values(by=["FDR q-val", 'FWER p-val', 'NES'],
      2                             ascending=[True, True, True], inplace=False).head(10)
      3 print(res_plot.shape)
----> 4 dotplot(res_plot,
      5         x="NES",
      6         column="FDR q-val",
      7         y_order=list(res_plot.sort_values(by=['NES'], ascending=[True])["Term"]))

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/gseapy/plot.py:1166, in dotplot(df, column, x, y, x_order, y_order, title, cutoff, top_term, size, figsize, cmap, ofname, xticklabels_rot, yticklabels_rot, marker, show_ring, **kwargs)
   1148     return
   1150 dot = DotPlot(
   1151     df=df,
   1152     x=x,
   (...)
   1164     marker=marker,
   1165 )
-> 1166 ax = dot.scatter(outer_ring=show_ring)
   1168 if xticklabels_rot:
   1169     for label in ax.get_xticklabels():

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/gseapy/plot.py:822, in DotPlot.scatter(self, outer_ring)
    813 # scatter colormap range
    814 # df = df.assign(colmap=self.data[self.colname].round().astype("int"))
    815 # make area bigger to better visualization
    816 # area = df["Hits_ratio"] * plt.rcParams["lines.linewidth"] * 100
    817 df = self.data.assign(
    818     area=(
    819         self.data["Hits_ratio"] * self.scale * plt.rcParams["lines.markersize"]
    820     ).pow(2)
    821 )
--> 822 colmap = df[self.colname].astype(int)
    823 vmin = np.percentile(colmap.min(), 2)
    824 vmax = np.percentile(colmap.max(), 98)

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/generic.py:6637, in NDFrame.astype(self, dtype, copy, errors)
   6631     results = [
   6632         ser.astype(dtype, copy=copy, errors=errors) for _, ser in self.items()
   6633     ]
   6635 else:
   6636     # else, only a single dtype is given
-> 6637     new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   6638     res = self._constructor_from_mgr(new_data, axes=new_data.axes)
   6639     return res.__finalize__(self, method="astype")

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/internals/managers.py:431, in BaseBlockManager.astype(self, dtype, copy, errors)
    428 elif using_copy_on_write():
    429     copy = False
--> 431 return self.apply(
    432     "astype",
    433     dtype=dtype,
    434     copy=copy,
    435     errors=errors,
    436     using_cow=using_copy_on_write(),
    437 )

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/internals/managers.py:364, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
    362         applied = b.apply(f, **kwargs)
    363     else:
--> 364         applied = getattr(b, f)(**kwargs)
    365     result_blocks = extend_blocks(applied, result_blocks)
    367 out = type(self).from_blocks(result_blocks, self.axes)

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/internals/blocks.py:758, in Block.astype(self, dtype, copy, errors, using_cow, squeeze)
    755         raise ValueError("Can not squeeze with more than one column.")
    756     values = values[0, :]  # type: ignore[call-overload]
--> 758 new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
    760 new_values = maybe_coerce_values(new_values)
    762 refs = None

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/dtypes/astype.py:237, in astype_array_safe(values, dtype, copy, errors)
    234     dtype = dtype.numpy_dtype
    236 try:
--> 237     new_values = astype_array(values, dtype, copy=copy)
    238 except (ValueError, TypeError):
    239     # e.g. _astype_nansafe can fail on object-dtype of strings
    240     #  trying to convert to float
    241     if errors == "ignore":

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/dtypes/astype.py:182, in astype_array(values, dtype, copy)
    179     values = values.astype(dtype, copy=copy)
    181 else:
--> 182     values = _astype_nansafe(values, dtype, copy=copy)
    184 # in pandas we don't store numpy str dtypes, so convert to object
    185 if isinstance(dtype, np.dtype) and issubclass(values.dtype.type, str):

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/dtypes/astype.py:101, in _astype_nansafe(arr, dtype, copy, skipna)
     96     return lib.ensure_string_array(
     97         arr, skipna=skipna, convert_na_value=False
     98     ).reshape(shape)
    100 elif np.issubdtype(arr.dtype, np.floating) and dtype.kind in "iu":
--> 101     return _astype_float_to_int_nansafe(arr, dtype, copy)
    103 elif arr.dtype == object:
    104     # if we have a datetime/timedelta array of objects
    105     # then coerce to datetime64[ns] and use DatetimeArray.astype
    107     if lib.is_np_dtype(dtype, "M"):

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/dtypes/astype.py:145, in _astype_float_to_int_nansafe(values, dtype, copy)
    141 """
    142 astype with a check preventing converting NaN to an meaningless integer value.
    143 """
    144 if not np.isfinite(values).all():
--> 145     raise IntCastingNaNError(
    146         "Cannot convert non-finite values (NA or inf) to integer"
    147     )
    148 if dtype.kind == "u":
    149     # GH#45151
    150     if not (values >= 0).all():

IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer
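
The last frames of the traceback show pandas refusing to cast a float column that contains NaN or inf to int. The sketch below reproduces that behaviour with pandas alone. It assumes, without confirmation from this thread, that dotplot applies a -log10 style transform to the "FDR q-val" column before the cast, so that an FDR of exactly 0 becomes inf. The toy values and the names `fdr`, `scores`, `has_nonfinite` and `cast_error` are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy FDR values; the first one is exactly 0, as in the results shown in this thread.
fdr = pd.Series([0.0, 0.01, 0.2], name="FDR q-val")

# Assumed transform (not confirmed here): -log10 turns an FDR of 0 into +inf.
with np.errstate(divide="ignore"):
    scores = -np.log10(fdr)

has_nonfinite = not np.isfinite(scores).all()

# Casting a float Series that contains inf or NaN to int is rejected by pandas.
try:
    scores.astype(int)
    cast_error = None
except ValueError as exc:  # IntCastingNaNError is a ValueError subclass
    cast_error = exc

print(has_nonfinite, type(cast_error).__name__)
```

If this assumption holds, the number of rows would matter only indirectly: what triggers the error is a non-finite value in the transformed column.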

Another problem: in the following call,

print(res2.shape)
dotplot(res2,
        x="NES",
        column="FDR q-val", top_term=10)

top_term has no effect: [image]

If x is not specified, top_term works, but the terms are not sorted by NES:

print(res2.shape)
dotplot(res2,
        column="FDR q-val", top_term=10)

[image]

You might recommend using cutoff to control the FDR filtering threshold, but the FDR of the top terms is usually 0, so I prefer to select terms by the absolute value of NES in descending order.

I just want the terms on the Y axis to be ordered from bottom to top by increasing NES on the X axis.

# Filter my data
res2['abs_NES'] = res2['NES'].abs()
res_plot = res2.sort_values(by=["FDR q-val", 'FWER p-val', 'abs_NES'],
                            ascending=[True, True, False], inplace=False, ignore_index=True).head(15)
# After getting the data, sort it by NES for plotting
res_plot = res_plot.sort_values(by=['NES'], ascending=[True], ignore_index=True)
res_plot
dotplot(res_plot,
        column="FDR q-val", top_term=15)

[image] I can't move the top term to the bottom.

li1311139481 commented 9 months ago

Hi, can anyone help?

chrissymkcn commented 8 months ago

I am experiencing the same issue with gseapy 1.0.4, even with more than 15 top terms. I have tried converting the plotted column (FDR q-val) to np.float64 and plotting other p-value columns.
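
Changing the dtype to np.float64 does not change the values themselves, so zeros or missing entries in the column would still be there afterwards. A small diagnostic sketch for checking what the plotted column actually contains follows. `describe_plot_column` is a hypothetical helper, not part of GSEApy, and the demo frame is made-up data.

```python
import numpy as np
import pandas as pd


def describe_plot_column(df: pd.DataFrame, column: str = "FDR q-val") -> dict:
    """Summarise values in `column` that could become non-finite after a log transform."""
    # Coerce to numeric so that strings or None show up as missing values.
    values = pd.to_numeric(df[column], errors="coerce")
    return {
        "dtype": str(df[column].dtype),
        "n_rows": int(len(values)),
        "n_zero": int((values == 0).sum()),
        "n_missing": int(values.isna().sum()),
        "n_infinite": int(np.isinf(values).sum()),
        "all_zero": bool(len(values) > 0 and (values == 0).all()),
    }


# Made-up example in which every FDR is exactly 0.
demo = pd.DataFrame({"Term": ["A", "B", "C"], "FDR q-val": [0.0, 0.0, 0.0]})
report = describe_plot_column(demo)
print(report)
```

A non-zero `n_zero` or `n_missing` would point to the values, rather than the dtype, as the thing to look at.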

li1311139481 commented 8 months ago

@chrissymkcn Thanks. I tried that, but it did not work.

li1311139481 commented 8 months ago

I solved the problem:

3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
CPython
Linux-3.10.0-514.21.1.el7.x86_64-x86_64-with-glibc2.17
1.0.4

pre_res = gp.prerank(rnk=rnk, gene_sets=gene_sets, threads=8,
                     permutation_num=1000, no_plot=True, max_size=1000, min_size=5)
res = pre_res.res2d
res.index = res["Term"]
res['abs_NES'] = res['NES'].abs()

res_plot = res.sort_values(by=["FDR q-val", 'FWER p-val', 'abs_NES'],
                           ascending=[True, True, False], inplace=False, ignore_index=True).head(15)
res_plot = res_plot.sort_values(
     by=['NES'], ascending=[True], ignore_index=True)
if max(res_plot["FDR q-val"]) <= 0.05:
    dotplot(res_plot, column="FDR q-val", top_term=15, y_order=list(
        res_plot["Term"]), ofname=f'{out_file}/GSEA.{gs_name}.dotplot.pdf')
elif max(res_plot["FDR q-val"]) > 0.05:
    dotplot(
        res_plot,
        column="FDR q-val",
        cutoff=max(res_plot["FDR q-val"]),
        y_order=list(res_plot["Term"]),
        top_term=15, ofname=f'{out_file}/GSEA.{gs_name}.dotplot.pdf'
    )

Thanks

li1311139481 commented 7 months ago

Sorry to reopen this issue, but the problem still exists.

Although I found a workaround, I don't think it is the best approach, so I reopened the issue. Does anyone have a better way?
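
One possible alternative, sketched under the assumption that the failure comes from FDR values of exactly 0 (not confirmed in this thread), is to raise those zeros to a small positive floor before plotting. `floor_zero_pvalues` is a hypothetical helper, not a GSEApy function, and the floor of 1e-5 is an arbitrary choice that only affects how the colour scale saturates.

```python
import numpy as np
import pandas as pd


def floor_zero_pvalues(df: pd.DataFrame, column: str = "FDR q-val",
                       floor: float = 1e-5) -> pd.DataFrame:
    """Return a copy of df with `column` clipped to the range [floor, 1.0]."""
    out = df.copy()  # leave the caller's frame untouched
    out[column] = out[column].astype(float).clip(lower=floor, upper=1.0)
    return out


# Made-up example data with two FDR values of exactly 0.
demo = pd.DataFrame({
    "Term": ["A", "B", "C"],
    "NES": [3.6, 3.4, -2.1],
    "FDR q-val": [0.0, 0.0, 0.03],
})
fixed = floor_zero_pvalues(demo)

# After flooring, a -log10 transform stays finite for every row.
all_finite = bool(np.isfinite(-np.log10(fixed["FDR q-val"])).all())
print(fixed["FDR q-val"].tolist(), all_finite)
```

The returned frame could then be passed to dotplot in place of the original one. Missing values are left as NaN by `clip`, so they would still need separate handling.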

3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
CPython
Linux-3.10.0-514.21.1.el7.x86_64-x86_64-with-glibc2.17
1.0.4

print(res_plot)

    Unnamed: 0  Name     Term                                               ES        NES       NOM p-val  FDR q-val  FWER p-val  Tag %    Gene %  Lead_genes                                         abs_NES
0   0? 

dotplot(
    res_plot,
    column="FDR q-val",
    top_term=15,
    y_order=list(res_plot["Term"]),
    ofname=f"{out_file}/GSEA.msigdb.dotplot.pdf",
)

---------------------------------------------------------------------------
IntCastingNaNError                        Traceback (most recent call last)
Cell In[23], [line 1](vscode-notebook-cell:?execution_count=23&line=1)
----> [1](vscode-notebook-cell:?execution_count=23&line=1) dotplot(
      [2](vscode-notebook-cell:?execution_count=23&line=2)     res_plot,
      [3](vscode-notebook-cell:?execution_count=23&line=3)     column="FDR q-val",
      [4](vscode-notebook-cell:?execution_count=23&line=4)     # top_term=15,
      [5](vscode-notebook-cell:?execution_count=23&line=5)     # y_order=list(res_plot["Term"]),
      [6](vscode-notebook-cell:?execution_count=23&line=6)     # ofname=f"{out_file}/GSEA.msigdb.dotplot.pdf",
      [7](vscode-notebook-cell:?execution_count=23&line=7) )

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/gseapy/plot.py:1057, in dotplot(df, column, x, y, x_order, y_order, title, cutoff, top_term, size, figsize, cmap, ofname, xticklabels_rot, yticklabels_rot, marker, show_ring, **kwargs)
   1039     return
   1041 dot = DotPlot(
   1042     df=df,
   1043     x=x,
   (...)
   1055     marker=marker,
   1056 )
-> 1057 ax = dot.scatter(outer_ring=show_ring)
   1059 if xticklabels_rot:
   1060     for label in ax.get_xticklabels():

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/gseapy/plot.py:731, in DotPlot.scatter(self, outer_ring)
    722 # scatter colormap range
    723 # df = df.assign(colmap=self.data[self.colname].round().astype("int"))
    724 # make area bigger to better visualization
    725 # area = df["Hits_ratio"] * plt.rcParams["lines.linewidth"] * 100
    726 df = self.data.assign(
    727     area=(
    728         self.data["Hits_ratio"] * self.scale * plt.rcParams["lines.markersize"]
    729     ).pow(2)
    730 )
--> 731 colmap = df[self.colname].astype(int)
    732 vmin = np.percentile(colmap.min(), 2)
    733 vmax = np.percentile(colmap.max(), 98)

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/generic.py:5815, in NDFrame.astype(self, dtype, copy, errors)
   5808     results = [
   5809         self.iloc[:, i].astype(dtype, copy=copy)
   5810         for i in range(len(self.columns))
   5811     ]
   5813 else:
   5814     # else, only a single dtype is given
-> 5815     new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   5816     return self._constructor(new_data).__finalize__(self, method="astype")
   5818 # GH 33113: handle empty frame or series

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/internals/managers.py:418, in BaseBlockManager.astype(self, dtype, copy, errors)
    417 def astype(self: T, dtype, copy: bool = False, errors: str = "raise") -> T:
--> 418     return self.apply("astype", dtype=dtype, copy=copy, errors=errors)

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/internals/managers.py:327, in BaseBlockManager.apply(self, f, align_keys, ignore_failures, **kwargs)
    325         applied = b.apply(f, **kwargs)
    326     else:
--> 327         applied = getattr(b, f)(**kwargs)
    328 except (TypeError, NotImplementedError):
    329     if not ignore_failures:

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/internals/blocks.py:591, in Block.astype(self, dtype, copy, errors)
    573 """
    574 Coerce to the new dtype.
    575
   (...)
    587 Block
    588 """
    589 values = self.values
--> 591 new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
    593 new_values = maybe_coerce_values(new_values)
    594 newb = self.make_block(new_values)

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/dtypes/cast.py:1309, in astype_array_safe(values, dtype, copy, errors)
   1306 dtype = pandas_dtype(dtype)
   1308 try:
-> 1309     new_values = astype_array(values, dtype, copy=copy)
   1310 except (ValueError, TypeError):
   1311     # e.g. astype_nansafe can fail on object-dtype of strings
   1312     #  trying to convert to float
   1313     if errors == "ignore":

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/dtypes/cast.py:1257, in astype_array(values, dtype, copy)
   1254     values = values.astype(dtype, copy=copy)
   1256 else:
-> 1257     values = astype_nansafe(values, dtype, copy=copy)
   1259 # in pandas we don't store numpy str dtypes, so convert to object
   1260 if isinstance(dtype, np.dtype) and issubclass(values.dtype.type, str):

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/dtypes/cast.py:1168, in astype_nansafe(arr, dtype, copy, skipna)
   1165     raise TypeError(f"cannot astype a timedelta from [{arr.dtype}] to [{dtype}]")
   1167 elif np.issubdtype(arr.dtype, np.floating) and np.issubdtype(dtype, np.integer):
-> 1168     return astype_float_to_int_nansafe(arr, dtype, copy)
   1170 elif is_object_dtype(arr):
   1171
   1172     # work around NumPy brokenness, #1987
   1173     if np.issubdtype(dtype.type, np.integer):

File /cluster/facility/hlhuang/miniconda3/envs/py/lib/python3.10/site-packages/pandas/core/dtypes/cast.py:1213, in astype_float_to_int_nansafe(values, dtype, copy)
   1209 """
   1210 astype with a check preventing converting NaN to an meaningless integer value.
   1211 """
   1212 if not np.isfinite(values).all():
-> 1213     raise IntCastingNaNError(
   1214         "Cannot convert non-finite values (NA or inf) to integer"
   1215     )
   1216 return values.astype(dtype, copy=copy)

IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer
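
The last frame of the traceback can be reproduced with pandas alone. The sketch below only illustrates that final step: casting a float column to int fails as soon as the column contains NaN or inf. The helper name can_cast_to_int is made up for this example, and the traceback does not show how a non-finite value gets into the column inside dotplot.

```python
import numpy as np
import pandas as pd


def can_cast_to_int(values):
    """Return True if a float Series casts to int, False if pandas refuses."""
    try:
        pd.Series(values, dtype="float64").astype(int)
        return True
    except ValueError:
        # IntCastingNaNError is a subclass of ValueError.
        return False


print(can_cast_to_int([0.0, 1.5, 3.0]))     # only finite values
print(can_cast_to_int([0.0, np.nan, 3.0]))  # contains NaN
print(can_cast_to_int([0.0, np.inf, 3.0]))  # contains inf
```

Running a check like this on the color column before plotting would show whether the input itself already holds non-finite values or whether they only appear during plotting.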

I found that the error occurs because every FDR q-val in my subset is 0; if I set one of them to a nonzero value, the plot works. My guess is that dotplot needs a range of FDR q-val values in order to build the color gradient. For example:

res_plot.loc[14, ["FDR q-val"]] = 0.00001
dotplot(
    res_plot,
    column="FDR q-val",
    cutoff=max(res_plot["FDR q-val"]),
    y_order=list(res_plot["Term"]),
    top_term=15
)

image

Another example

res_plot.loc[0, ["FDR q-val"]] = 1e-20
res_plot.loc[14, ["FDR q-val"]] = 1e-5
dotplot(
    res_plot,
    column="FDR q-val",
    cutoff=max(res_plot["FDR q-val"]),
    y_order=list(res_plot["Term"]),
    top_term=15,
)

image

The color bar range matches the values I specified, so my guess appears to be correct.

Next, to keep plotting from failing in my automated pipeline, I did the following. I know this does not represent the real FDR q-val values accurately, but it is good enough for plotting.

import numpy as np

# If every FDR q-val is 0, substitute evenly spaced placeholder values
# (for plotting only; these are not real q-values).
if res_plot["FDR q-val"].sum() == 0:
    res_plot["FDR q-val"] = np.linspace(1e-9, 1e-8, num=len(res_plot))
dotplot(
    res_plot,
    column="FDR q-val",
    cutoff=max(res_plot["FDR q-val"]),
    y_order=list(res_plot["Term"]),
    top_term=15,
)

image
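
The same workaround can be wrapped in a small helper so the pipeline does not modify the original results in place. This is only a sketch under my own assumptions: the function name with_placeholder_qvals and its low/high defaults are made up for illustration, and the demo frame is toy data, not real GSEA output.

```python
import numpy as np
import pandas as pd


def with_placeholder_qvals(df, column="FDR q-val", low=1e-9, high=1e-8):
    """Return a copy of df; if `column` is all zeros, fill it with evenly
    spaced placeholder values so a color gradient can be drawn."""
    out = df.copy()  # leave the caller's frame untouched
    if (out[column] == 0).all():
        out[column] = np.linspace(low, high, num=len(out))
    return out


# Toy data standing in for the top rows of a prerank result.
demo = pd.DataFrame(
    {
        "Term": ["a", "b", "c"],
        "NES": [1.2, -0.8, 2.1],
        "FDR q-val": [0.0, 0.0, 0.0],
    }
)
patched = with_placeholder_qvals(demo)
print(patched["FDR q-val"].tolist())
```

The patched frame can then be passed to dotplot in place of res_plot; the placeholder values only exist to give the color bar a range and should not be reported as q-values.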

zqfang commented 7 months ago

I will try to investigate this. Sorry, I just haven't had enough time recently.

Can you share your data frame output with me? You can email it to me if you don't want to post it here.

chrissymkcn commented 7 months ago

Meanwhile, another workaround, I think, is to plot more terms (if you have them), export the figure as SVG, and then adjust the graph manually if really necessary.

li1311139481 commented 7 months ago

@zqfang Thank you for your reply. There's no need to apologize; we should be thanking you for your work. This is my original data frame: GSEA.msigdb.stats.csv. In case it helps, this is my preprocessing code:

import numpy as np
import pandas as pd
from gseapy import dotplot

res = pd.read_csv("GSEA.msigdb.stats.csv")
res.index = res["Term"]
res["abs_NES"] = res["NES"].abs()

# Keep the 15 most significant terms, breaking ties by the largest |NES|.
res_plot = res.sort_values(
    by=["FDR q-val", "FWER p-val", "abs_NES"],
    ascending=[True, True, False],
    inplace=False,
    ignore_index=True,
).head(15)
res_plot = res_plot.sort_values(by=["NES"], ascending=[True], ignore_index=True)

# If every FDR q-val is 0, substitute evenly spaced placeholders (plotting only).
if res_plot["FDR q-val"].sum() == 0:
    res_plot["FDR q-val"] = np.linspace(1e-9, 1e-8, num=len(res_plot))

dotplot(
    res_plot,
    column="FDR q-val",
    cutoff=max(res_plot["FDR q-val"]),
    y_order=list(res_plot["Term"]),
    top_term=15,
)

li1311139481 commented 7 months ago

Meanwhile, another workaround, I think, is to plot more terms (if you have them), export the figure as SVG, and then adjust the graph manually if really necessary.

Haha, the slowest way is actually the fastest way. Since I'm so obsessed with automation, I've been spending quite a bit of time on it.

zqfang commented 7 months ago

@li1311139481, the initial dotplot issue is caused by missing color values in the FDR column:

  1. Please ensure that the colormap column in your data frame does not consist entirely of zeros (this is where the bug is).
    • Internally, the function replaces 0s with the lowest Pval/FDR value in your data frame, for visualization purposes only.
    • I added a value check to help users pass appropriate data (for example, when you subset the first 15 rows, they are all 0s).
  2. The second issue, the NES order on the x-axis, is fixed now.
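
To make point 1 concrete, here is a rough sketch of what a value check plus zero replacement could look like. It is not the code in GSEApy: the function name prepare_color_column is hypothetical, and it assumes that "lowest" means the smallest nonzero value in the column.

```python
import pandas as pd


def prepare_color_column(df, column="FDR q-val"):
    """Hypothetical helper (not GSEApy's code): reject an all-zero color
    column, otherwise replace 0s with the smallest nonzero value."""
    values = df[column].astype(float)
    nonzero = values[values > 0]
    if nonzero.empty:
        # Nothing to borrow a replacement from, so no color range exists.
        raise ValueError(
            f"column {column!r} has no nonzero values; "
            "cannot derive a color range"
        )
    # Only exact zeros are replaced; other values are left as they are.
    return values.mask(values == 0, nonzero.min())


# Toy data for illustration only.
cleaned = prepare_color_column(pd.DataFrame({"FDR q-val": [0.0, 0.01, 0.2]}))
print(cleaned.tolist())
```

With a subset where every value is 0, a check like this raises a clear ValueError up front instead of failing later during the integer cast.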

li1311139481 commented 7 months ago

@zqfang OK, thanks for your help. I'll wait for the update in the next conda release.