guma44 / GEOparse

Python library to access Gene Expression Omnibus Database (GEO)
BSD 3-Clause "New" or "Revised" License

Failing test: `test_merge_and_average` fails with a TypeError with pandas 2.0.3 #84

Open olp-cs opened 1 month ago

olp-cs commented 1 month ago

The test is failing on platforms:

Update: The test passes with Python 3.8 and pandas 1.3.0; it seems to be a backward compatibility issue with pandas.

Failed test:

test_GEOparse.py:626 (TestGSE.test_merge_and_average)

TypeError: agg function failed [how->mean,dtype->object]

TypeError: Could not convert string 'DNA segment, Chr 8, ERATO Doi 594, expressed' to numeric

The test fails on this line:

..\src\GEOparse\GEOTypes.py:445: in annotate_and_average

tmp_data = tmp_data.groupby(group_by_column).mean()[[expression_column]]

where

Jupyter notebook reproducing the issue: https://gist.github.com/olp-cs/9902b5cdc554afbf3faa7127ee602f20

Would it make sense to filter the columns first, to keep the numerical ones only?
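To illustrate the suggestion, here is a minimal sketch of two ways to restrict the aggregation to numeric data. The toy DataFrame and its column names (`GENE_SYMBOL`, `DESCRIPTION`, `VALUE`) are made up for illustration and are not taken from the GEOparse test fixtures; only `group_by_column` and `expression_column` mirror the names on the failing line. The likely cause is that pandas 2.0 changed the default of `numeric_only` in `groupby(...).mean()` to `False`, so string columns are no longer dropped silently.

```python
import pandas as pd

# Hypothetical toy data: one string annotation column next to the expression values.
tmp_data = pd.DataFrame({
    "GENE_SYMBOL": ["A", "A", "B"],
    "DESCRIPTION": ["first probe", "second probe", "third probe"],
    "VALUE": [1.0, 3.0, 5.0],
})
group_by_column = "GENE_SYMBOL"
expression_column = "VALUE"

# Option 1: select the expression column before aggregating,
# so the string column never reaches mean().
averaged = tmp_data.groupby(group_by_column)[[expression_column]].mean()

# Option 2: keep the original order of operations, but ask pandas
# explicitly to skip non-numeric columns.
averaged_numeric = (
    tmp_data.groupby(group_by_column).mean(numeric_only=True)[[expression_column]]
)

print(averaged)
```

Both variants should give the same result on pandas 1.x and 2.x. Option 1 is arguably the smaller change, since it avoids averaging columns that are discarded right afterwards.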