Open olp-cs opened 1 month ago
The test is failing on platforms:

Update: The test passes with Python 3.8 and pandas 1.3.0, so this appears to be a backward-compatibility issue with pandas.

Failed test: test_GEOparse.py:626 (TestGSE.test_merge_and_average)

    TypeError: agg function failed [how->mean,dtype->object]
    TypeError: Could not convert string 'DNA segment, Chr 8, ERATO Doi 594, expressed' to numeric

The test fails on this line: ..\src\GEOparse\GEOTypes.py:445, in annotate_and_average

    tmp_data = tmp_data.groupby(group_by_column).mean()[[expression_column]]

where
tmp_data is a pandas dataframe that contains both numeric and string columns (attached: tmp_data.csv);
expression_column = 'VALUE';
group_by_column = 'GB_ACC'.

Jupyter notebook reproducing the issue: https://gist.github.com/olp-cs/9902b5cdc554afbf3faa7127ee602f20

Would it make sense to filter the columns first, to keep the numerical ones only?
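A minimal sketch of what that filtering could look like. The small dataframe is a made-up stand-in for tmp_data (only the column names 'GB_ACC' and 'VALUE' and the one string value come from this report; the 'Description' column name is hypothetical), and this is a suggestion rather than a tested patch for GEOTypes.py. As far as I can tell, newer pandas versions no longer silently drop non-numeric columns in groupby(...).mean(), which would explain why the string column now raises. Selecting the expression column before aggregating avoids the problem.

```python
import pandas as pd

# Hypothetical stand-in for tmp_data: one numeric and one string column.
tmp_data = pd.DataFrame({
    "GB_ACC": ["A1", "A1", "B2"],
    "VALUE": [1.0, 3.0, 5.0],
    "Description": [  # hypothetical column name
        "x",
        "y",
        "DNA segment, Chr 8, ERATO Doi 594, expressed",
    ],
})
group_by_column = "GB_ACC"
expression_column = "VALUE"

# Option 1: select the expression column first, then average.
# Only 'VALUE' is aggregated, so the string column is never touched.
averaged = tmp_data.groupby(group_by_column)[[expression_column]].mean()

# Option 2: keep the original order of operations but ask pandas
# to skip non-numeric columns explicitly.
averaged_alt = tmp_data.groupby(group_by_column).mean(numeric_only=True)[
    [expression_column]
]

print(averaged)
```

Option 1 seems preferable to me because it does not depend on how pandas treats non-numeric columns, and it avoids averaging columns that are discarded anyway.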