quinlan-lab / STRling

Detect novel (and reference) STR expansions from short-read data
MIT License

strling-outliers.py: Incompatibility with newer Pandas version #119

Open ryanmcggg opened 4 months ago

ryanmcggg commented 4 months ago

Hello, it looks like the strling-outliers.py script uses a Pandas indexing feature that is no longer supported. I currently have Pandas 2.0.3, which I believe conda installed automatically when I followed the STRling installation instructions. It looks like the Pandas indexing will need to be rewritten to keep up to date. In the meantime, can you tell me which version of Pandas you are using so that I can downgrade? Thanks! Error log below:

Elapsed time: 5:44:18
Calculating z scores
Traceback (most recent call last):
  File "/ENVIRONMENT_PATH/bin/strling-outliers.py", line 459, in <module>
    main()
  File "/ENVIRONMENT_PATH/bin/strling-outliers.py", line 340, in main
    z = z_score(sum_str_log_wide, mu_sd_estimates)
  File "/ENVIRONMENT_PATH/bin/strling-outliers.py", line 141, in z_score
    return (x - df['mu'][:,np.newaxis])/df['sd'][:,np.newaxis]
  File "/ENVIRONMENT_PATH/lib/python3.8/site-packages/pandas/core/series.py", line 1033, in __getitem__
    return self._get_with(key)
  File "/ENVIRONMENT_PATH/lib/python3.8/site-packages/pandas/core/series.py", line 1048, in _get_with
    return self._get_values_tuple(key)
  File "/ENVIRONMENT_PATH/lib/python3.8/site-packages/pandas/core/series.py", line 1082, in _get_values_tuple
    disallow_ndim_indexing(result)
  File "/ENVIRONMENT_PATH/lib/python3.8/site-packages/pandas/core/indexers/utils.py", line 343, in disallow_ndim_indexing
    raise ValueError(
ValueError: Multi-dimensional indexing (e.g. obj[:, None]) is no longer supported. Convert to a numpy array before indexing instead.
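For reference, here is a minimal sketch of the change the error message suggests: convert each Series to a NumPy array before adding the new axis. It is not the maintainers' fix. The `z_score` signature and the `mu`/`sd` column names come from the traceback; the toy data and its labels are made up for illustration. It also assumes the rows of `df` line up positionally with the rows of `x`.

```python
import numpy as np
import pandas as pd


def z_score(x, df):
    """Z-score each row of x using the per-row 'mu' and 'sd' columns of df.

    Assumption: the rows of df are in the same order as the rows of x.
    """
    # .to_numpy() yields a 1-D array, which can then be reshaped to a column
    # vector with np.newaxis (a pandas Series no longer accepts [:, np.newaxis]).
    mu = df['mu'].to_numpy()[:, np.newaxis]
    sd = df['sd'].to_numpy()[:, np.newaxis]
    return (x - mu) / sd


# Hypothetical toy data: two loci (rows) by three samples (columns).
x = pd.DataFrame(
    [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]],
    index=['locus1', 'locus2'],
    columns=['s1', 's2', 's3'],
)
estimates = pd.DataFrame(
    {'mu': [2.0, 20.0], 'sd': [1.0, 10.0]},
    index=['locus1', 'locus2'],
)

z = z_score(x, estimates)
print(z)
```

If the two tables share an index, `x.sub(df['mu'], axis=0).div(df['sd'], axis=0)` is another option that aligns rows by label rather than by position.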

gavinmonahan commented 4 months ago

Hi all,

I am having the same issue and would love to know which version of pandas to use. It seems to be a conflict with statsmodels; there are a few workarounds here, including downgrading to pandas 1.5.3, which I have not tried. I also noted that seaborn and matplotlib, which reportedly cause the same issue, are being installed by conda. If they are not needed, could they be removed from the conda package? This issue was also reported on the nf pipeline page here.

Thanks 🙂 Gavin

gavinmonahan commented 4 months ago

Confirming I downgraded pandas from 2.2.0 to 1.5.3 and the error became a warning. Could you please check if this is ok, and if so, update the docker image? Thank you 🙂

ryanmcggg commented 4 months ago

The downgrade to Pandas 1.5.3 worked for me as well. Thanks, Gavin! It would still be a good idea to update the script to keep up with the latest Pandas requirements. Thanks, Ryan

hdashnow commented 4 months ago

Thank you both for catching the source of the issue! I'll work on updating to work with the latest version of Pandas.