RabadanLab / arcasHLA

Fast and accurate in silico inference of HLA genotypes from RNA-seq
GNU General Public License v3.0
116 stars, 50 forks

scipy.stats.mode() function update results in failed reference construction #108

Closed (huzuner closed this issue 8 months ago)

huzuner commented 1 year ago

Dear developers,

Thank you for your contribution to the field with your tool.

arcasHLA currently fails to build the reference database because of a change to the stats.mode() function in SciPy. The affected line is https://github.com/RabadanLab/arcasHLA/blob/04aab717b66fdf2dd6d96dd1f5ced36fc29e91a1/scripts/reference.py#L77 since, as of SciPy 1.11.0, the default value of keepdims changed from None to False. stats.mode() then returns a scalar mode instead of a length-1 array, so the second [0] index in that line fails. As a result, arcasHLA reference.py --version 3.32.0 fails to build the reference and throws IndexError: invalid index to scalar variable. With the new SciPy release (1.11.1), passing keepdims=True in the call to stats.mode() makes the error disappear for me, and the reference construction seems to finish successfully.

Here is the link to the change in function call: https://github.com/scipy/scipy/releases/tag/v1.11.0
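To make the proposed change concrete, here is a minimal sketch of the keepdims=True workaround. The function name most_common_length and the sample lengths are hypothetical and only stand in for the code around line 77 of scripts/reference.py. It assumes a SciPy release that accepts the keepdims argument (added in 1.9.0, as far as I know).

```python
import numpy as np
from scipy import stats


def most_common_length(lengths):
    """Return the most frequent value in a 1-D sequence of lengths.

    Hypothetical stand-in for the helper around reference.py line 77.
    """
    # keepdims=True keeps the reduced axis, so result.mode is a length-1
    # array regardless of which default the installed SciPy uses.
    result = stats.mode(np.asarray(lengths), keepdims=True)
    return int(result.mode[0])


# Made-up sample data, for illustration only.
lengths = [270, 270, 273, 270, 181]
print(most_common_length(lengths))
```

With the original expression stats.mode(lengths)[0][0] and the new default keepdims=False, the first [0] already yields a NumPy scalar, which is why the second [0] raises the IndexError.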

arusa commented 1 year ago

I can confirm this. When I downgrade to scipy-1.10.0, I also see SciPy's warning about this change:

$ ./arcasHLA reference --update
arcasHLA/scripts/reference.py:77: FutureWarning: Unlike other reduction functions (e.g. `skew`, `kurtosis`), the default behavior of `mode` typically preserves the axis it acts along. In SciPy 1.11.0, this behavior will change: the default value of `keepdims` will become False, the `axis` over which the statistic is taken will be eliminated, and the value None will no longer be accepted. Set `keepdims` to True or False to avoid this warning.
  return stats.mode(lengths)[0][0]

abuendia commented 8 months ago

Thanks. Fixed in #120 so that the tool is now compatible with the newest scipy.
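For anyone pinned to an older checkout, one way to sidestep the keepdims difference entirely is to compute the mode without stats.mode(). The sketch below is not taken from #120 and is only an illustration. The function name mode_of_lengths is hypothetical, and it assumes a non-empty 1-D input.

```python
import numpy as np


def mode_of_lengths(values):
    """Return the most frequent value of a non-empty 1-D sequence.

    Hypothetical helper; does not depend on the installed SciPy version.
    """
    # np.unique returns the sorted unique values and their counts.
    uniq, counts = np.unique(np.asarray(values), return_counts=True)
    # argmax picks the first maximum, so ties resolve to the smallest value.
    return uniq[np.argmax(counts)].item()


# Made-up sample data, for illustration only.
print(mode_of_lengths([270, 270, 273, 270, 181]))
```

Because np.unique sorts its output, ties resolve to the smallest value, which should match how stats.mode() breaks ties.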