CGATOxford / UMI-tools

Tools for handling Unique Molecular Identifiers in NGS data sets
MIT License

Potential problem with Pandas #615

Closed kvn95ss closed 8 months ago

kvn95ss commented 1 year ago

Hello,

When using the latest version of UMI_tools from conda, I got the following warning messages:

/home/user/anaconda3/envs/umitools/lib/python3.10/site-packages/umi_tools/dedup.py:171: FutureWarning: The provided callable <function median at 0x7fbdd03d67a0> is currently using SeriesGroupBy.median. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "median" instead.
  agg_df = grouped.agg(agg_dict)
/home/user/anaconda3/envs/umitools/lib/python3.10/site-packages/umi_tools/dedup.py:171: FutureWarning: The provided callable <function sum at 0x7fbe106639a0> is currently using SeriesGroupBy.sum. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "sum" instead.
  agg_df = grouped.agg(agg_dict)
/home/user/anaconda3/envs/umitools/lib/python3.10/site-packages/umi_tools/dedup.py:171: FutureWarning: The provided callable <function median at 0x7fbdd03d67a0> is currently using SeriesGroupBy.median. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "median" instead.
  agg_df = grouped.agg(agg_dict)
/home/user/anaconda3/envs/umitools/lib/python3.10/site-packages/umi_tools/dedup.py:171: FutureWarning: The provided callable <function sum at 0x7fbe106639a0> is currently using SeriesGroupBy.sum. In a future version of pandas, the provided callable will be used directly. To keep current behavior pass the string "sum" instead.
  agg_df = grouped.agg(agg_dict)
/home/user/anaconda3/envs/umitools/lib/python3.10/site-packages/umi_tools/dedup.py:448: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value 'Single_UMI' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
  edit_distance_df['edit_distance'][0] = "Single_UMI"

While not exactly an error for now, it might lead to incompatibility with future versions of Pandas, so do have a look at it!
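
For the first four warnings, the message itself suggests the fix: pass the aggregation as a string name rather than as a callable. Here is a minimal sketch of what that might look like. The data frame and column names are made up for illustration and are not taken from dedup.py, and I am assuming the callables in agg_dict are NumPy's median and sum.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative, not from dedup.py.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "count": [1, 3, 2, 4, 6],
    "freq": [10, 20, 30, 40, 50],
})

# A dict such as {"count": np.median, "freq": np.sum} is the kind of input
# that produces the FutureWarning. String names select the same built-in
# groupby reductions, as the warning text recommends.
agg_dict = {"count": "median", "freq": "sum"}
agg_df = df.groupby("group").agg(agg_dict)
print(agg_df)
```

The results should be unchanged, since the warning states that the callable is currently mapped to SeriesGroupBy.median / SeriesGroupBy.sum anyway.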

Thanks, Karthik

IanSudbery commented 1 year ago

What a pain. Thanks for letting us know!
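
For the last warning in the report (dedup.py:448), one possible approach is to cast the column to a dtype that can hold a string before assigning the label, as the message asks. The sketch below uses a made-up stand-in for edit_distance_df; whether an object column is acceptable for the downstream code in dedup.py is an assumption that would need checking.

```python
import pandas as pd

# Hypothetical stand-in for edit_distance_df; the values are illustrative.
edit_distance_df = pd.DataFrame({
    "edit_distance": [0, 1, 2],
    "count": [5, 3, 1],
})

# Cast the int64 column to object first so it can hold a string label.
edit_distance_df["edit_distance"] = (
    edit_distance_df["edit_distance"].astype(object)
)

# Assign with .loc instead of chained indexing (df[col][0] = ...),
# so the write goes to the frame itself rather than a possible copy.
edit_distance_df.loc[0, "edit_distance"] = "Single_UMI"
print(edit_distance_df)
```

Using .loc here is a separate tidy-up from the dtype cast: the original line uses chained indexing, which pandas documents as unreliable for assignment.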