cytomining / pycytominer

Python package for processing image-based profiling data
https://pycytominer.readthedocs.io
BSD 3-Clause "New" or "Revised" License
78 stars, 35 forks

refactor(pandas): enable copy_on_write for Pandas #401

Closed: d33bs closed this 6 months ago

d33bs commented 6 months ago

Description

This PR adds a configuration module that enables the Pandas copy_on_write setting, which is slated to become the default behavior in Pandas >= 3.0.0. Besides proactively avoiding issues when upgrading Pandas, enabling it early may also bring performance benefits (per the Pandas documentation on copy_on_write). The changes are based on the failing tests noted in #366.
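For readers unfamiliar with the setting, here is a minimal sketch of what opting in to copy_on_write looks like and how it changes behavior. It is not the configuration module added in this PR; the helper name enable_copy_on_write is hypothetical, and the sketch assumes a Pandas version that exposes the mode.copy_on_write option (1.5 or later).

```python
import warnings

import pandas as pd


def enable_copy_on_write() -> None:
    """Hypothetical helper: opt in to Copy-on-Write where the option exists."""
    try:
        with warnings.catch_warnings():
            # Newer Pandas releases may warn that the option is redundant.
            warnings.simplefilter("ignore")
            pd.set_option("mode.copy_on_write", True)
    except KeyError:
        # pandas raises OptionError (a KeyError subclass) if the option is unknown.
        pass


enable_copy_on_write()

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Under Copy-on-Write, a selected column behaves like an independent copy:
# writing to it never propagates back to the parent DataFrame.
subset = df["a"]
subset.iloc[0] = 100

print(df.loc[0, "a"])   # parent value is unchanged
print(subset.iloc[0])   # only the derived Series was modified
```

Code that relied on modifying a parent DataFrame through a derived object stops working under this mode, which is the kind of behavior change that can surface as test failures.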

After this change, the tests emit new warnings of the form "PerformanceWarning: DataFrame is highly fragmented." These may be worth investigating as part of this issue or in a new one.
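As a starting point for that investigation: Pandas typically raises this warning when many columns are inserted into a DataFrame one at a time, and the warning text itself suggests joining all columns at once with pd.concat(axis=1). The sketch below illustrates that pattern under stated assumptions; the column names and data are made up, and it does not identify which pycytominer code path triggers the warning.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical metadata frame standing in for a profile table.
base = pd.DataFrame({"Metadata_well": ["A01", "A02", "A03"]})

# Collect the new columns first instead of assigning them one by one
# with `base[name] = values`, which fragments the frame's internal blocks.
new_columns = {
    f"feature_{i}": rng.normal(size=len(base)) for i in range(200)
}

# Build one frame from the collected columns and join it in a single step.
features = pd.DataFrame(new_columns, index=base.index)
combined = pd.concat([base, features], axis=1)

print(combined.shape)
```

If a frame is already fragmented, calling .copy() on it returns a consolidated version, which is the other remedy named in the warning message.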

Closes #366

d33bs commented 6 months ago

Thanks @kenibrewer, merging this in!