Hello.
I have found a performance regression in the read_csv function in pandas versions below 2.0.1. Parts of this repository depend on pandas 2.0.0 (see featuretools/tests/requirement_files/minimum_core_requirements.txt), and some other dependencies require pandas below 2.0.1. I am not sure whether this pandas performance problem affects this repository. Related discussions on the pandas GitHub include #52546 and #52548.
I also found that featuretools/primitives/standard/transform/email/is_free_email_domain.py and featuretools/tests/computational_backend/test_calculate_feature_matrix.py use the affected API. There may be more files that use it.
Suggestion
I would recommend upgrading to pandas >= 2.0.1, or exploring other ways to improve the performance of read_csv.
Any other workarounds or solutions would be greatly appreciated.
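To check quickly whether an environment falls in the range described above, something like the following sketch could help. It assumes, as this report states, that releases below 2.0.1 are affected. The helper names (parse_release, is_below_fix, installed_pandas_version) are hypothetical and not part of pandas or featuretools, and pre-release suffixes such as "rc1" are simply ignored.

```python
from importlib import metadata

# Assumption taken from this report: the slowdown is resolved from 2.0.1 onward.
FIXED_IN = (2, 0, 1)


def parse_release(version: str) -> tuple:
    """Return the first three numeric components of a version string.

    Non-digit suffixes (for example "rc1") are dropped, and missing
    components are filled with zeros, so "2.0" becomes (2, 0, 0).
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_below_fix(version: str) -> bool:
    """True if the version is lower than the release assumed to contain the fix."""
    return parse_release(version) < FIXED_IN


def installed_pandas_version():
    """Return the installed pandas version string, or None if pandas is absent."""
    try:
        return metadata.version("pandas")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pandas_version()
    if found is None:
        print("pandas is not installed in this environment")
    elif is_below_fix(found):
        print(f"pandas {found} is below 2.0.1 and may be affected")
    else:
        print(f"pandas {found} is at or above 2.0.1")
```

This only reads the installed version. It does not measure read_csv itself, so a timing comparison on a representative CSV file would still be needed to confirm any actual impact.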
Thank you!