alteryx / featuretools

An open source python library for automated feature engineering
https://www.featuretools.com
BSD 3-Clause "New" or "Revised" License

`camel_and_title_to_snake` breaks for camelcase input #2195

Closed: dvreed77 closed this issue 2 years ago

dvreed77 commented 2 years ago

`camel_and_title_to_snake` breaks for the input `a_1`, and for any other input that combines a letter and a number in the same way.

It returns `a__1`, with a doubled underscore, but it should return the input unchanged, in this case `a_1`.

Code Sample, a copy-pastable example to reproduce your bug.

from featuretools.utils.gen_utils import camel_and_title_to_snake

camel_and_title_to_snake("a_1")
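
For illustration, here is a minimal sketch of a conversion that leaves `a_1` untouched. It is not the featuretools implementation; `to_snake_sketch` is a hypothetical name, and the rule that a letter directly followed by a digit gets an underscore (so `AgeUnder65` becomes `age_under_65`) is an assumption inferred from the primitive name quoted later in this thread.

```python
import re


def to_snake_sketch(name: str) -> str:
    """Hypothetical camel/title-case to snake_case converter (not featuretools code)."""
    # Add "_" only when a letter is *directly* followed by a digit.
    # "a_1" is skipped because "_" already separates the letter and the digit.
    name = re.sub(r"([A-Za-z])(\d)", r"\1_\2", name)
    # Split before a capitalized word, unless an underscore is already there.
    name = re.sub(r"([^_])([A-Z][a-z]+)", r"\1_\2", name)
    # Split a lowercase letter or digit from a following capital letter.
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return name.lower()


print(to_snake_sketch("a_1"))         # a_1
print(to_snake_sketch("AgeUnder65"))  # age_under_65
```

With these rules, input that is already snake case passes through unchanged, which is the behavior this issue asks for.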

Output of featuretools.show_info()

2022-07-20 11:01:38,411 featuretools - WARNING While loading primitives via "premium_primitives" entry point, ignored primitive "CountString" from "premium_primitives.count_string" because a primitive with that name already exists in "nlp_primitives.count_string"
Featuretools version: 1.8.0
Featuretools installation directory: /env/lib/python3.8/site-packages/featuretools

SYSTEM INFO
-----------
python: 3.8.5.final.0
python-bits: 64
OS: Darwin
OS-release: 21.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

INSTALLED VERSIONS
------------------
numpy: 1.22.0
pandas: 1.4.1
tqdm: 4.64.0
cloudpickle: 2.0.0
dask: 2022.5.0
distributed: 2022.5.0
psutil: 5.9.0
pip: 22.1.2
setuptools: 62.2.0

tamargrey commented 2 years ago

Might also be worth noting that the DFS error that occurred because of this bug doesn't surface the naming error, which makes it more confusing: `ValueError: Unknown transform primitive age_under_65. Call ft.primitives.list_primitives() to get a list of available primitives.` It might be nice to have that error show the name as generated by `camel_and_title_to_snake`?
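
As a rough sketch of that suggestion, the lookup could report both the requested name and the converted name it actually searched for. Everything below is hypothetical: `resolve_primitive`, the dictionary registry, and `buggy_normalize` (a stand-in that only mimics the doubled-underscore symptom reported above) are illustrative assumptions, not featuretools code.

```python
import re


def buggy_normalize(name):
    # Stand-in that reproduces the reported symptom ("a_1" -> "a__1").
    # Assumption for illustration only; not the library's actual regex.
    return re.sub(r"([^\d]+)(\d+)", r"\1_\2", name).lower()


def resolve_primitive(name, registry, normalize):
    """Look up a primitive by its normalized name (hypothetical helper)."""
    key = normalize(name)
    if key not in registry:
        # Include the converted key so a naming bug is visible in the error.
        raise ValueError(
            f"Unknown transform primitive {name} (looked up as {key!r}). "
            "Call ft.primitives.list_primitives() to get a list of "
            "available primitives."
        )
    return registry[key]


registry = {"age_under_65": "placeholder for the AgeUnder65 primitive"}
try:
    resolve_primitive("age_under_65", registry, buggy_normalize)
except ValueError as err:
    print(err)
```

Here the message would contain `'age_under__65'`, which points straight at the conversion step instead of suggesting that the primitive is missing.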