civisanalytics / civis-python

Civis API Python Client
BSD 3-Clause "New" or "Revised" License

Make use_pandas=True the default #362

Closed stephen-hoover closed 5 months ago

stephen-hoover commented 4 years ago

Several of this library's convenience functions (e.g. civis.io.read_civis) have the option to return their results as either a list or a pandas DataFrame. Users control the return type by passing a boolean to the use_pandas keyword argument, which defaults to False.

History: We originally set this parameter to default to False because of a strong desire to require as few dependencies as possible. Users aren't required to install pandas, and users who choose not to install it should still have an error-free experience using the Civis API client.

Proposal: Change the default of use_pandas to True for all functions where it exists. In most cases, a DataFrame will be a more useful return type, and the default should reflect typical usage.

To maintain an error-free experience, we could check at the beginning of affected functions whether or not pandas is installed and force use_pandas=False where pandas is not present. (Possibly with a warning message.)
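The fallback described above could look roughly like the sketch below. It is only an illustration: `pandas_is_installed` and `resolve_use_pandas` are hypothetical helper names, not existing civis-python functions, and the `installed` argument exists only so the behavior can be exercised without changing the environment.

```python
import importlib.util
import warnings


def pandas_is_installed():
    """Return True if pandas can be found in the current environment."""
    return importlib.util.find_spec("pandas") is not None


def resolve_use_pandas(use_pandas=True, installed=None):
    """Hypothetical helper: force use_pandas=False when pandas is missing.

    `installed` overrides the environment check (an assumption made
    for this sketch, mainly useful for testing).
    """
    if installed is None:
        installed = pandas_is_installed()
    if use_pandas and not installed:
        # Warn instead of raising, so users without pandas still get a result.
        warnings.warn(
            "pandas is not installed; returning a list instead of a DataFrame.",
            RuntimeWarning,
            stacklevel=2,
        )
        return False
    return bool(use_pandas)
```

Each affected function would call something like `use_pandas = resolve_use_pandas(use_pandas)` as its first step, so an explicit `use_pandas=False` is always respected and the default of `True` degrades to a list with a warning.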

This would change the API of several functions, so it would need to be part of a v2 release.

jacksonlee-civis commented 5 months ago

I think it's time to close this ticket, as we likely won't make use_pandas=True the default at this point. The world has evolved quite a bit since this ticket was written. Among dataframe packages in Python, pandas used to look like the de facto choice, but other options have recently gained popularity, particularly polars. It'd therefore be more reasonable to keep civis-python from being too tightly coupled to pandas, especially if/when at some future point we consider deprecating use_pandas and implementing a more general return_type parameter that accepts one of {list, pandas, polars, ...}.
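For illustration only, a return_type dispatch might be sketched as follows. Nothing here is an existing or planned civis-python API: `convert_rows` is a hypothetical name, and the sketch assumes the raw result is a list of rows whose first row is the header.

```python
def convert_rows(rows, return_type="list"):
    """Hypothetical sketch: convert header-first rows to the requested type."""
    if return_type == "list":
        return rows
    if return_type == "pandas":
        # Imported lazily so pandas stays an optional dependency.
        import pandas as pd
        return pd.DataFrame(rows[1:], columns=rows[0])
    if return_type == "polars":
        # Imported lazily so polars stays an optional dependency.
        import polars as pl
        return pl.DataFrame(rows[1:], schema=rows[0], orient="row")
    raise ValueError(f"Unsupported return_type: {return_type!r}")
```

Because each dataframe library is imported only inside its own branch, users who ask for a plain list never need pandas or polars installed.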