pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

Replace old string formatting syntax with f-strings #29547

Closed ShaharNaveh closed 4 years ago

ShaharNaveh commented 5 years ago

Since we no longer support Python 3.5, we can now use f-strings instead of the old .format() calls (and, obviously, the % formatting).
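As a minimal illustration of the conversion this issue asks for, the sketch below builds the same message with all three styles. The variable names and the message text are made up for the example and do not come from the pandas code base.

```python
name = "pandas"
count = 3

# Old printf-style formatting with the % operator.
percent_style = "%s has %d open issues" % (name, count)

# str.format() with positional placeholders.
format_style = "{} has {} open issues".format(name, count)

# f-string (PEP 498), available from Python 3.6 onward.
fstring_style = f"{name} has {count} open issues"

print(fstring_style)
```

All three expressions produce the same string; the f-string simply keeps the values next to the text they are inserted into.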

Notes:


To check which files still need to be fixed in the pandas directory:

grep -l -R '%s' --include=*.{py,pyx} pandas/
grep -l -R '%d' --include=*.{py,pyx} pandas/
grep -l -R '\.format(' --include=*.{py,pyx} pandas/

All of the above can also be combined into a one-liner:

grep -l -R -e '%s' -e '%d' -e '\.format(' --include=*.{py,pyx} pandas/
Tip:

If you want to see the line number of each occurrence, replace -l with -n, for example:

grep -n -R '%s' --include=*.{py,pyx} pandas/
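For anyone without grep available, the search above can be approximated in Python. This is only a sketch: the function name `files_with_old_formatting` is hypothetical, and the regular expression simply mirrors the three grep patterns ('%s', '%d' and '.format(').

```python
import re
from pathlib import Path

# Same three patterns as the grep commands: '%s', '%d' and '.format('.
OLD_STYLE = re.compile(r"%s|%d|\.format\(")


def files_with_old_formatting(root):
    """Return sorted paths of .py/.pyx files under root that match OLD_STYLE."""
    matches = []
    for path in Path(root).rglob("*"):
        # Only look at regular files with the two extensions grep includes.
        if path.suffix not in {".py", ".pyx"} or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        if OLD_STYLE.search(text):
            matches.append(str(path))
    return sorted(matches)
```

Calling `files_with_old_formatting("pandas")` from the repository root should list roughly the same files as the grep one-liner. As with grep, a match is only a candidate: '%d', for example, also appears in strftime format strings, which are not string formatting to be replaced.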

The current list is:


NOTE:

The list may change as files are moved/renamed constantly.


Inherited files and commands from this PR.

bharatr21 commented 4 years ago

In #30601, I've taken on

rbharadwaj9 commented 4 years ago

I can take the two files below to start off with if that's fine!

These changes have been done in #30604

thepaullee commented 4 years ago

Taking on:

thepaullee commented 4 years ago

Is there a preferred way to note, in PRs, intentional uses of .format that are left as is? This is for the case where someone (incorrectly) assumes something is a good use case and it's overlooked in the PR.

Sangarshanan commented 4 years ago

Working on

AlfredoGJ commented 4 years ago

I'll take

ssikdar1 commented 4 years ago

I would like to try and take:

HH-MWB commented 4 years ago

I would like to take:

datapythonista commented 4 years ago

Thanks @HH-MWB

dinasv commented 4 years ago

I will take:

HH-MWB commented 4 years ago

I found that a lot of code in @Appender() uses % to format strings. That code uses _shared_docs as a template, which is mostly defined in /pandas/core/generic.py and is used across multiple files.

I would like to replace all the _shared_docs-related formatting. This change will need to modify a lot of files, but I won't be able to check all the other string formatting syntax in those files.

Does that sound good? Should I do it? @datapythonista @MomIsBestFriend

ShaharNaveh commented 4 years ago

@HH-MWB I don't really have a say on this. I think jreback, WillAyd and datapythonista (not tagging them because I don't want to bother them) can help you out more than I can :)

WillAyd commented 4 years ago

What are you trying to replace with Appender? I don’t think those can be f-strings.

I would be OK with .format replacing the Py27 syntax, but it's probably worth opening a separate issue to discuss.

ShaharNaveh commented 4 years ago

@WillAyd I do think that string.Template from the stdlib is the right method for this. Any thoughts?

HH-MWB commented 4 years ago

Hi @WillAyd, sorry I didn't make it clear. Yes, my original idea was replacing % with .format, and also replacing code like %(XXX)s with {XXX} in the _shared_docs template. As @MomIsBestFriend said, string.Template would be another choice.

I opened a separate issue for more discussion. Thanks!
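The three template options mentioned in this exchange can be compared side by side. The sketch below uses a made-up docstring template and made-up keys (`klass`, `axis`); it is not the actual content of `_shared_docs`, only an illustration of the placeholder syntaxes.

```python
from string import Template

# Hypothetical shared docstring template, written three ways.
percent_doc = "Return the %(klass)s sorted along %(axis)s."
format_doc = "Return the {klass} sorted along {axis}."
template_doc = Template("Return the $klass sorted along $axis.")

values = {"klass": "DataFrame", "axis": "index"}

# Each style fills the same template with the same mapping.
from_percent = percent_doc % values
from_format = format_doc.format(**values)
from_template = template_doc.substitute(values)

print(from_format)
```

One practical difference is escaping: a literal `%` must be written `%%` in the first style, literal braces must be doubled as `{{` and `}}` in the second, and a literal `$` must be written `$$` in the third. Docstrings that contain dictionary examples would therefore need extra care when moving to `{XXX}` placeholders.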

vandana-iyer commented 4 years ago

Taking on:

ref #31412

drewseibert commented 4 years ago

@MomIsBestFriend Hello! First time open-source contributor here! I am very excited for my first PR! I will try working on the following files: versioneer.py web/pandas_web.py Thanks!!

jbrockmendel commented 4 years ago

@drewseibert versioneer.py is vendored, so we don't want to edit it. @MomIsBestFriend can you remove it from the list to avoid this confusion?

drewseibert commented 4 years ago

@jbrockmendel Thanks for the heads up. Along with that, it appears the other file I mentioned "web/pandas_web.py" has been worked on already as well. Both can be removed from the work list.

drewseibert commented 4 years ago

Also, I am getting a 403 permission error when I try to push a commit. I added an SSH key and tried setting the remote URL. It is not working for me whether I clone with SSH or HTTPS. Any help is appreciated! Thanks!

ShaharNaveh commented 4 years ago

Good luck @drewseibert

abbiepopa commented 4 years ago

I'll take:

pandas/core/reshape/concat.py
pandas/core/reshape/melt.py
pandas/core/reshape/merge.py
pandas/core/reshape/pivot.py
pandas/core/reshape/reshape.py

drewseibert commented 4 years ago

Working on this one now:

pandas/tests/io/test_pickle.py

leandermaben commented 4 years ago

Hi I'll take

pandas/util/_print_versions.py

pandas/util/_test_decorators.py

drewseibert commented 4 years ago

https://github.com/pandas-dev/pandas/pull/31628 should be okay :)

thomasjpfan commented 4 years ago

I'll take pandas/tests/frame/test_repr_info.py

ref: https://github.com/pandas-dev/pandas/pull/31639

MarcoNasc commented 4 years ago

Hey, I'll take

drewseibert commented 4 years ago

The following files can be checked off on the list... web/pandas_web.py pandas/tests/io/test_pickle.py

Thanks!

drewseibert commented 4 years ago

Another one to check off.. no f-strings needed in the file: pandas/tests/series/indexing/test_boolean.py

drewseibert commented 4 years ago

I will work on this one now...

pandas/tests/series/indexing/test_indexing.py

ShaharNaveh commented 4 years ago

Thank you @drewseibert

3vts commented 4 years ago

Is there still work remaining here? I want to contribute

ShaharNaveh commented 4 years ago

@3vts Yes, of course :)

I think you can take

pandas/tests/util/test_assert_extension_array_equal.py


LMK if you want more.

TBERB commented 4 years ago

Something I can help with! It will be my first open source contribution, so I may need some help. I have read some how to contribute articles, but still.

Which ones would you like me to handle?

3vts commented 4 years ago

@MomIsBestFriend seems like pandas/tests/util/test_assert_extension_array_equal.py was fixed on PR #30816, also, I already have the environment set up. Can you give me some load to work with?

alimcmaster1 commented 4 years ago

@3vts @GrizzledLabs - feel free to take any of the files that haven’t been done yet in the list above (and check no one else is working on it) - then comment on here what you are working on! Thanks !

TBERB commented 4 years ago

pandas/core/arrays/boolean.py appears to already be done. I saw one f-string and no .format(), unless I missed it.

pandas/core/dtypes/common.py appears to be done already as well: f-strings but no .format().

Are some of these fixes spanning multiple files? A few don't contain a single .format(), and I am wondering if there are functions called between files? Would a single fix require changing multiple files?

Roman-Ka commented 4 years ago

Hi, first-time contributor here! Excited to get going! Initially I wanted to take these:

pandas/compat/pickle_compat.py pandas/_config/config.py

but then I saw it's been done and merged, @MomIsBestFriend can you please update the list at the top to tick them as done?

I will take these:

3vts commented 4 years ago

@MomIsBestFriend reviewing the thread, I found there is an exception for the predefined strings. Does this still apply? Or do we have a workaround now?

I have a problem with predefined strings.

I completely understand. As PEP 498 explains:

Regular strings are concatenated at compile time, and f-strings are concatenated at run time.

We need to think of a way to remove the use of .format() and use something else as a string template.

The only thing I can think of at the moment is string.Template from stdlib, but I really don't know.

@jbrockmendel Can you help us out?
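The "predefined strings" exception can be illustrated with a short sketch. The constant `MSG`, the function `check_columns` and the name `undefined_expected` are invented for this example. The point is that an f-string is evaluated where it is written, so it cannot be stored first and filled in later, whereas a plain string with `{}` placeholders can.

```python
# A message defined once, before the values are known ("predefined").
MSG = "Expected {expected} columns, got {actual}"


def check_columns(expected, actual):
    """Return an error message built from the predefined template."""
    return MSG.format(expected=expected, actual=actual)


# An f-string is evaluated immediately, so every name it uses must
# already exist at the point where the literal appears.
try:
    eager = f"Expected {undefined_expected} columns"
except NameError as err:
    eager_error = str(err)

print(check_columns(3, 2))
```

Under that assumption, templates that are defined in one place and filled in elsewhere are reasonable candidates to keep `.format()` (or another deferred mechanism) rather than being converted to f-strings.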

monicaw218 commented 4 years ago

Hi there, first-time contributor 👋 . I'll take:

datapythonista commented 4 years ago

@monicaw218, I'd start with just one file, and once your pull request is merged you can continue with the rest. The first contribution is usually trickier than expected, and for us (reviewers) it usually helps too if pull requests are small, especially for new contributors, where more feedback may be needed.

drewseibert commented 4 years ago

These two files can be checked off the list: 👍 pandas/io/parsers.py pandas/io/pytables.py

drewseibert commented 4 years ago

These are also good to go:

pandas/tests/groupby/test_apply.py pandas/tests/groupby/test_bin_groupby.py

3vts commented 4 years ago

These are ready on #31914

"pandas/tests/extension/decimal/test_decimal.py"
"pandas/tests/frame/indexing/test_categorical.py"
"pandas/tests/frame/methods/test_describe.py"
"pandas/tests/frame/methods/test_duplicated.py"
"pandas/tests/frame/methods/test_to_dict.py"
"pandas/tests/frame/test_alter_axes.py"
"pandas/tests/frame/test_api.py"
"pandas/tests/frame/test_constructors.py"
"pandas/tests/frame/test_dtypes.py"
"pandas/tests/frame/test_join.py"

alysbrooks commented 4 years ago

I'll take "pandas/io/sas/sas_xport.py".

3vts commented 4 years ago

These are ready on #31933

"pandas/tests/frame/test_operators.py"
"pandas/tests/frame/test_reshape.py"
"pandas/tests/frame/test_timeseries.py"
"pandas/tests/indexes/datetimes/test_scalar_compat.py"
"pandas/tests/indexes/datetimes/test_tools.py"
"pandas/tests/indexes/interval/test_indexing.py"
"pandas/tests/indexes/interval/test_interval.py"

3vts commented 4 years ago

These are included in #31945

"pandas/tests/indexes/interval/test_setops.py"
"pandas/tests/indexes/multi/test_compat.py"
"pandas/tests/indexes/period/test_constructors.py"
"pandas/tests/indexes/timedeltas/test_constructors.py"
"pandas/tests/indexing/test_floats.py"

3vts commented 4 years ago

These are included in #31963

"pandas/tests/internals/test_internals.py"
"pandas/tests/io/excel/test_readers.py"
"pandas/tests/io/excel/test_style.py"
"pandas/tests/io/excel/test_writers.py"
"pandas/tests/io/excel/test_xlrd.py"
"pandas/tests/io/formats/test_console.py"
"pandas/tests/io/formats/test_to_html.py"
"pandas/tests/io/formats/test_to_latex.py"
"pandas/tests/io/generate_legacy_storage_files.py"

3vts commented 4 years ago

These are included in #31967

"pandas/tests/io/parser/test_c_parser_only.py"
"pandas/tests/io/parser/test_common.py"
"pandas/tests/io/parser/test_compression.py"
"pandas/tests/io/parser/test_encoding.py"
"pandas/tests/io/parser/test_multi_thread.py"
"pandas/tests/io/parser/test_na_values.py"
"pandas/tests/io/parser/test_parse_dates.py"
"pandas/tests/io/parser/test_read_fwf.py"
"pandas/tests/io/pytables/conftest.py"
"pandas/tests/io/pytables/test_store.py"

3vts commented 4 years ago

These are included in #31980

"pandas/tests/io/pytables/test_timezones.py"
"pandas/tests/io/test_html.py"
"pandas/tests/io/test_stata.py"
"pandas/tests/resample/test_period_index.py"
"pandas/tests/reshape/merge/test_join.py"
"pandas/tests/reshape/merge/test_merge.py"
"pandas/tests/reshape/merge/test_merge_asof.py"
"pandas/tests/reshape/test_melt.py"
"pandas/tests/reshape/test_pivot.py"
"pandas/tests/scalar/timedelta/test_constructors.py"