pandas-dev / pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
https://pandas.pydata.org
BSD 3-Clause "New" or "Revised" License

Replace old string formatting syntax with f-strings #29547

Closed ShaharNaveh closed 4 years ago

ShaharNaveh commented 5 years ago

Since we no longer support Python 3.5, we can now use the new f-strings instead of the old .format() (and obviously the % formatting).
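For reference, a minimal sketch of the three styles side by side. The variable names and the message are made up for illustration and are not taken from the pandas code base:

```python
# Hypothetical values, for illustration only.
name = "values"
count = 3

old_percent = "column %s has %d rows" % (name, count)     # printf-style
old_format = "column {} has {} rows".format(name, count)  # str.format
new_fstring = f"column {name} has {count} rows"           # f-string (Python 3.6+)

# All three build the same string.
assert old_percent == old_format == new_fstring
```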

Notes:


To check which files still need to be fixed in the pandas directory:

grep -l -R '%s' --include=*.{py,pyx} pandas/
grep -l -R '%d' --include=*.{py,pyx} pandas/
grep -l -R '\.format(' --include=*.{py,pyx} pandas/

All of the above can also be used as a one-liner:

grep -l -R -e '%s' -e '%d' -e '\.format(' --include=*.{py,pyx} pandas/
Tip:

If you want to see the line number of each occurrence, replace -l with -n, for example:

grep -n -R '%s' --include=*.{py,pyx} pandas/
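Where grep is not available, roughly the same search can be sketched in Python. This is an illustrative stand-in, not a tool used by the project; the function name `find_old_formatting` is hypothetical, and the patterns simply mirror the three grep expressions above:

```python
import re
from pathlib import Path

# Same three patterns as the grep commands: %s, %d and .format(
OLD_STYLE = re.compile(r"%s|%d|\.format\(")


def find_old_formatting(root):
    """Yield (path, line_number, line) for each match in .py/.pyx files under root."""
    for path in sorted(Path(root).rglob("*")):
        if path.suffix not in {".py", ".pyx"} or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if OLD_STYLE.search(line):
                yield path, lineno, line.strip()
```

Calling `find_old_formatting("pandas")` from the repository root and printing each tuple gives output comparable to the `grep -n` form.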

The current list is:


NOTE:

The list may change as files are moved/renamed constantly.


Inherited files and commands from this PR.

ShaharNaveh commented 5 years ago

I'm taking:

yashukla commented 5 years ago

I'll take:

to start, if that's alright!

SaturnFromTitan commented 5 years ago

Hi @MomIsBestFriend Can you recommend any tools for this conversion? A quick look gave me these:

  1. pyupgrade
  2. fstringify
  3. flynt

I have no experience with any of them, but they could be very helpful here.

ShaharNaveh commented 5 years ago

Hello @SaturnFromTitan, I personally use pyupgrade sometimes, but only when the file contains just a few outdated string formats. Then I look at the changes and fix anything pyupgrade got wrong.

When a file has a lot of occurrences, I handle the "complex" ones manually (e.g. '%.2f' % my_float) and let pyupgrade deal with the common ones; it usually gets those right.
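A short sketch of how such "complex" cases translate by hand. The values are hypothetical; the point is that the printf-style specifier moves after the colon inside the braces, and %r becomes the !r conversion:

```python
my_float = 3.14159  # hypothetical value

# Precision: '%.2f' % x  ->  f'{x:.2f}'
assert "%.2f" % my_float == f"{my_float:.2f}" == "3.14"

# repr conversion: '%r' % x  ->  f'{x!r}'
label = "a"
assert "%r" % label == f"{label!r}"

# Field width: '%5d' % n  ->  f'{n:5d}'
n = 42
assert "%5d" % n == f"{n:5d}"
```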

Also, some of the changes will make the changed file non-PEP 8 compliant, so that needs to be fixed as well; otherwise it will not pass the tests.

ShaharNaveh commented 5 years ago

Will take next:

yashukla commented 5 years ago

I'll take:

What are everyone's thoughts on tagging this as a good first issue? It should apply to most of the files here. The changes that need to be made are usually only a few lines or so per file, and whoever is making the changes doesn't need to worry too much about affecting other parts of the code (since the end function performed is the same).

I'm picturing a setup similar to #28926.

lucassa3 commented 4 years ago

f-string replacement placed on:

ref #29701

ShaharNaveh commented 4 years ago

Will take next:

ohad83 commented 4 years ago

I'll take

ref #29781

alimcmaster1 commented 4 years ago

I'll take:

  • [x] pandas/core/reshape/concat.py
  • [x] pandas/core/reshape/melt.py
  • [ ] pandas/core/reshape/merge.py
  • [x] pandas/core/reshape/pivot.py
  • [x] pandas/core/reshape/reshape.py
  • [ ] pandas/core/reshape/tile.py

What are everyone's thoughts on tagging this as a good first issue? It should apply to most of the files here. The changes that need to be made are usually only a few lines or so per file, and whoever is making the changes doesn't need to worry too much about affecting other parts of the code (since the end function performed is the same).

I'm picturing a setup similar to #28926.

Sure, I've labelled accordingly. Thanks.

ShaharNaveh commented 4 years ago

Taking next:

ForTimeBeing commented 4 years ago

Sorry I just noticed that you have asked to specify which files to work on. I just have been using

grep -n -R -e '%s' -e '%d' -e '.format(' --include=*.{py,pyx} pandas/

to find any old formatting. I apologize.

ShaharNaveh commented 4 years ago

Sorry I just noticed that you have asked to specify which files to work on. I just have been using

@ForTimeBeing That's why I edited the post, glad you noticed:)

Can you post what you worked on? Just in case someone searches the comments.

ForTimeBeing commented 4 years ago

Sure, I took:

and under

.format still exists and shows up in the grep search, but there are no literals left to change to f-strings. I'm not sure whether there is another way to do it or whether to keep it as is, but all literals in that file are now swapped to f-strings.

ShaharNaveh commented 4 years ago

.format still exists and shows up in the grep search, but there are no literals left to change to f-strings. I'm not sure whether there is another way to do it or whether to keep it as is, but all literals in that file are now swapped to f-strings.

@ForTimeBeing No problems:) thank you for the PR:)

ganevgv commented 4 years ago

I took

ref #29952

ShaharNaveh commented 4 years ago

I'll take:

ZGrinacoff commented 4 years ago

Working on: 'pandas/core/dtypes/dtypes.py'

AlpAribal commented 4 years ago

Took:

ref: #30116, #30135, #30363

Behemkot commented 4 years ago

I'll take:

I have a problem with predefined strings. I found a solution, but I'm not sure it's the right one. Imagine you have a predefined string like THE_MESSAGE = "Message with arguments. Arg1: {arg1}, Arg2: {arg2}." which is filled in with .format(), like THE_MESSAGE.format(arg1=arg1_str, arg2=arg2_str).

Could I rewrite this using a lambda function, as described below? THE_MESSAGE = lambda arg1, arg2: f"Message with arguments. Arg1: {arg1}, Arg2: {arg2}."

and call it by THE_MESSAGE(arg1_str, arg2_str)?

I know that would work, but I'm not sure if it's the best way to approach this problem :)
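As a sketch of the trade-off (an illustration only, not project guidance): the same idea can be written with a def instead of a lambda bound to a name, which PEP 8 recommends against. The helper name `the_message` is hypothetical, and the template is filled with keyword arguments because its fields are named:

```python
THE_MESSAGE = "Message with arguments. Arg1: {arg1}, Arg2: {arg2}."


def the_message(arg1, arg2):
    """Hypothetical helper: the f-string is evaluated on each call."""
    return f"Message with arguments. Arg1: {arg1}, Arg2: {arg2}."


arg1_str, arg2_str = "foo", "bar"

# Both approaches produce the same text.
from_function = the_message(arg1_str, arg2_str)
from_template = THE_MESSAGE.format(arg1=arg1_str, arg2=arg2_str)
assert from_function == from_template
```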

ShaharNaveh commented 4 years ago

I have a problem with predefined strings.

I completely understand. As PEP 498 explains:

Regular strings are concatenated at compile time, and f-strings are concatenated at run time.

We need to think of a way to remove the use of .format() and use something else as a string template.

The only thing I can think of at the moment is string.Template from stdlib, but I really don't know.
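For comparison, a minimal sketch of what the string.Template route could look like. The message text is the hypothetical one from the question above, rewritten with $-placeholders; whether this is preferable to .format() is left open here:

```python
from string import Template

# $name placeholders instead of {name}; substitution happens on demand.
THE_MESSAGE = Template("Message with arguments. Arg1: $arg1, Arg2: $arg2.")

result = THE_MESSAGE.substitute(arg1="foo", arg2="bar")
```

Note that Template.substitute raises KeyError when a placeholder is missing, while safe_substitute leaves it in place.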

@jbrockmendel Can you help us out?

Behemkot commented 4 years ago

@MomIsBestFriend do you think that a lambda function is overkill for this?

jbrockmendel commented 4 years ago

@MomIsBestFriend I think this may be a case where living with a few .formats is the way to go.

ShaharNaveh commented 4 years ago

@MomIsBestFriend do you think that a lambda function is overkill for this?

I'm nowhere near an expert, please ask one of the developers.

ganevgv commented 4 years ago

took

ref #30120

ganevgv commented 4 years ago

took

ref #30121

ganevgv commented 4 years ago

took

ref #30124

lithomas1 commented 4 years ago

I'll take

Edit: Taking:

katieyounglove commented 4 years ago

I'll take

thanks!

jlamborn324 commented 4 years ago

I'll take:

pandas/tests/plotting/test_converter.py

pandas/tests/plotting/test_datetimelike.py

pandas/tests/plotting/test_series.py

makeajourney commented 4 years ago

I'll take

30273

jlamborn324 commented 4 years ago

Hello, @MomIsBestFriend

pandas/tests/plotting/test_converter.py

pandas/tests/plotting/test_datetimelike.py

pandas/tests/plotting/test_series.py

Have been completed. Thank you.

DorAmram commented 4 years ago

Hello I can take

Thanks

kpmccahill commented 4 years ago

Hello, I'll take

Thanks!

EydenVillanueva commented 4 years ago

I took:

Link to my pr: https://github.com/pandas-dev/pandas/pull/30278

Jcole429 commented 4 years ago

I'll work on:

Here because of the tag "good first issue"

CortlandMorse commented 4 years ago

I'll take:

Thanks!

baevpetr commented 4 years ago

I'll do:

hasnain2808 commented 4 years ago

i would like to take i just now did a pull request on it

JMBurley commented 4 years ago

Great, always wanted to chip in on Pandas. Will update when I know what I can fulfill in the next few weeks...

DorAmram commented 4 years ago

I can take pandas/_version.py

AncientRickles commented 4 years ago

Jumping on:

sardonick commented 4 years ago

I'll take pandas/io/formats/csvs.py

JMBurley commented 4 years ago

I'm taking:

JMBurley commented 4 years ago

Question: Is there an answer on whether old-school string formatting should remain in the API reference?

For example, in series.map():

 It also accepts a function:

        >>> s.map('I am a {}'.format)
        0       I am a cat
        1       I am a dog
        2       I am a nan
        3    I am a rabbit
        dtype: object

Replacing this with an f-string example forces something like:

        >>> s.map(lambda x: f'I am a {x}')
        0       I am a cat
        1       I am a dog
        2       I am a nan
        3    I am a rabbit
        dtype: object

Which is not an exact replacement (an f-string is not a function; I'm using the lambda to make a function that replicates the net effect of 'I am a {}'.format), and it raises some thorny issues about putting not-best-practices in the documentation.

For now, I am treating changes to the documentation as out of scope, pending a community decision on how to handle cases like this.
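The difference can be reproduced without pandas. In this sketch the built-in map stands in for Series.map (an assumption made to keep the example self-contained), and the list of animals is illustrative:

```python
animals = ["cat", "dog", "rabbit"]

# The bound method 'I am a {}'.format is already a callable.
with_format = list(map("I am a {}".format, animals))

# An f-string is an expression, so it needs a wrapping function.
with_fstring = list(map(lambda x: f"I am a {x}", animals))

assert with_format == with_fstring
```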

JMBurley commented 4 years ago

Already done (not sure by whom): pandas/tests/arrays/interval/test_ops.py

I'll also take

ShaharNaveh commented 4 years ago

Question: Is there an answer on whether old-school string formatting should remain in the API reference?

cc @WillAyd @jreback

WillAyd commented 4 years ago

I don't know what you consider the "old-school string format" to be, but .format will have some use cases that f-strings don't cover (namely delayed parametrization), so sure, that will still be around. I don't think we should have Py27 string format syntax anywhere, though.
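A small sketch of what delayed parametrization means in practice. The variable names are hypothetical; an f-string captures the value at the moment it is evaluated, while a .format template is filled in whenever .format is called:

```python
x = 1
eager = f"x is {x}"  # evaluated right here, with x == 1
lazy = "x is {x}"    # plain template, nothing substituted yet

x = 2

# The f-string kept the old value; the template picks up the new one.
filled = lazy.format(x=x)
assert eager == "x is 1"
assert filled == "x is 2"
```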

jlamborn324 commented 4 years ago

I'll take:

thepaullee commented 4 years ago

I'll take