MichaelKim0407 / flake8-use-fstring

MIT License
51 stars 7 forks source link

Add error for forgetting the 'f' #1

Closed plinss closed 4 years ago

plinss commented 4 years ago

When converting code from using .format() to using f-strings, an easy mistake to make is forgetting to add the 'f' to the beginning of the string, which can cause all sorts of stealth bugs.

An error when string literals that are not f-strings include {}'s would be useful.

I understand that this will trigger false positives, but we can either use noqa for those (preferred, I think) or turn the error on when converting legacy code to using f-strings and leave it off otherwise.

MichaelKim0407 commented 4 years ago

Thanks for the suggestion. I can't promise when I can get to it, so if you really want it please feel free to submit a pull request.

I would make it work like this:

  1. Report error if { or } appears in the string but missing f, but don't report {{ or }} (an even number of a symbol consecutively).
  2. Report error if f is used but no { or } appears in the string - this will enforce the developer to be explicit about the intension of formatting, instead of just using f-strings everywhere.
MichaelKim0407 commented 4 years ago

Also for consistency the default setting should not trigger any false positive - which means point (1) should be off. We can probably make two different errors

plinss commented 4 years ago

Hi @MichaelKim0407, thanks for the quick response. I was just sitting down to start working a PR... I've been converting a bunch of code from .format() to f-strings and I know I've missed a few fs already. Caught them during manual cross-checks, but I'm paranoid about the ones I may have missed.

In your point (1), I presume the error should be if { and } appears in a string without an f prefix as a balanced set. Either by themselves should be OK as a literal brace in a string (I think). And unbalanced pairs are illegal in f-strings so we wouldn't want to encourage conversion in that case. I'll also ignore any strings followed by .format or % as those will be handled by the other checks.

I was planning on making this a distinct error and I'm fine with it off by default, I'll add a config option to turn it on. FS003: f-string missing prefix ok with you, or would you prefer another code? And yeah, I'm aware of {{ being a literal { so was already planning on ignoring those.

Your second point is already handled by flake8 natively: F541 f-string is missing placeholders, so I don't see any point in implementing that here.

MichaelKim0407 commented 4 years ago

In your point (1), I presume the error should be if { and } appears in a string without an f prefix as a balanced set. Either by themselves should be OK as a literal brace in a string (I think).

I was just thinking what would be easy to implement, but if you also want to check the balance, that is cool.

I'll also ignore any strings followed by .format or % as those will be handled by the other checks.

I'm not quite sure about this - my gut feeling is that each error should be logically independent so that anyone can be turned on/off without affecting others.

This can be especially problematic since the other two errors have greedy level options, so if we want this new error to respond to how the other two errors are configured, we will end up with a big if-else statement..

Also reporting two errors on one line should be totally fine - in fact flake8 does it this way (I think?), or at least it's how it wants plugins to behave.

But feel free to challenge my idea here.

FS003: f-string missing prefix

I don't have a preference, so please feel free.

F541 f-string is missing placeholders

Yeah I didn't realize this is already covered.

plinss commented 4 years ago

I'll also ignore any strings followed by .format or % as those will be handled by the other checks.

I'm not quite sure about this - my gut feeling is that each error should be logically independent so that anyone can be turned on/off without affecting others.

My thinking is that an existing ''.format() or '' % ... will already be tripping the other checks if they're enabled. It doesn't make sense to me to also say that they're 'f-strings missing a prefix' since they're not f-strings.

In my mind the intention of this check is to find something that was intended to be an f-string, but isn't because the author forgot the f. A ''.format() or '' % doesn't seem to be intended to be an f-string so why complain about the lack of a prefix?. Also, if someone adds a # noqa FS002 to keep their .format(), they shouldn't also have to add FS003 to stop if from telling them to add an f too (which would be an error if they did because it'll evaluate the f-string before passing it to .format() and they may get very different results. I'm trying to catch bugs, not encourage people to create new ones because they missed something subtle and just followed the linter's advice).

I think this actually serves your goal and keeps this check logically independent of the others. To me it's really an either-or situation, I don't see the value of reporting both because only one problem is going to be present at a time. You either have an f-string, or you don't, FS001/FS002 are for when you don't, FS003 is for when you do.

FWIW, I already have this part written, feel free to delete it (or add a switch to turn it off) if I haven't convinced you (and you get a chance to play with it).

This can be especially problematic since the other two errors have greedy level options, so if we want this new error to respond to how the other two errors are configured, we will end up with a big if-else statement..

The check I'm writing only looks at string literals, so it doesn't care about the greedy level of the other checks. Those will always look at string literals at all greedy levels if I'm understanding correctly so I think their greedy level is irrelevant to this check.

(FWIW, I'm not really seeing the value of the other greedy levels, I can't replace the behavior of foo.format(...) using an f-string when foo was passed into my function, if I'm missing something please enlighten me...)

MichaelKim0407 commented 4 years ago

I'm not really seeing the value of the other greedy levels, I can't replace the behavior of foo.format(...) using an f-string when foo was passed into my function, if I'm missing something please enlighten me...

OK I think I realized why we think differently. You are trying to rewrite old code, whereas I'm trying to create a linter that says "don't do this". You are right that such a line can't be easily replaced without changing the behavior, but imagine you are writing a new project you shouldn't want this behavior in the first place.

plinss commented 4 years ago

It's fine for opinions to differ, that's what configuration options are for. There are also a whole set of use cases that neither of us can anticipate.

FWIW, the FS003 check is about adapting code, but both for the 'let's modernize old code' use case as well as the coder who's used to writing '{}'.format(), does it once, and now sees a linter error and has to fix that one line of new code, they can still get it wrong and have a hard to find bug.

As far as the 'don't do this', IMO there are still valid use cases for .format, so long as you're not using a string literal. As an example, a SQL DB connector that takes a query string and a set of arguments. The arguments needs to be escaped in a server-specific way before being passed to .format() on the query string. You don't want to burden the caller with escaping because they'll get it wrong and this will lead to SQL injection attacks. AFAICT there's no way to do that with an f-string (which is why I asked for enlightenment if I'm missing something there, if there's a better way, please teach me).

I totally accept if someone wants to ban .format or % entirely, so I don't have an issue with your greedy levels, I just didn't see a use for them personally and was wondering about their purpose (as an aside). I wasn't challenging the levels or suggesting you remove them, I was just trying to understand the motivation that led to them and trying to learn something new. E.g. in my SQL example, how else would you do it?

I suppose it does make sense to turn on only one of the greedy levels, e.g. you can use .format on variables or function returns, but you can never use % anywhere.

And anecdotally, now that I've had a chance to run FS003 against a significant codebase that I recently switched from .format to f-strings, it did catch an instance my manual triple-checking missed, I had put the 'f' inside the quotes and it didn't stand out in a visual inspection. There were also a significant amount of false-positives, as I expected. The three types of cases where they appeared were: regexes; the aforementioned SQL queries, where the string literal is input to a function where it will eventually be passed to .format after the arguments are escaped; and input to RFC6570 URI Templates, which use a similar syntax for expansions but are not processed by .format().

FWIW, even though flake8 is seeing FS003 as off by default (a -vv run shows it added to the default ignore list and it doesn't show in local testing), it was still outputting the errors on the other codebase. The debug output showed it deciding to override the default ignore and I don't understand why, and couldn't find any config turning it on. So that method of disabling doesn't seem to be entirely reliable... (adding an explicit FS003 to the ignore list did work). I'm wondering if another plugin is doing something...

plinss commented 4 years ago

And anecdotally, now that I've had a chance to run FS003 against a significant codebase that I recently switched from .format to f-strings, it did catch an instance my manual triple-checking missed, I had put the 'f' inside the quotes and it didn't stand out in a visual inspection.

Which to me, the code already served it's purpose, so IMO my effort was well spent. I won't be offended if you decide you don't feel this functionality is appropriate for this plugin, I just thought I'd give back. If you'd rather not have it, I can always release it as a stand-alone plugin for anyone who wants it (just the FS003 check by itself, and I'll change the error code to not step on your plugin of course).

MichaelKim0407 commented 4 years ago

That message was written yesterday but I didn't hit send.. It's an explanation why greedy levels are there, despite you not finding them helpful. Since you are rewriting old code I agree indeed they won't be very helpful, but this plugin is not only for rewriting code.

I fully intend to accept your PR, but as I mentioned I just wanted to do it when I have more time to take a better look.

MichaelKim0407 commented 4 years ago

@plinss released 1.1 on pypi.org