Extraneous if statement expression while refactoring

diceroll123 commented 2 years ago

Issue description or question

def test():
    # suggests the following:
    # Merge duplicate blocks in conditional (merge-duplicate-blocks)
    # Remove redundant conditional (remove-redundant-if)
    blah = 3
    other = True
    yes = []
    no = []

    if blah > 3:
        if other:
            no.append(1)
        else:
            yes.append(1)
    elif blah < 3:
        if other:
            yes.append(1)
        else:
            no.append(1)

    return yes, no

The code above, after refactoring, will produce the following if statement (removed the rest of the code to save space and to focus):

if blah > 3 and other or blah <= 3 and blah < 3 and not other:
    no.append(1)
elif blah > 3 or blah < 3:
    yes.append(1)

I'm not too sure where the or blah <= 3 came from! That threw me off pretty good. To clarify, it doesn't break this case, I caught it just by visuals.

If I'm incorrect in assuming that this is in error, please let me know!

Thanks for readin'.

Sourcery Version

v0.11.4

Code editor or IDE name and version

VSCode

Version: 1.67.0 (user setup) Commit: 57fd6d0195bb9b9d1b49f6da5db789060795de47 Date: 2022-05-04T12:06:02.889Z Electron: 17.4.1 Chromium: 98.0.4758.141 Node.js: 16.13.0 V8: 9.8.177.13-electron.0 OS: Windows_NT x64 10.0.25115

diceroll123 commented 2 years ago

This is a slimmed down version of something that ended up being refactored into like, seven and+or's total in one if statement. This leads me to ask if there should come a point where run-on if statements should be given with parenthesis for legibility purposes.

ruancomelli commented 2 years ago

Hello, @diceroll123! Ah yes, this refactoring also surprises me a bit once in a while :laughing:

What you described is indeed the intended behavior for this refactoring. Sourcery collects all possible paths for us to reach each of the repeated bodies. For instance, there are two ways for us to reach no.append(1):

the first one is straightforward: both blah > 3 and other evaluate to true, hence the blah > 3 and other bit in the final conditional;
but we can also reach the second no.append(1). In order for that to happen, we need that:
1. the first if evaluates to false: blah > 3 is false - hence blah <= 3
2. the elif evaluates to true: blah < 3 is true;
3. other evaluates to false;
In your code:
```
if blah > 3: # <- (i) False
    ...
elif blah < 3: # <- (ii) True
    if other: # <- (iii) False
        ...
    else:
        no.append(1) # <- Yay, we got here!
```

The second chain results in blah <= 3 and blah < 3 and not other. The blah <= 3 bit came from the fact that the first conditional (blah > 3) must be false: not blah > 3 -> blah <= 3.

This leads me to ask if there should come a point where run-on if statements should be given with parenthesis for legibility purposes.

Oh yes, some parentheses would probably help make things clearer, especially for longer chains of conditionals. I added this idea to our roadmap.

Additionally, your example could also be further improved with the following extra refactorings:

blah <= 3 and blah < 3 -> blah < 3
blah > 3 or blah < 3 -> blah != 3

Both were also added to our roadmap :smile:

Please let me know if the explanation above makes sense, and thanks for opening this issue!

sourcery-ai / sourcery