cms-sw / cms-bot

A few scripts to automate approval / testing process
28 stars 244 forks source link

Discussion: Aligning Python Code Formatting with PEP8 vs. Readability Preferences #2311

Closed Benedikttk closed 1 month ago

Benedikttk commented 1 month ago

As previously discussed in Core Software meeting on 30/Jul/2024 (indico link), we would like to address the issue of the Python formatting style. The tool we are using for formatting is Ruff and it enforces PEP8 formatting.

We have seen some pieces of code formatted by the authors for readability in a specific way. For example, in Validation/SiTrackerPhase2V/python/Phase2TrackerValidateDigi_cff.py:

    TOFRMapH = cms.PSet(
        Nxbins = cms.int32(1200),
        xmin   = cms.double(0.),
        xmax   = cms.double(120.),
        Nybins = cms.int32(100),
        ymin   = cms.double(0.),
        ymax   = cms.double(50.),
        switch = cms.bool(False)
    )

In this case, Ruff reports that all those white spaces do not comply with PEP8 and they will be automatically changed to:

    TOFRMapH = cms.PSet(
        Nxbins=cms.int32(1200),
        xmin=cms.double(0.),
        xmax=cms.double(120.),
        Nybins=cms.int32(100),
        ymin=cms.double(0.),
        ymax=cms.double(50.),
        switch=cms.bool(False)
    )

We can find another example of readability vs formatting in GeometryReaders/XMLIdealGeometryESSource/test/testReadXMLFromDB.py, where the author aligned the input parameters as follows:

process.MessageLogger = cms.Service("MessageLogger",
cerr = cms.untracked.PSet(
    enable = cms.untracked.bool(False)
),
debugModules = cms.untracked.vstring('*'),
files = cms.untracked.PSet(
    readDBdebug = cms.untracked.PSet(
        DEBUG = cms.untracked.PSet(
            limit = cms.untracked.int32(-1)
        ),
        INFO = cms.untracked.PSet(
            limit = cms.untracked.int32(-1)
        ),
        extension = cms.untracked.string('.out'),
        noLineBreaks = cms.untracked.bool(True),
        threshold = cms.untracked.string('DEBUG')
    ),
    readDBerrors = cms.untracked.PSet(
        extension = cms.untracked.string('.out'),
        threshold = cms.untracked.string('ERROR')
    )
)

This will be reformated to:

process.MessageLogger = cms.Service(
    "MessageLogger",
    cerr=cms.untracked.PSet(enable=cms.untracked.bool(False)),
    debugModules=cms.untracked.vstring("*"),
    files=cms.untracked.PSet(
        readDBdebug=cms.untracked.PSet(
            DEBUG=cms.untracked.PSet(limit=cms.untracked.int32(-1)),
            INFO=cms.untracked.PSet(limit=cms.untracked.int32(-1)),
            extension=cms.untracked.string(".out"),
            noLineBreaks=cms.untracked.bool(True),
            threshold=cms.untracked.string("DEBUG"),
        ),
        readDBerrors=cms.untracked.PSet(
            extension=cms.untracked.string(".out"),
            threshold=cms.untracked.string("ERROR"),
        ),
     ),

Some people prefer to maintain readability in their code, which may not always align with PEP8. Currently, it is not possible to selectively retain or discard specific formatting elements to accommodate individual coding styles. So we can either format all white spaces according to PEP8 or none.

We are opening this issue seek your feedback on this. I am open to suggestions to progressively format the Python code in CMSSW without disturbing the contributors.

Thank you

cmsbuild commented 1 month ago

cms-bot internal usage

cmsbuild commented 1 month ago

A new Issue was created by @Benedikttk.

@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

aandvalenzuela commented 1 month ago

See discussion at https://github.com/cms-sw/cmssw/issues/45706

Dr15Jones commented 1 month ago

So as food for thought, the very beginning of PEP 8 says when NOT to follow its convention

Some other good reasons to ignore a particular guideline:

1. When applying the guideline would make the code less readable, even for someone who is used to reading code that follows this PEP.
2. To be consistent with surrounding code that also breaks it (maybe for historic reasons) – although this is also an opportunity to clean up someone else’s mess (in true XP style).
3. Because the code in question predates the introduction of the guideline and there is no other reason to be modifying that code.
4. When the code needs to remain compatible with older versions of Python that don’t support the feature recommended by the style guide.
Dr15Jones commented 1 month ago

So looking at examples in the PEP, it looks like the formatter is not actually obeying the rules. In the examples in the PEP it has

# Correct:
x = 1
y = 2
long_variable = 3

and explicitly says

Always surround these binary operators with a single space on either side: assignment (=), augmented assignment (+=, -= etc.), comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not), Booleans (and, or, not).

which the examples you gave from this formatter does not do.

smuzaffar commented 1 month ago

@Dr15Jones , please use https://github.com/cms-sw/cmssw/issues/45706 for more visibility