hpc / charliecloud

Lightweight user-defined software stacks for high-performance computing.
https://hpc.github.io/charliecloud
Apache License 2.0
308 stars 61 forks source link

add `pylint` to test suite #1860

Open lucaudill opened 4 months ago

lucaudill commented 4 months ago

Since I started using VSCode, I've noticed that it's easy to add unused imports to our Python code without noticing. Adding a Python linter (e.g. pylint) to our test suite could help catch those instances and other potentially bad coding practices.

One thing worth noting is that pylint really doesn't like our code, e.g.

$ pylint --indent-string="   " lib/*py | wc -l
1656

Here our code is given a rating of 5.38/10 ☹️

pylint messages have a naming convention based on the following categories (seemingly in descending order of severity): Fatal, Error, Warning, Convention, Refactor, Information. The breakdown of the messages we get is as follows:

So the vast majority of complaints pylint gives us fall under the "Convention" category. Some common messages that show up for us are:

This is just a sample, there are obviously many more. I think step 1 for whoever takes this on is to go through the list of messages and figure out what we want to ignore and what we don't.

lucaudill commented 4 months ago

Here's a text file with the full output of

$ pylint --indent-string="   " lib/*py

pylint.txt

reidpr commented 4 months ago

I propose we do the following.

General comments / notes

Dispositions (what to do)

I went through the 1,641 messages in the text file above and ended up boiling it down to 75 groups with the same disposition. (Note: I would call these “warnings”, but Pylint calls them “messages”, with “warning” being a specific category of message.) These dispositions fell into the six classes below.

I’d like the initial PR for this to yield no messages. We’ll do that by fixing the easy stuff in that PR and deferring the rest.

Now

Each of these is described in an individual comment below. These should all be fixed in this issue’s PR.

  1. Ignore globally forever. I just disagree with Pylint, so let’s turn off the warning for the entire project. This configuration happens in the same PR that implements Pylint.

  2. Fix now. Real problems that should be fixed and are important/easy enough to do in the same PR that implements Pylint.

  3. Case-by-case ignore or fix now. Each of the individual messages needs to be examined and either ignored (with a Pylint exclusion comment) or fixed in the original Pylint PR.

Later

Each of these has its own issue, linked below.

  1. Enforce the opposite. I actively want the opposite of Pylint, and because Pylint is extensible, it can enforce that. I suspect that for many of these, we can use the existing inverted checker for reference.

  2. Fix later. Real problems where we should do something, but it’s hard enough that a new PR is needed.

  3. Unclear. I don’t know what to do.

I wonder if a good approach is to disable all the messages we get (I have a spreadsheet that can give a list), e.g. with --disable X001,X002,... or configuration, then one-by-one re-enable the messages below in this issue. After that the disabled list should be exactly what we decided to defer.

reidpr commented 3 months ago

Class names (C0103 invalid-name)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L202

We just use a different naming convention. Pylint doesn’t have a pre-configured style name but it should be easy to configure via regex.

reidpr commented 3 months ago

Log function names (C0103 invalid-name)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L467-L469

These functions just have special names. Probably just add ignore comments to each def.

reidpr commented 3 months ago

Bad None comparison (C0121 singleton-comparison)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L785

Should be is None.

reidpr commented 3 months ago

Bad type check? (C0123 unidiomatic-typecheck)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/registry.py#L88-L89

I think this is OK because this class has a weird notion of equality, but needs a little more though.

reidpr commented 3 months ago

Bad range() (C0200 consider-using-enumerate)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L799-L801

These can be easily refactored to not use indexes.

reidpr commented 3 months ago

Non-idiomatic class argument (C0202 bad-classmethod-argument)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L287-L288

We use class_ but it’s easy to just change to the idiomatic cls.

reidpr commented 3 months ago

Bad __slots__ (C0205 single-string-used-for-slots)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L611

These should all be fixed.

reidpr commented 3 months ago

Unnecessary splitting (C0207 use-maxsplit-arg)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L411

This appears to be a performance thing and we don’t use this in places where it matters. I think what we have is more readable, so let’s global ignore.

reidpr commented 3 months ago

Modules too long (C0302 too-many-lines)

This probably right, but I don’t see this changing in the foreseeable future and I don’t think Pylint is the right place to enforce it. If the warnings are all solved, then the most likely way it would re-appear is that a small change pushes a module over the threshold, which is the wrong time to demand a refactor.

Global ignore.

reidpr commented 3 months ago

Multiple statements per line (C0321 multiple-statements)

I don’t mind this and we use it a lot:

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L50

but:

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L633-L635

or:

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L839-L842

seem confusion-prone. Let’s just stop using multiple statements per line.

reidpr commented 3 months ago

Bad import position (C0413 wrong-import-position)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L26-L33

We have a good reason for this, but probably better to disable on a case-by-case basis.

reidpr commented 3 months ago

Inappropriate dunder call (C2801 unnecessary-dunder-call)

https://github.com/hpc/charliecloud/blob/5ceefa7040c0accb91acf8b2c99e0f7ec6a97429/lib/filesystem.py#L393

False positive; we’re implementing os.PathLike. Suppress case-by-case.

reidpr commented 3 months ago

Bad except order (E701 bad-except-order)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L751-L759

This is a real bug that should be fixed.

reidpr commented 3 months ago

Nonexistent instance member access in external classes (E1101 no-member)

This happens twice:

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L384-L385

Genuinely wrong and will crash the error message.

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L737

This one seems like if it’s not present it shouldn't work at all? Investigate.

reidpr commented 3 months ago

Non-iterable iterated (E1133 not-an-iterable)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/registry.py#L118

The specific error is a false positive, but it’s a property that need @abc.abstractmethod I think.

reidpr commented 3 months ago

Bad % (E1305 too-many-format-args)

https://github.com/hpc/charliecloud/blob/5ceefa7040c0accb91acf8b2c99e0f7ec6a97429/lib/filesystem.py#L763

Yep, that's wrong.

reidpr commented 3 months ago

Merge isinstance() (R1701 consider-merging-isintance)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L77-L79

Should probably fix.

reidpr commented 3 months ago

FP complaints about return on ch.FATAL (R1710 inconsistent-return-statements)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L648-L653

This is a FP because ch.FATAL doesn’t return (it goes into an exception procedure that kills the program). Can we teach Pylint that it doesn’t return?

reidpr commented 3 months ago

Useless return complaints (R1711 useless-return)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L1387-L1389

The return is indeed useless but in this particular case I like it there because related classes’ commit() does return something meaningful, which also might be None. Perhaps looks at these and ignore on a case-by-case basis?

reidpr commented 3 months ago

Built-in exit() re-defined (R1722 consider-using-sys-exit, W0622 redefined-builtin)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L638-L639

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L661-L664

This is the same problem. Rename our exit(), and see if we are calling sys.exit() anywhere we should be using our own exit().

reidpr commented 3 months ago

Use error-checking ch.open_() (R1732 consider-using-with)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L1108

with isn’t a good match for Charliecloud because the key operations are wrapped so they can’t fail. That is, I don’t think we should take the advice, but the flagged code should probably be using fs.Path.open() instead and the warning should be left enabled.

reidpr commented 3 months ago

Bad keyword argument defaults (W0102 dangerous-default-value)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L528-L529

I agree. Fix.

reidpr commented 3 months ago

Needless pass (W0107 unnecessary pass)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L449-L452

IMO clearer this way. Ignore globally.

reidpr commented 3 months ago

FP on self.__class__ (W0212 protected-access)

https://github.com/hpc/charliecloud/blob/5ceefa7040c0accb91acf8b2c99e0f7ec6a97429/lib/filesystem.py#L814

Technically this a false positive — we’re accessing a private class (not instance) member — but going though self.__class__ seems weird. Why not just self?

reidpr commented 3 months ago

Superclass __init__() not called (W0231 super-init-not-called)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L1374-L1377

This is on purpose b/c the inheritance is a bit strange. I’d like to leave it enabled and justify exceptions on a case-by-base basis.

reidpr commented 3 months ago

Superclass argument renamed (W0237 arguments-renamed)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L286

Personally I think value is a bad name but this is probably right, less confusing if we use the same name.

reidpr commented 3 months ago

Method that only delegates back to parent (W0246 useless-parent-delegation)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L928-L929

Indeed this method seems to be doing nothing. Solution is probably delete it regardless, but I really wonder why.

reidpr commented 3 months ago

Bad indentation (W0311 bad-indentation)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L827-L829

This whole method is indented too much, probably due to being moved at some point.

reidpr commented 3 months ago

Needless semicolons (W0301 unnecessary-semicolon)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L639-L640

Somebody forgot they were writing Python instead of C. Fix.

reidpr commented 3 months ago

Global variable introduced secretly (W0601 global-variable-undefined)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L666-L668

This is where xattrs_save is defined. We should be able to see a complete list of globals at the top of the file.

reidpr commented 3 months ago

Remove unused imports (W0611 unused-import)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L5

I think unused imports might have been the original motivation for this entire issue.

reidpr commented 3 months ago

Remove unused variables (W0612 unused-variable)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L753

The idiom in this particular case is IIRC for _ in range(timeout*2). In general, I agree, unused variables seems bad.

reidpr commented 3 months ago

Undefined loop variables? (W0631 unused-loop-variable)

https://github.com/hpc/charliecloud/blob/5ceefa7040c0accb91acf8b2c99e0f7ec6a97429/lib/force.py#L373

Here it is defined. Should we simplify the control flow? I don’t know. Probably analyze on a case-by-case basis.

reidpr commented 3 months ago

Catch-all *args after keyword args (W1113 keyword-arg-before-vararg)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L998

This can cause duplicate argument exceptions. Fix.

reidpr commented 3 months ago

Non-string default in os.getenv() (W1508 invalid-envvar-default)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L837

In this particular case it’s harmless because we immediately convert to int anyway, but probably an antipattern. Fix.

reidpr commented 3 months ago

subprocess.run() with implicit check (W1510 subprocess-run-check)

https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L575

Sure, fix.