webknjaz opened 3 years ago
@webknjaz Will highly appreciate PRs for these, I could learn new things :). I am stuck in classic vanilla setup.py workflows and seemingly things have evolved a lot in this aspect. Thank you 👍
Woah! Among other problems, there are no wheels uploaded for the last versions on PyPI (since 2019!)
First improvement done: https://github.com/abhinavsingh/proxy.py/pull/647.
@abhinavsingh I've submitted a number of linting-setup maintainability improvements. The last one includes everything, but I thought it would probably be easier/more digestible if the building blocks were their own atomic PRs:
This setup is rather strict but has a lot of excludes in the config because there are a lot of linting violations that get revealed otherwise. I think it'd make a nice series of "good first issues" for beginner contributors to improve.
@abhinavsingh I wanted to ask you if you have a strong preference to have multiple testing workflows. It feels easier to have the testing jobs in the same workflow because then it's possible to create useful dependencies between them.
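For example, cross-job dependencies via `needs:` only work when the jobs live in the same workflow file. A hypothetical minimal sketch (the job layout and the make targets here are placeholders, not the actual proxy.py config):

```yaml
name: CI
on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: make lib-lint
  test:
    needs: lint  # won't even be scheduled until the lint job succeeds
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: make lib-test  # placeholder target
```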
Another suggestion would be to make the workflow and job names shorter. Ever since GHA was introduced, many projects noticed that long job names don't fit in the UI in multiple places on the website, making them rather useless. From this, a practice of keeping them short emerged. I hope you don't mind if I try to come up with better names.
Uncommented another fixer on top: https://github.com/abhinavsingh/proxy.py/pull/661
And here are two more improvements:
I'm starting to hit merge conflicts so I'll pause for some time to give you a chance to get these merged before I add more.
@webknjaz Yes naming shouldn't be an issue. My original intention here was to surface enough context via job names. Checking on the PRs.
Yeah I see that. But currently you repeat the same context in many places. Job names show up right after the workflow names in all of the UIs. Plus adding the project name is not particularly useful since when you look at the CI pages, it's visible what repo they are in.
@abhinavsingh here are fixes for the pre-commit.ci env:
@webknjaz Locally I ran into PyLint socket module related errors. Amazingly, the same check passes on GHA (and I guess even for you locally). So am I missing something locally here? There are a variety of suggestions on StackOverflow. Any hints here?
@abhinavsingh this is interesting... are you on macOS? Locally, I run Gentoo Linux, and GHA and pre-commit.ci both use Ubuntu, I think. But yeah, pylint is known for behaving inconsistently sometimes, depending on the env it runs under (including the Python version, because some of its checks tend to be semi-dynamic as opposed to flake8, which is fully static). It can be a little annoying. I'll see what I can do. Worst case, we can disable the specific rule.
https://github.com/PyCQA/pylint/issues/4759#issuecomment-890564379 seems to suggest that the Python shipped by Homebrew tends to be problematic. Usually, I use pyenv (for many reasons): it allows me not to mess with the system installation of CPython, which may be patched or influence various components in my system. I specifically use the userspace installation, so it all lives in my ~/.pyenv and even if I destroy it completely, it won't affect important parts of the system. I can delete and recreate it any time, plus it allows me to have as many interpreter versions as I like.
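For reference, the whole setup is roughly this (a sketch; the version number is just an example):

```console
$ curl -sSL https://pyenv.run | bash   # installs pyenv under ~/.pyenv; follow the shell init hints it prints
$ pyenv install 3.10.0                 # builds a clean CPython entirely in userspace
$ pyenv local 3.10.0                   # pins it for the repo via a .python-version file
$ python -m venv .venv && . .venv/bin/activate
```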
Yep, I am on macOS. Lemme give it a try locally with pyenv. We might need to highlight this fact somewhere for macOS contributors, or maybe simply relax pylint for socket classes (which is kind of covered via mypy).
Well, we could apply inline disables for those cases. Or maybe add a transform plugin https://pylint.pycqa.org/en/latest/how_tos/transform_plugins.html. But I'd argue that this is a band-aid and would just shadow the real issue of having a bad CPython env, and people should be encouraged to fix their dev envs (which is probably what this false-positive really reveals).
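For the record, an inline disable would look something like this; the message id is a guess (`no-member` is the usual suspect), the real one has to be taken from the local pylint output:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# hypothetical: silence the macOS-only false positive for this single line
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # pylint: disable=no-member
```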
So here are some job/workflow name updates, making them mostly fit in the UI:
@webknjaz I checked random repositories on GitHub and it looks like, along with setup.cfg, repos are also adding a minimal setup.py:

from setuptools import setup
setup()

See https://github.com/search?q=filename%3A.pyup.yml+setup.cfg&type=code for reference. I searched this to understand why pyup broke after the setup.py removal. This seems the likely reason. Wdyt?
setuptools has supported not having setup.py for years now. I've been deleting it from all my projects since then. The only real problem was that pip install -e didn't work without it, which is why many projects adopted a minimal file stub. But that's been fixed in the recent releases, so it's not a concern anymore. The concern is providing a way to bypass PEP 517 by having setup.py in the project.
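Roughly, the setup.py-less flow looks like this (a sketch, assuming the metadata lives in setup.cfg):

```console
$ python -m pip install build
$ python -m build             # PEP 517 build: sdist + wheel end up in dist/
$ python -m pip install -e .  # recent pip versions handle this without a setup.py stub
```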
As for pyup, I don't know. You could probably configure it to exclude setup.py from its checks. It's a bad idea to pin the direct deps in the package metadata anyway.
Thank you, makes sense. I haven't kept up with changes of late.
I also observed that the codecov integration is broken. Looking at the GHA logs, the upload coverage step fails with "no coverage file found". Though, I see coverage files being generated just fine. Any hints on what might have gone wrong here? The root directory context doesn't change, so .coverage should still be auto-discoverable.
Here's a doc on the config file: https://pyup.io/docs/bot/config/#specifying. Or you could just migrate to Dependabot like others did.
I think that the official codecov GHA looks for an XML report that pytest does not currently produce. I was going to fix it by adding --cov-report=xml but haven't got to it.
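Something along these lines should do it (the --cov target is an assumption; it has to match the actual import package):

```console
$ python -m pytest --cov=proxy --cov-report=xml --cov-report=term-missing
# coverage.xml is the file the codecov action expects to find
```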
> Here's a doc on the config file: https://pyup.io/docs/bot/config/#specifying. Or you could just migrate to Dependabot like others did.
Will deprecate pyup in favor of dependabot. I was just curious about what happened to break it.
> I think that the official codecov GHA looks for an XML report that pytest does not currently produce. I was going to fix it by adding --cov-report=xml but haven't got to it.
Yep this probably should fix it :). Thank you
@webknjaz I realized that flake8 and pylint now use 100% of all available cores. This typically happens for 5-10 seconds and is noticeable due to the increase in fan speed. I am wondering if there is something we can (must) do about it. Thank you.
Oh, this reminds me that I forgot to increase the number of processes for pylint. flake8 already has auto set by default, but pylint's config just sets a single job.

It's usually best to use all the available cores for better responsiveness. This will mean a faster CI job and quicker results on the dev laptops.
A few more forgotten bits:
Cool, I hope the pylint config update will reduce the time for which the CPUs are in a tight spin loop. I don't see a problem in utilizing all cores, but spinning up all available 8 cores for 10 seconds on a developer box is indeed not necessary. At least a choice must exist :). Currently I experience my MacBook fan spinning up for quite some time after every flake8 and pylint run.
Actually, per the comment in the config, 0 is auto, meaning it'll utilize more cores. I suppose you could run these tools manually and pass a different value to --jobs= in the CLI.
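For instance (the target path is an assumption):

```console
$ python -m pylint --jobs=2 proxy/   # cap pylint at 2 worker processes instead of auto
$ python -m flake8 --jobs=2 proxy/   # flake8 accepts the same option
```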
Import loop checker:
Coverage.py config:
A stricter dedicated pytest config:
Enhancement to what's tested in the CI vs what makes its way to the PyPI:
Pytest config hotfix:
I noticed that setuptools emits a barely noticeable warning about multiline package descriptions, so here's the fix:
@abhinavsingh how do you feel about replacing the hardcoded package version with one inherited from Git? Including versioning for the intermediate commits. I normally use setuptools-scm for this. I thought I'd set up GHA-based publishing and it'd be quite useful to have the version non-hardcoded.
> @abhinavsingh how do you feel about replacing the hardcoded package version with one inherited from Git? Including versioning for the intermediate commits. I normally use setuptools-scm for this. I thought I'd set up GHA-based publishing and it'd be quite useful to have the version non-hardcoded.
I did envision it (https://github.com/abhinavsingh/proxy.py/issues/102) but never got to it. It will be a bomb to have this. I'll be happy to have even more regular releases (which are not blocked on me personally). E.g. the last version was released about 6 months back :(
@abhinavsingh alright. What's your release flow? Right now, having https://github.com/abhinavsingh/proxy.py/pull/682 merged, you can download the artifacts (wrapped as one archive with GHA) from the workflow run, unpack them and publish with twine upload. Do you normally have preparation activities besides re-hardcoding the version, like composing a changelog? If not, this would be quite easy to implement with two PRs.

I must warn that for Git-based versioning to work properly, in cases when the tag is first on another branch, that branch must be merged with a proper merge commit, not squash, not rebase (otherwise, the tag won't get into the other branch — develop/master/main/whatever — and git describe won't be able to discover the proper tag).
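To illustrate what the version gets derived from (the tag and hash below are made up):

```console
$ git describe --tags --long
v2.3.1-14-gabc1234
# i.e. 14 commits after the v2.3.1 tag; setuptools-scm would turn that into
# something like 2.3.2.dev14+gabc1234 for an intermediate commit
```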
One preparation activity is related to brew, where the stable version is hardcoded within the .rb file. This happens because the advertised brew formula for both stable and development versions points to the develop branch. One option could be to have a single brew formula: the one in the master branch points to the stable version and the one in the develop branch always points to the development version.

PS: Last I checked, the brew formulae are not even working correctly; something broke in the last 6 months. I think brew now expects a cask formula and not a package formula class. Of course, this is only intended for macOS users and I am not even sure how many folks are using proxy.py via brew. Outside of brew, most things are kept in sync during development, i.e. the develop branch is always ready for publishing, a.k.a. merging into the master branch.
Typically, if all looks good locally, I follow these steps:

1. Make a PR from develop -> master
2. Use vX.Y.Z as the PR description (we must be using tags of course)
3. Ensure the PR is merge ready (GHA workflows are passing)
4. Then locally on my laptop:
   1. make lib-release-test
   2. Install from test PyPI and do a sanity check
   3. make lib-release
5. Merge the PR

In unfortunate circumstances, Step-4.2 may fail. A patch PR is sent into the develop branch to address the issue and the process is repeated from Step-3.
Currently, no separate CHANGELOG file is shipped; in fact, the README contains a minimal major-version change log in the footer. The vX.Y.Z PR description works as the CHANGELOG, which is updated manually. These PRs are easily discoverable, though currently not linked anywhere (e.g. tags/releases). The README footer change log is updated to advertise major-version change reasons (e.g. architecture change, threaded, threadless etc).
For the proxy.py scenario, the release workflow can be triggered for PRs from develop -> master. The workflow can follow Step-4 and can re-use the PR title/description for the tag name/description. Wdyt?
> Actually, per the comment in the config, 0 is auto, meaning it'll utilize more cores. I suppose you could run these tools manually and pass a different value to --jobs= in the CLI.
I missed taking the screenshot at the right time, otherwise User: 99% was what I saw. This is my MacBook CPU usage over the last 2 minutes. I ran make lib-lint 3 times within this period, indicated by the peaks.

IMHO, we'll need to address this at some point. I understand utilizing all cores, but I'd call this PyLint behavior "spinning all cores". Problem is, it doesn't even finish fast; it takes a minimum of ~10 seconds.
Also, have you seen how weirdly PyLint behaves if a mypy check has failed previously? PyLint kind of just hangs for 10-20 seconds before resuming.
From the process manager I see PyLint goes on to spawn 100s of processes. Ideally, PyLint should use a strategy similar to proxy.py: maintain one loop per CPU and do things asynchronously within it, instead of spawning 100s of processes, all consuming 30-70% of CPU.
Another side effect of PyLint is that we are exhausting GHA quotas more quickly. And I am not surprised, looking at the way PyLint is behaving. My laptop literally becomes unresponsive. I fear running that PyLint workflow locally, haha.
I looked on GitHub and found several pull requests into repos where they have replaced PyLint with flake8 for exactly this reason (high CPU usage).
> Typically, if all looks good locally, I follow these steps:
>
> 1. Make a PR from develop -> master
> 2. Use vX.Y.Z as the PR description (we must be using tags of course)
> 3. Ensure the PR is merge ready (GHA workflows are passing)
> 4. Then locally on my laptop:
>    1. make lib-release-test
>    2. Install from test PyPI and do a sanity check
>    3. make lib-release
> 5. Merge the PR
Thinking more on it:

- Step-4.2 is simply re-running the e2e workflows with the test PyPI distribution.
- We could require an Approve button on the PR before executing Step-4.3, i.e. the release to PyPI. This can be used as an opportunity to do a manual sanity check, at least until this specific workflow matures and we gain trust in it. Later on we can short-circuit the approval step. Just throwing out ideas here; I am unsure how easy it is to add an "Approval" button on a PR via workflows. I see GitHub prompting me to "Approve and Run" workflows for first-time contributors.
- About version names, I am happy to even get rid of SemVer in favor of a simple DateTime-based versioning system. SemVer, while great for conveying major-minor-patch versions, is a PITA to maintain, and the onus lies on the authors/developers/maintainers to keep it aligned. At least datetime-based versioning will automate everything related to versioning & tags.

Though I am unsure how popular datetime-based versioning is in the community, or what the side effects of migrating to it are.
I think codecov.yml is no longer being respected:

https://github.com/abhinavsingh/proxy.py/blob/develop/codecov.yml

We have configured codecov to allow a 1% diff in coverage without breaking the CI, but it looks like codecov is now always unhappy, even with a 0.06% decrease in coverage.
IIRC the presence of the file is enough for codecov to pick this up. Or has the codecov v2 action changed something? Will need to dig into it to confirm anything. Unsure why it has broken.
Re:codecov — AFAIK the action only uploads the reports but their backend reads the config separately from that via the GitHub App you have installed. I don't see the connection to my PR. Besides, that change was necessary since they are sunsetting their previous uploader soon and the migration is inevitable.
re:pylint — it is totally expected that it consumes more resources. By nature, it also has checks evaluating the code by dynamically importing it. This is why it's capable of revealing deeper problems.
As for burning through the GHA minutes, I doubt it's a problem. I use GHA in many places with a lot more jobs running and haven't been able to exceed the free limit. Besides, it's not a bottleneck of the GHA setup anyway: Windows and macOS workers consume the minutes with multipliers of 2x and 10x respectively. They have a far greater influence on the resource quota, especially taking into account that there's a matrix of them vs one small linting job.
FWIW I've also prepared a PR logging the execution time of each check:
I can also offer to create a fast linting tox env that would skip pylint, if that helps. You'd get those results from the CIs anyway. Plus I haven't enabled the caching of pre-commit's envs yet, which would probably cut some seconds from the linting run.
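In the meantime, pre-commit's SKIP environment variable can act as a stop-gap for a quick local run without pylint:

```console
$ SKIP=pylint pre-commit run --all-files
```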
re:mypy — the pre-commit tool runs the checks in subprocesses, one after another. I don't see how a mypy run would influence pylint. But since pylint doesn't stream any logs until it finishes its execution, I would imagine that it may subjectively feel like it's taking longer. I think that the PR above will make the actual execution time clearer. Maybe mypy itself is slower when it sees errors, I don't know. But we'll be able to see it once the PR is merged. If it indeed influences the other process run, the only thing I'd suspect would be macOS's process prioritization, which I don't really know much about, but I think it gives more priority to the graphical subsystem and could throttle CPU-intensive processes. Since both of these are subprocesses of pre-commit, maybe it sees that one subprocess consumed a lot of resources and decides to limit the CPU time of the next one it spawns. But it's all speculation at this point and I don't have the tools to verify whether that's the case.
P.S. You can pass args to tox commands if they have {posargs}. With lint, you could do tox -e lint -- mypy --all-files to just run this one check and then compare the time it takes to complete (otherwise, use pre-commit run mypy --all-files to run it without a wrapper).
Amazingly, tox -e lint -- mypy --all-files finished immediately. That's surprising. I have never seen it finish so fast when running the entire suite.
$ time tox -e lint -- mypy --all-files
lint installed: astroid==2.8.4,attrs==21.2.0,backports.entry-points-selectable==1.1.0,bcrypt==3.2.0,cffi==1.15.0,cfgv==3.3.1,cryptography==35.0.0,distlib==0.3.3,filelock==3.3.2,identify==2.3.3,iniconfig==1.1.1,isort==5.10.0,lazy-object-proxy==1.6.0,mccabe==0.6.1,nodeenv==1.6.0,packaging==21.2,paramiko==2.8.0,platformdirs==2.4.0,pluggy==1.0.0,pre-commit==2.15.0,py==1.11.0,pycparser==2.20,pylint==2.11.1,pylint-pytest==1.0.3,PyNaCl==1.4.0,pyparsing==2.4.7,pytest==6.2.5,PyYAML==6.0,six==1.16.0,toml==0.10.2,types-cryptography==3.3.8,types-enum34==1.1.1,types-ipaddress==1.0.1,types-paramiko==2.7.3,virtualenv==20.10.0,wrapt==1.13.3
lint run-test-pre: PYTHONHASHSEED='905921385'
lint run-test: commands[0] | /Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -m pre_commit run --show-diff-on-failure --hook-stage manual mypy --all-files
mypy.....................................................................Passed
lint run-test: commands[1] | -/Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -c 'cmd = "/Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -m pre_commit install"; scr_width = len(cmd) + 10; sep = "=" * scr_width; cmd_str = " $ {cmd}";' 'print(f"\n{sep}\nTo install pre-commit hooks into the Git repo, run:\n\n{cmd_str}\n\n{sep}\n")'
______________________________________________ summary _______________________________________________
lint: commands succeeded
congratulations :)
tox -e lint -- mypy --all-files 0.97s user 0.25s system 98% cpu 1.242 total
flake8 timing information is similar when run in isolation:
$ time tox -e lint -- flake8 --all-files
lint installed: astroid==2.8.4,attrs==21.2.0,backports.entry-points-selectable==1.1.0,bcrypt==3.2.0,cffi==1.15.0,cfgv==3.3.1,cryptography==35.0.0,distlib==0.3.3,filelock==3.3.2,identify==2.3.3,iniconfig==1.1.1,isort==5.10.0,lazy-object-proxy==1.6.0,mccabe==0.6.1,nodeenv==1.6.0,packaging==21.2,paramiko==2.8.0,platformdirs==2.4.0,pluggy==1.0.0,pre-commit==2.15.0,py==1.11.0,pycparser==2.20,pylint==2.11.1,pylint-pytest==1.0.3,PyNaCl==1.4.0,pyparsing==2.4.7,pytest==6.2.5,PyYAML==6.0,six==1.16.0,toml==0.10.2,types-cryptography==3.3.8,types-enum34==1.1.1,types-ipaddress==1.0.1,types-paramiko==2.7.3,virtualenv==20.10.0,wrapt==1.13.3
lint run-test-pre: PYTHONHASHSEED='306389043'
lint run-test: commands[0] | /Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -m pre_commit run --show-diff-on-failure --hook-stage manual flake8 --all-files
flake8...................................................................Passed
lint run-test: commands[1] | -/Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -c 'cmd = "/Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -m pre_commit install"; scr_width = len(cmd) + 10; sep = "=" * scr_width; cmd_str = " $ {cmd}";' 'print(f"\n{sep}\nTo install pre-commit hooks into the Git repo, run:\n\n{cmd_str}\n\n{sep}\n")'
________________________________________________________ summary _________________________________________________________
lint: commands succeeded
congratulations :)
tox -e lint -- flake8 --all-files 18.52s user 0.45s system 99% cpu 19.022 total
pylint is the worst of all, even when run in isolation. Look at the 1445% CPU and the ~54s total execution time.

$ time tox -e lint -- pylint --all-files
lint installed: astroid==2.8.4,attrs==21.2.0,backports.entry-points-selectable==1.1.0,bcrypt==3.2.0,cffi==1.15.0,cfgv==3.3.1,cryptography==35.0.0,distlib==0.3.3,filelock==3.3.2,identify==2.3.3,iniconfig==1.1.1,isort==5.10.0,lazy-object-proxy==1.6.0,mccabe==0.6.1,nodeenv==1.6.0,packaging==21.2,paramiko==2.8.0,platformdirs==2.4.0,pluggy==1.0.0,pre-commit==2.15.0,py==1.11.0,pycparser==2.20,pylint==2.11.1,pylint-pytest==1.0.3,PyNaCl==1.4.0,pyparsing==2.4.7,pytest==6.2.5,PyYAML==6.0,six==1.16.0,toml==0.10.2,types-cryptography==3.3.8,types-enum34==1.1.1,types-ipaddress==1.0.1,types-paramiko==2.7.3,virtualenv==20.10.0,wrapt==1.13.3
lint run-test-pre: PYTHONHASHSEED='3638090282'
lint run-test: commands[0] | /Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -m pre_commit run --show-diff-on-failure --hook-stage manual pylint --all-files
PyLint...................................................................Passed
lint run-test: commands[1] | -/Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -c 'cmd = "/Users/abhinavsingh/Dev/proxy.py/.tox/lint/bin/python -m pre_commit install"; scr_width = len(cmd) + 10; sep = "=" * scr_width; cmd_str = " $ {cmd}";' 'print(f"\n{sep}\nTo install pre-commit hooks into the Git repo, run:\n\n{cmd_str}\n\n{sep}\n")'
________________________________________________________ summary _________________________________________________________
lint: commands succeeded
congratulations :)
tox -e lint -- pylint --all-files 748.66s user 32.19s system 1445% cpu 54.020 total
Lately I am getting a lot of these coverage errors locally. Behavior is also kind of weird.
.... this behavior is consistent
Made some changes to the workflow files. We might have to pipeline them into the check step for a green CI.
> Made some changes to the workflow files. We might have to pipeline them into the check step for a green CI.
At some point we may even have to abstract out the linter steps, so that they can run when a matching file changes. After #699 no workflow will run for a README.md-only change, but we might still want to run the markdown linter.
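That could be a small dedicated workflow gated on a paths filter, something like this hypothetical sketch (the linter invocation is a placeholder):

```yaml
name: md-lint
on:
  pull_request:
    paths:
      - '**.md'

jobs:
  markdownlint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: npx markdownlint-cli '**/*.md'
```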
I see that currently setup.py is being used and called directly. This is highly discouraged and is deprecated. See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for more details.

It's quite easy to upgrade it to use PEP 517 with invocations via pypa/pip and pypa/build.
If you're open to improvements, I could try to find some time to contribute some packaging and CI/CD-related PRs.