Open jaraco opened 1 year ago
I'm happy to accept contributions! I tried to find a popularity ranking of all Python packages, but I didn't succeed. I did find that Snyk Advisor provides a quick and easy way to find packages by first letter (https://snyk.io/advisor/packages/python/j). You could go through the list of packages there and look at the popularity of each. It also sounds as if the PyPI download tables are available to be queried (https://pypistats.org/api/#etiquette), so you could probably construct a query to do that. You might also search GitHub for projects with the skeleton badge - that's a pretty good indicator that a project is following my best practices.
`tox` will cause the type checks to be run via `pytest-mypy`. Running `tox -- -k mypy` will run mainly just the mypy tests.
I'm not a fan of type hinting noise (e.g. `-> None` for two-line private methods with no return). Try to focus on adding type hints that add value - that communicate something additional and non-obvious. I won't reject a PR based on this condition, but it'll be more easily accepted if it's clearly adding value and not lint.
I am used to type checking my code with mypy in strict mode. While some may see the strict mode as too aggressive, I will have no problem adjusting a bigger codebase to pass strict type checking. This is what I had in mind while writing the message.
I'm not a fan of type hinting noise (e.g. -> None for two-line private methods with no return). Try to focus on adding type hints that add value - that communicate something additional and non-obvious. I won't reject a PR based on this condition, but it'll be more easily accepted if it's clearly adding value and not lint.
mypy disallows untyped definitions in strict mode. A private function without a `return` statement is still assumed to return `Any`, which is untrue, since `None` is the desired return type:

    class Foo:
        def _bar(self):  # seen by mypy as untyped
            pass

    reveal_type(Foo()._bar())  # Revealed type is "Any"

    class Spam:
        def _eggs(self, biz: int):  # seen as typed, return assumed to be Any
            pass

    reveal_type(Spam()._eggs(5))  # Revealed type is "Any"

[!Note] The only exceptions to the behavior above are the `__init__` and `__init_subclass__` methods, where `-> None` can be omitted, provided the function has at least one type hint in the signature.
Relevant resources:
While the absence of a `return` statement in a method may be an obvious sign that the return type of that method does not matter, which is a logical reason not to leave a type hint there, even a `return None` statement with an unequivocal implication of the `None` return type is still seen as `Any` by mypy:

    class Example:
        def _com(self, port: int):
            return None

    reveal_type(Example()._com(43))  # Revealed type is "Any"

In places where the value of such a method is being used, the type hinting noise is then simply necessary to inform mypy about the actual return type that affects other scopes of the codebase.
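To make the difference concrete, here is a minimal sketch that compares the annotations a type checker gets to see for the method above and for an explicitly annotated variant. `_com_typed` is a hypothetical name added only for illustration, and the runtime inspection via `typing.get_type_hints` merely mirrors the declared annotations; it is not what mypy itself runs.

```python
from typing import get_type_hints


class Example:
    def _com(self, port: int):  # no return annotation: mypy treats the result as Any
        return None

    def _com_typed(self, port: int) -> None:  # hypothetical explicit variant
        return None


# Only the explicitly annotated method declares a return type.
untyped_hints = get_type_hints(Example._com)
typed_hints = get_type_hints(Example._com_typed)

print("return" in untyped_hints)  # the return type is simply not declared
print(typed_hints["return"])      # the declared return type of the typed variant
```

With the explicit variant, callers that use the result are checked against `None` instead of silently receiving `Any`.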
I firmly believe that the best option would be being as explicit as possible when annotating types, because, as it seems, there are too few obvious cases at the end of the day.
The Zen of Python, by Tim Peters:
Beautiful is better than ugly. Explicit is better than implicit.
Note: The examples were tested on mypy 1.7.0.
I'm happy to accept contributions! I tried to find a popularity ranking of all Python packages, but I didn't succeed. I did find that Snyk Advisor provides a quick and easy way to find packages by first letter (snyk.io/advisor/packages/python/j). You could go through the list of packages there and look at the popularity of each. It also sounds as if the PyPI download tables are available to be queried (pypistats.org/api/#etiquette), so you could probably construct a query to do that. You might also search GitHub for projects with the skeleton badge - that's a pretty good indicator that a project is following my best practices.
Thank you for your time spent on the research. I will use these resources and compile a TO-DO list of the projects to work on in this issue. Every subsequent PR will reference this issue.
I firmly believe that the best option would be being as explicit as possible when annotating types, because, as it seems, there are too few obvious cases at the end of the day.
Of course if the new type hints would appear too intricate for the eye, which I can totally understand, there is always an option of creating stubs to isolate two worlds of the implementation and the type hints, like in https://github.com/jaraco/jaraco.functools/pull/22. What are your thoughts on this @jaraco?
I wrote a script that extracted all jaraco projects from PyPI and sorted them by total downloads in the last month. The last two projects, Distutils and backports, have 0 downloads only because checking PyPI Stats for them causes an HTTP 404 error. I don't think inspecting that issue is necessary.
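For anyone who wants to reproduce a ranking like this, a minimal sketch could look as follows. This is not the script mentioned above. It assumes the PyPI Stats endpoint `/api/packages/<name>/recent` returns JSON with a `data.last_month` field, it takes the list of project names as given, and the names `fetch_last_month` and `rank_by_downloads` are made up for illustration. Please mind the API etiquette linked earlier before running it against many projects.

```python
import json
import urllib.error
import urllib.request

# Assumed endpoint shape of the PyPI Stats JSON API.
PYPISTATS_RECENT = "https://pypistats.org/api/packages/{name}/recent"


def fetch_last_month(name):
    """Return last month's download count for `name`, or 0 on an HTTP error."""
    url = PYPISTATS_RECENT.format(name=name)
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            payload = json.load(response)
    except urllib.error.HTTPError:
        # For example an HTTP 404 for a project PyPI Stats does not know.
        return 0
    return int(payload["data"]["last_month"])


def rank_by_downloads(names, fetch=fetch_last_month):
    """Return (name, downloads) pairs sorted by downloads, highest first."""
    counts = {name: fetch(name) for name in names}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)
```

The `fetch` parameter is injectable so that the sorting can be tried offline, e.g. `rank_by_downloads(["a", "b"], fetch={"a": 1, "b": 2}.get)`.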
[!Note] Skeleton badges come from the PyPI latest releases, not the projects' repositories. As a result, these badges are valid indicators of the years all the relevant latest PyPI releases took place.
The script produced the following table:
In order to make tracking the progress significantly easier, I've generated the following checklist-form roadmap.
@jaraco, could you please let me know if you want to apply some additional filtering/sorting to the roadmap above?
An interesting approach would be to measure, for every project, how many PyPI projects depend on it, using libraries.io. But I think it would more or less correlate with downloads/month anyway. I am assuming (without having done any research) that most of these downloads come from pipelines that install every dependency from scratch, and that it isn't that common for several dependents of the same project to be installed together. For example, when two projects A and B share the same dependency X and are installed into one environment, X is installed only once, so it is downloaded half as often, even though I would technically call X more popular in this case. This gets even more interesting if we distinguish between pip-like and pipx-like installation methods: with the latter, a shared dependency would be installed twice, once for each of the separate environments of A and B, assuming A and B are CLI applications... I could go on pointing out everything that comes to mind, and it gets very complicated and multilayered the further I go down the rabbit hole.
Yeah, so all in all I've learned that "package popularity" isn't trivial, because both the convergence in the web of dependencies and the statistics of total downloads over time play a role in correctly evaluating how popular a package really is.
Anyway, this is just an attempt to prioritize the tasks. Since some of the most popular jaraco projects commonly depend on less popular jaraco projects (in terms of downloads/month!), I started off with jaraco.functools and jaraco.classes.
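The prioritization idea can be sketched as a dependency-aware ordering: work on dependencies first and, among the projects that are ready, prefer the more downloaded ones. The dependency map and download numbers below are invented for illustration (`project_a` and `project_b` are placeholders), and `work_order` is a hypothetical helper, not part of any tool mentioned here.

```python
from graphlib import TopologicalSorter

# Hypothetical data: project -> jaraco projects it depends on.
dependencies = {
    "jaraco.functools": set(),
    "jaraco.classes": set(),
    "project_a": {"jaraco.functools"},
    "project_b": {"jaraco.classes", "project_a"},
}
# Hypothetical downloads/month, used only to order projects that are ready.
downloads = {
    "jaraco.functools": 300,
    "jaraco.classes": 200,
    "project_a": 900,
    "project_b": 1000,
}


def work_order(dependencies, downloads):
    """Return projects so that dependencies always come before dependents."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    order = []
    while sorter.is_active():
        # Among projects whose dependencies are done, take popular ones first.
        ready = sorted(sorter.get_ready(), key=lambda name: downloads.get(name, 0), reverse=True)
        order.extend(ready)
        sorter.done(*ready)
    return order


print(work_order(dependencies, downloads))
```

Even though the placeholder dependents have more downloads, the two dependencies come out first, which matches the reasoning for starting from the less popular projects.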
Amazing analysis. Thanks!
You'll notice that libraries.io is a project by Tidelift. You may have noticed that I also work with them as they work to garner support for open source maintainers from the enterprise users. They've probably built other tooling and may even be interested in collaborating on tooling to support open source maintenance. You may want to consider signing up with them as a maintainer (there's no cost and could potentially pay) and engaging on the forums to see if there is interest in collaborating. If you need a referral or anything to get signed up, let me know.
You may want to consider signing up with them as a maintainer
Thank you! I've just applied to lift a few of my projects!
Due to the considerable number of projects that need similar work, I created a project that aims to automate the whole process as much as possible: autorefine.
MonkeyType will come in very handy for generating type hints. I will take care of making them as sophisticated as needed. I think all these projects have enough coverage, so I will simply generate the types by running the tests.
I will leverage LibCST and create custom rules if needed for modernizing the projects (some had their last releases a few years ago), but this is out of scope at the moment.
Contributions & suggestions are very, very welcome; I am learning.
Hopefully the tool will speed up `more_itertools.consume(map(functools.partial(refine, scope="typing"), jaraco_projects))`.
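For readers unfamiliar with that idiom, here is a small standard-library sketch of what the expression does. `refine` is a hypothetical stand-in that only records its calls, the project list is shortened, and a zero-length `collections.deque` is used in place of `more_itertools.consume` to drain the lazy `map`.

```python
import functools
from collections import deque

refined = []  # records (project, scope) pairs for illustration
jaraco_projects = ["jaraco.functools", "jaraco.classes"]


def refine(project, scope):
    """Hypothetical stand-in for the tool's entry point."""
    refined.append((project, scope))


# map() is lazy, so nothing runs until the iterator is consumed; a deque with
# maxlen=0 drains it and discards the results, keeping only the side effects.
deque(map(functools.partial(refine, scope="typing"), jaraco_projects), maxlen=0)

print(refined)
```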
I've made a Coherent OSS project for the initiative: https://github.com/orgs/coherent-oss/projects/3/views/2
I firmly believe that the best option would be being as explicit as possible when annotating types, because, as it seems, there are too few obvious cases at the end of the day.
Of course if the new type hints would appear too intricate for the eye, which I can totally understand, there is always an option of creating stubs to isolate two worlds of the implementation and the type hints, like in jaraco/jaraco.functools#22. What are your thoughts on this @jaraco?
I guess that's fine. I should probably get used to Python being more verbose and less essential.
Well, I guess it's just that Python wasn't built for being statically typed. 🤷‍♂️ The good news is that due to its continuous development, the unfortunate effect of verbosity over essentialness in typing slowly decreases: take PEP 695 as an example.
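As a small illustration of that point, here is a generic helper written the pre-PEP 695 way, with the PEP 695 spelling shown in a comment so the snippet still runs on interpreters older than Python 3.12. The function `first` is a made-up example, not taken from any of the projects discussed.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")  # before PEP 695: the type variable is declared separately


def first(items: Sequence[T]) -> T:
    """Return the first element, preserving its type for the checker."""
    return items[0]


# With PEP 695 (Python 3.12+), the TypeVar declaration is no longer needed:
#
#     def first[T](items: Sequence[T]) -> T:
#         return items[0]

print(first([1, 2, 3]))
```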
As more skeleton-based projects get typed (with a `py.typed` marker), I'd recommend requesting that they be added to https://github.com/hauntsaninja/mypy_primer/blob/master/mypy_primer/projects.py, especially for projects that are widely used in the ecosystem.
I received this inquiry in discord: