facebook / pyre-check

Performant type-checking for python.
https://pyre-check.org/
MIT License
6.86k stars 437 forks source link

Update Django .pyi files #413

Open maximmasiutin opened 3 years ago

maximmasiutin commented 3 years ago

I have made a pull request to update the Django .pysa file "django_sources_sinks.pysa" to the current release of Django, since the existing .pysa file gives errors preventing "pyre" from running, see https://github.com/facebook/pyre-check/pull/412

However, there are lots of Django .pyi files at https://github.com/facebook/pyre-check/tree/master/stubs/django made by Facebook which also need to be updated.

What about an automated tool like from "stubgen" from "mypy", and then write a script that modifies .pyi files to insert types instead of "Any", but only those lines that are relevant to "django_sources_sinks.pysa"? I can write such a script if you wish. I'm eager to be able to check Django projects with Pyre, but currently I cannot.

But a better idea is to contribute to https://github.com/django/django to insert types in the Django code directly.

onionymous commented 3 years ago

So this is part of a bigger discussion that we're having internally, but the tl;dr: these type stubs are no longer actively being maintained. In general the focus of Pyre/Pysa is just typechecking and static analysis, and we'd prefer not to maintain type stubs if possible - it's best if existing projects having their own type information, or we rely on typing projects like Typeshed. Since there exists https://github.com/typeddjango/django-stubs, which seems like a far more actively maintained and complete set of type stubs for Django, we're also thinking of switching to use this.

In the meantime, to unblock your efforts to run Pysa on Django projects, you might consider deleting Pyre's folder for Django type stubs completely, and trying to use the stubs at https://github.com/typeddjango/django-stubs instead.

We will discuss internally about what the best way to move forward is, but for now we are hoping to avoid updating these stubs.

maximmasiutin commented 3 years ago

Thank you, I will take the https://github.com/typeddjango/django-stubs file. However, an updated .pysa file for Django is still needed, because in the latest version of Django, "pyra analyze" gives errors unless you update the "django_sources_sinks.pysa" file.

onionymous commented 3 years ago

Yes, that is true. Thank you for the PR - we'll get around to looking at the changes in #412 as soon as possible.

maximmasiutin commented 3 years ago

I'm still not yet very comfortable with testing Django applications with "pyre analyze". Once I manage to configure Pyre to find vulnerabilities in the deliberately vulnerable Django applications, and we will be able to make sure that the https://github.com/typeddjango/django-stubs work with Pyre properly, we should probably delete the folder with "django" .pyi files from the Pyre repository, and instead add a comment to "django_sources_sinks.pysa" and/or somewhere else on how to get the https://github.com/typeddjango/django-stubs and where to put them in order everything worked properly.

maximmasiutin commented 3 years ago

I have found the following issues with https://github.com/typeddjango/django-stubs :

It does not define "cached" properties, while .pysa lists at least one of them: taint/django_sources_sinks.pysa:13:0 django.http.request.HttpRequest.accepted_types is not part of the environment!

It does not define properties starting with "_" (underscore), while such properties are defined in the .pysa file:

taint/django_sources_sinks.pysa:23:0 The modelled function django.http.request.HttpRequest.__repr__ is an imported function, please model object.__repr__ directly. taint/django_sources_sinks.pysa:57:0 django.http.response.HttpResponseBase.__bytes__ is not part of the environment! taint/django_sources_sinks.pysa:64:0 django.http.response.HttpResponse.__bytes__ is not part of the environment! taint/django_sources_sinks.pysa:67:0 The modelled function django.http.response.HttpResponse.__iter__ is an imported function, please model django.http.response.HttpResponseBase.__iter__ directly. taint/django_sources_sinks.pysa:124:0 Class django.contrib.sessions.backends.base.SessionBase has no attribute _session_key.

It can be resolved by removed all properties above mentioned from the .pysa file.

It has a .pyi for django.db.models.query.QuerySet that does not match the actual .py for Django 3.2, which makes Pyre give the following error:

taint/django_sources_sinks.pysa:87:0 The modelled function django.db.models.query.QuerySet.raw is an imported function, please model django.db.models.query._BaseQuerySet.raw directly. taint/django_sources_sinks.pysa:127:0 The modelled function django.db.models.query.QuerySet.get is an imported function, please model django.db.models.query._BaseQuerySet.get directly.

This can be resolved by contributing a correct .pyi file to the django_stubs repository.

maximmasiutin commented 3 years ago

I have made a pull request at https://github.com/typeddjango/django-stubs/pull/598

However, if we would like to use the https://github.com/typeddjango/ exclusively instead of the custom-made stubs for Django, we have to figure out what to do with the underscore functions defined in the “.pysa” file.

gbleaney commented 3 years ago

we have to figure out what to do with the underscore functions defined in the “.pysa” file.

It looks like all of those functions do exist in the Django source code (eg. __repr__). If you add them to typeddjango/django-stubs#598, the _ Pysa models should start passing validation.

maximmasiutin commented 3 years ago

we have to figure out what to do with the underscore functions defined in the “.pysa” file.

It looks like all of those functions do exist in the Django source code (eg. __repr__). If you add them to typeddjango/django-stubs#598, the _ Pysa models should start passing validation.

When I viewed the code at typeddjango/django-stubs, I have figured out that they deliberately don't add _ methods to .pyi files. I may however make a pull request with a few "_" functions, but it is unlikely to be accepted. Is there real value for them in Pyre?

gbleaney commented 3 years ago

Is there real value for them in Pyre?

We need them to be present if we're going to keep those models around. Pysa enforces that models correspond to functions that it knows exist, and if the .pyi file doesn't contain those __ functions, then there's no way for Pysa to know they exist.

but it is unlikely to be accepted

I've subscribed to https://github.com/typeddjango/django-stubs/pull/598 and will happily jump in to help make the case if they don't want to accept them. If they won't accept them, we may have to consider other alternative sources for models (or just keep updating ours indefinitely)

maximmasiutin commented 3 years ago

@gbleaney It is unlikely that they would accept https://github.com/typeddjango/django-stubs/pull/598 because it failed the tests. I have no idea why it failed and how to fix it.

I might have created a second pull request with underscore-prefixed functions, but without the https://github.com/typeddjango/django-stubs/pull/598 Pyre would not work anyway.

gbleaney commented 3 years ago

@maximmasiutin if they don't accept typeddjango/django-stubs#598, I'm OK with updating our Pysa models to match their polite fiction about _BaseQuerySet. If we add a pip dependency of pyre-check on django-stubs, then all users of pyre will get those stubs, and the models will will pass validation.

maximmasiutin commented 3 years ago

we have to figure out what to do with the underscore functions defined in the “.pysa” file.

It looks like all of those functions do exist in the Django source code (eg. __repr__). If you add them to typeddjango/django-stubs#598, the _ Pysa models should start passing validation.

The suppliers of the .pyi files did deliberately not include names starting with underscore, because they are reserved for internal use. So even if I ask to add functions with underscore to the .pyi files, chances are low that they would accept it. This would have encouraged bad practice to use the underscored functions directly, so what is a rationale for them to just add a few for Pyre? They should either be added all, or none.

According to PEP 8 -- Style Guide for Python Code, a name beginning with a single underscore is reserved for internal use:

_single_leading_underscore: weak "internal use" indicator. E.g., from M import * does not import objects whose names start with an underscore.

Modules that are designed for use via from M import * should use the all mechanism to prevent exporting globals, or use the older convention of prefixing such globals with an underscore (which you might want to do to indicate these globals are "module non-public").

Use one leading underscore only for non-public methods and instance variables.

Public attributes should have no leading underscores.

internal interfaces (packages, modules, classes, functions, attributes or other names) should still be prefixed with a single leading underscore.

gbleaney commented 3 years ago

The suppliers of the .pyi files did deliberately not include names starting with underscore, because they are reserved for internal use.

Did the typeddjango maintainers say that somewhere? I couldn't find an example of that.

Double underbar functions are standardized and documented functions, unlike single underbar functions. And there seems to be a decent precedent for including these double underbar (note, not single underbar) methods in type stubs. For example, take a look at the object stub in typeshed:

https://github.com/python/typeshed/blob/16ae4c61201cd8b96b8b22cdfb2ab9e89ba5bcf2/stdlib/%40python2/builtins.pyi#L69 In fact, if you look at the error message you were getting earlier:

taint/django_sources_sinks.pysa:23:0 The modelled function django.http.request.HttpRequest.__repr__ is an imported function, please model object.__repr__ directly.

You can see Pysa is telling you that it didn't find a __repr__ on HttpRequest but it did find one on the object stub we get from typeshed.

maximmasiutin commented 3 years ago

@gbleaney Thank you! I will make a pull request to add methods with double underscores to http request and response classes, and also the _session_key for the SessionBase. The typeddjango/django-stubs maintainers did not say that they deliberately not include functions with single underscores. I did infer that from the maths that I did with grep. The maintainers of typeddjango/django-stubs did define only 9% of functions with single underscore comparing to 100% of other functions, and 27% of member functions (methods) with single underscore comparing to 57% of other member functions.

maximmasiutin commented 3 years ago

I've created a request at https://github.com/typeddjango/django-stubs/pull/601