apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

Add pyright to CI #20825

Open damccorm opened 2 years ago

damccorm commented 2 years ago

Besides mypy, the other major type checker for Python is pyright (used by Visual Studio Code). Add it to CI to ensure the code satisfies both checkers.

It might also find issues that mypy overlooked.

Imported from Jira BEAM-12004. Original Jira may contain additional context. Reported by: abergmeier.

lazarillo commented 11 months ago

Any update on this? Or any suggestions on workarounds? I can just tell pyright (or pylance in my case, but it's built on pyright) to ignore all of the mistakes, but that doesn't seem ideal.

FWIW, I find errors with PValue and PCollection happening a lot. For example:

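from apache_beam import FlatMap, PTransform
from apache_beam.pvalue import PCollection

# json_load and object_load are my own helpers (not shown); Person is a dataclass.
class IngestPubSub(PTransform):
    def expand(self, pcoll: PCollection[bytes]):
        dict_data: PCollection[dict] = pcoll | "parse json strings" >> FlatMap(
            json_load
        ).with_output_types(dict)
        obj_data: PCollection[Person] = dict_data | "create Person objects" >> FlatMap(
            object_load, container=Person
        ).with_output_types(Person)
        # Return the final PCollection (presumably omitted from the snippet as posted).
        return obj_data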

where Person is a simple dataclass. This runs fine, and there are no errors if I remove the type hints. It is the type hints that trigger the errors, but I want them for clarity of the code base.

(For those who are curious, I use FlatMap instead of Map because I am allowing for errors in parsing, which means I return either a list of length 1 containing the proper object, or an empty list. If I switch to Map, then I have to allow for returning None, which AFAICT breaks the pipeline... it cannot receive a None object.)
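A minimal sketch of the "list of length 1 or empty list" pattern described above. The function name `parse_person` and the `name`/`age` fields are hypothetical (the thread does not show the actual helpers), and a plain-Python flatten stands in for FlatMap so the snippet runs without Beam installed:

```python
import json
from dataclasses import dataclass
from itertools import chain
from typing import List


@dataclass
class Person:
    # Hypothetical fields; the thread only says Person is a simple dataclass.
    name: str
    age: int


def parse_person(raw: bytes) -> List[Person]:
    """Return [Person] on success, or [] if the record cannot be parsed."""
    try:
        data = json.loads(raw)
        return [Person(name=data["name"], age=data["age"])]
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing key, or a non-dict payload: emit nothing.
        return []


records = [b'{"name": "Ada", "age": 36}', b"not json", b'{"name": "Bob"}']

# Plain-Python stand-in for FlatMap: flatten the per-record lists,
# so records that failed to parse simply disappear from the output.
people = list(chain.from_iterable(parse_person(r) for r in records))
```

In a Beam pipeline the same function would presumably be passed to FlatMap in place of the flatten step, so that failed records are dropped rather than surfacing as None elements.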

damccorm commented 11 months ago

I don't think anyone is looking at this right now, and it's unfortunately one of the lower priority items in the issue queue at the moment. Someone may be interested in picking it up, but there is no planned work as of now. Tagging @abergmeier (the original reporter) in case they have any suggestions or workarounds.