ros2 / launch

Tools for launching multiple processes and for writing tests involving multiple processes.
Apache License 2.0
124 stars 139 forks source link

Improve type checking #679

Closed haudren-woven closed 1 year ago

haudren-woven commented 1 year ago

This is still a work in progress, but following @clalancette 's comments on #676 , I made this PR to:

Note that we will probably need to tweak the mypy configuration a little bit, since currently non-typed imports are treated as errors.

haudren-woven commented 1 year ago

The test indeed failed, and the run time is indeed around 1 or 2 seconds:

10:30:23 test/test_flake8.py .                                                    [  1%]
10:30:31 test/test_mypy.py F                                                      [  2%]
10:30:32 test/test_pep257.py .                                                    [  2%]
methylDragon commented 1 year ago

If the test is failing, could you set this PR to draft until you have a working version?

Edit: I'm not sure how possible it would be to introduce a test that would fail on merge... It looks like whatever's causing the type errors to happen need to be fixed alongside this PR

haudren-woven commented 1 year ago

@methylDragon : Sorry about that, I should have made it draft. I just wanted to address @clalancette 's concerns quickly. I will convert to draft, and try to fix the type issues, please hold on :pray:

haudren-woven commented 1 year ago

@methylDragon : First of all sorry for the long delay. I got really busy in work stuff, then went on holidays. I'm resuming work on this, and one sticky method has been the Entity.get_attr. Its behavior (with regards to typing) is a little hard to capture:

Thank you!

haudren-woven commented 1 year ago

Another few questions:

  1. Surrounding OnProcessExit: in that file, a number of cast are performed to basically perform an upcast from functions taking the narrow type ExecuteLocal into more general functions taking the abstract Action type. I don't quite understand how this can work since OnActionEventBase seems to call the action matcher callable with the action itself, without checking if the action that generated the event is indeed a subclass of target_action_cls.
  2. For the OpaqueCoroutine, there seems to be a great deal of confusion between a coroutine and a coroutine function. As far as I can tell, the documentation (and the type annotations) seem to think this foo object is a coroutine:

        async def foo() -> None:
           pass

    However, in this case foo is a coroutine function, e.g. a Callable which returns a Coroutine. I can fixup the type annotations fairly easily, but that issue is present in a few places, so I'd like to hear everyone's opinion on the best course here.

  3. I was quite surprised to see this code in some_actions_type:
    SomeActionsType = Union[
        LaunchDescriptionEntity,
        Iterable[LaunchDescriptionEntity],
    ]

    This causes issues in some places where the return type is still Iterable[Action], which is incompatible with Iterable[LaunchDescriptionEntity]. I guess this is the result of some refactoring, but how should we address the situation: keep the name SomeActionsType or rename it to SomeEntitiesType?

haudren-woven commented 1 year ago

@adityapande-1995 @methylDragon @wjwwood @clalancette : any comments / remarks on the above points?

haudren-woven commented 1 year ago

@adityapande-1995 @methylDragon @wjwwood @clalancette : Sorry to bother everyone, but could someone take a quick look at the above remarks? Then I can resume work on this PR

methylDragon commented 1 year ago

Hey @haudren-woven , sorry for dropping this, and thanks again for all the hard work!

Unfortunately, I don't think I have the necessary context/experience with the launch frontend to give definitive answers for this, but I can give it a shot... (though I think @wjwwood would have more of the necessary context)

Data Type

  • data_type in the original signature can be only one of AllowedTypesType, which is pretty much either a scalar or a list of scalars. However, in the code it's very frequently used with a List[Entity]. Could you maybe shed some light on what is the real type bound on the data types of get_attr?

I think in this case, let's expand the list of allowable types to what's actually being used in source.

Consider the usage of data_type for get_attrs for the keep attribute in this xml example (since keep attribute access is one of the cases where the entity returns a list of entities):

        <launch>
            <let name="foo" value="FOO"/>
            <let name="bar" value="BAR"/>
            <reset>
                <keep name="bar" value="$(var bar)"/>
                <keep name="baz" value="BAZ"/>
            </reset>
        </launch>

If either bar or baz were not strings, then to get all the keep children of the reset entity, the get_attrs method would have to be able to return a heterogeneous list of children.

So in this case I think it would be reasonable to allow for List[Entity] to also be a valid data_type. Let's then set data_type as Union[AllowedTypesType, List[Entity]], I think.

can_be_str

  • can_be_str is not documented in the abstract class. I understand well that in the case of a scalar, it means that the return type would be a Union[Scalar, str], but what about a list? Can only the individual elements of the list be left as string (e.g. List[Union[Scalar, str]]) or is it the whole list (e.g. Union[List[Scalar], str]).

Consider the other available docstrings for can_be_str in the other methods:

In the case a value can be either an instance of a type or a substitution, the can_be_str argument of get_attr must be used :param can_be_str: when True, non-uniform lists mixed with strings are allowed.

I think in no cases do we want to convert a single list entity into a string in its entirety. So I think we want List[Union[Scalar, str]].

The reason for this is to think about how the Entities are expected to be used (parsed entities to be manipulated programatically). If we convert a list entity into a string, then it'd require a re-parse, which I don't think is what the intended use is. I might be wrong about this though.

Also: Ideally we'd end up documenting the can_be_str attribute for the abstract class, but I'm not sure if my understanding is correct. I'd defer to @wjwwood for this, but in case this needs to be merged now, I think it'd be wise to add a note/todo, while adding the current understanding of the usage. (Where the Entity can be either a string or a scalar, in the case of a list of entities, a heterogeneous list of scalars/entities AND/OR strings.)

optional

  • In terms of interaction between can_be_str and optional, I'd like to make sure that my understanding is correct: when both are set to True, is it fair to say that we will return None if the attribute is absent, a string if the attribute is present but cannot be coerced to the correct type, the type otherwise?

I think this is correct.


I will respond to the later post soon.

methylDragon commented 1 year ago

OnProcessExit

Surrounding OnProcessExit: in that file, a number of cast are performed to basically perform an upcast from functions taking the narrow type ExecuteLocal into more general functions taking the abstract Action type. I don't quite understand how this can work since OnActionEventBase seems to call the action matcher callable with the action itself, without checking if the action that generated the event is indeed a subclass of target_action_cls.

That... sounds like a subtle type bug. I'm not sure what way to best resolve it.

Coroutines

For the OpaqueCoroutine, there seems to be a great deal of confusion between a coroutine and a coroutine function. As far as I can tell, the documentation (and the type annotations) seem to think this foo object is a coroutine:

    async def foo() -> None:
       pass

However, in this case foo is a coroutine function, e.g. a Callable which returns a Coroutine. I can fixup the type annotations fairly easily, but that issue is present in a few places, so I'd like to hear everyone's opinion on the best course here.

Again, I'm not sure I have all the context, but I'd be in favor of fixing the docs/annotations to match how each function is currently being used (especially if it's a widespread pattern.)

some_actions_type

I was quite surprised to see this code in some_actions_type:

SomeActionsType = Union[
    LaunchDescriptionEntity,
    Iterable[LaunchDescriptionEntity],
]

This causes issues in some places where the return type is still Iterable[Action], which is incompatible with Iterable[LaunchDescriptionEntity]. I guess this is the result of some refactoring, but how should we address the situation: keep the name SomeActionsType or rename it to SomeEntitiesType?

I would be in favor of a rename (and a file move as well.)

haudren-woven commented 1 year ago

Thank you very much for the thoughtful answers! I will proceed with these, but it might take a little while... Please hold on :pray:

methylDragon commented 1 year ago

Thank you very much for the thoughtful answers! I will proceed with these, but it might take a little while... Please hold on 🙏

Heyyy no worries about this! Take your time! Feel free to ping me again if I don't do timely replies, and I'll help best I can. Though I'm not holding all of the context

haudren-woven commented 1 year ago

@methylDragon : The tests passed :face_exhaling: That was a bit harder that I expected but at least now a lot of things are checked :partying_face: I think the annotations can be improved over time, as there are a number of places that are probably too permissive, but somehow mypy is happy (and found some bugs I inadvertently introduce). I think however that it's a good place to start, and we can progressively improve the type checking in the future.

methylDragon commented 1 year ago
haudren-woven commented 1 year ago

@methylDragon : The tests are failing in dependent packages due to being unable to import SomeActionsType which I have renamed to SomeEntitiesType. One probable culprit is in ros2/launch_ros : https://github.com/ros2/launch_ros/blob/6daacbce4bade7ed40f86f16a30a08b4d7ee9272/launch_ros/launch_ros/event_handlers/on_state_transition.py#L23 .

How would you like to handle the change? I was thinking of reintroducing some_actions_type but with a deprecation warning?

methylDragon commented 1 year ago

@methylDragon : The tests are failing in dependent packages due to being unable to import SomeActionsType which I have renamed to SomeEntitiesType. One probable culprit is in ros2/launch_ros : https://github.com/ros2/launch_ros/blob/6daacbce4bade7ed40f86f16a30a08b4d7ee9272/launch_ros/launch_ros/event_handlers/on_state_transition.py#L23 .

How would you like to handle the change? I was thinking of reintroducing some_actions_type but with a deprecation warning?

I think doing a tick-tock with a deprecation warning makes sense, yep!

haudren-woven commented 1 year ago

@methylDragon : I added back the SomeActionsType which will hopefully fix the issues in the underlying packages :crossed_fingers:

methylDragon commented 1 year ago
haudren-woven commented 1 year ago

Oh no I had to disable my default formatter (black) to work on this codebase, so I forgot to run the other checks... BRB

haudren-woven commented 1 year ago

I'm looking into how to run the tests, but locally I can't reproduce the CI errors... Despite running it inside a clean ros:humble docker container :thinking: Do you have any tips? colcon test reports success:

test/test_flake8.py . 

But I obviously haven't fixed all the errors :thinking:

haudren-woven commented 1 year ago

Ok got it! I need to install python3-flake8-import-order, which is somehow not a rosdep? Anyhow, that's great :)

haudren-woven commented 1 year ago

@methylDragon 32fa4cc should have fixed the CI errors :)

methylDragon commented 1 year ago

Ok got it! I need to install python3-flake8-import-order, which is somehow not a rosdep? Anyhow, that's great :)

:O

No it's there as a rosdep, maybe you have to update your rosdep

haudren-woven commented 1 year ago

I meant it doesn't seem to be a key inside the package.xml so rosdep install --from-paths src doesn't seem to pick it up :thinking: Not sure how it's installed in CI :thinking:

methylDragon commented 1 year ago

I meant it doesn't seem to be a key inside the package.xml so rosdep install --from-paths src doesn't seem to pick it up 🤔 Not sure how it's installed in CI 🤔

OH! I think you're meant to call ament_flake8 instead of running flake8 manually :>

methylDragon commented 1 year ago
methylDragon commented 1 year ago

Just three more lint errors!

haudren-woven commented 1 year ago

Oh no... I'm sorry I really need to learn how to run the linters locally...

haudren-woven commented 1 year ago

Oh no... I'm sorry I really need to learn how to run the linters locally...

haudren-woven commented 1 year ago

Should be better now :)

methylDragon commented 1 year ago

:crossed_fingers:

haudren-woven commented 1 year ago

Seems to have failed due to random connection dropping...

09:55:12 Caused: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@50a4fcb8:JNLP4-connect connection from ip-10-0-1-61.ec2.internal/10.0.1.61:46536": Remote call on JNLP4-connect connection from ip-10-0-1-61.ec2.internal/10.0.1.61:46536 failed. The channel is closing down or has closed down

methylDragon commented 1 year ago

Aarch: Build Status

I'm not sure why mypy is finding errors on windows: https://ci.ros2.org/job/ci_windows/18774/testReport/junit/launch.test/test_mypy/test_mypy/

haudren-woven commented 1 year ago

I see what's going on:

12:16:04 test\launch\utilities\test_signal_management.py:55:14: error: Module has no attribute "SIGUSR1"
12:16:04 test\launch\utilities\test_signal_management.py:56:22: error: Module has no attribute "SIGUSR2"

This means that these statements are protected by a guard that mypy does not recognize...

haudren-woven commented 1 year ago

@methylDragon : Seems to have passed on my machine using mypy --platform win32. Now time to :pray:

I also noticed that ament_mypy doesn't allow to pass arbitrary mypy command-line flags, I'm wondering if it's something we should add...

methylDragon commented 1 year ago

Windows: Build Status

haudren-woven commented 1 year ago

Hum two test failures on Windows, but they don't seem super related :thinking: I'll have a closer look but are any flaky tests known?

methylDragon commented 1 year ago

No harm in trying again

Build Status

methylDragon commented 1 year ago

Okay, I'm just gonna say that they're unrelated flaky tests that are failing, since its a different set of test failures now :grimacing:

Merging!! Thanks for the incredible work! :rocket: