nipype / pydra

Pydra Dataflow Engine
https://nipype.github.io/pydra/
Other
119 stars 57 forks source link

Fix DaskWorker: `AttributeError: 'tuple' object has no attribute '_run'` #673

Closed adi611 closed 10 months ago

adi611 commented 12 months ago

Types of changes

Summary

This is a fix to DaskWorker which previously failed on some tests with the following error: AttributeError: 'tuple' object has no attribute '_run'. (Refer discussion #665).

Checklist

codecov[bot] commented 12 months ago

Codecov Report

Patch coverage has no change and project coverage change: -0.11% :warning:

Comparison is base (29d3d1f) 82.88% compared to head (5955907) 82.77%.

:exclamation: Current head 5955907 differs from pull request most recent head f0fc433. Consider uploading reports for the commit f0fc433 to get more accurate results

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #673 +/- ## ========================================== - Coverage 82.88% 82.77% -0.11% ========================================== Files 22 22 Lines 4845 4849 +4 Branches 1391 0 -1391 ========================================== - Hits 4016 4014 -2 - Misses 825 835 +10 + Partials 4 0 -4 ``` | Flag | Coverage Δ | | |---|---|---| | unittests | `82.77% <0.00%> (-0.11%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nipype#carryforward-flags-in-the-pull-request-comment) to find out more. | [Files Changed](https://app.codecov.io/gh/nipype/pydra/pull/673?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nipype) | Coverage Δ | | |---|---|---| | [pydra/engine/workers.py](https://app.codecov.io/gh/nipype/pydra/pull/673?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nipype#diff-cHlkcmEvZW5naW5lL3dvcmtlcnMucHk=) | `19.11% <0.00%> (-0.17%)` | :arrow_down: | ... and [3 files with indirect coverage changes](https://app.codecov.io/gh/nipype/pydra/pull/673/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=nipype)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

djarecka commented 11 months ago

@adi611 - thank you for this change, but I understand that dask is still not tested in CI, so we should add it and fix the tests

adi611 commented 11 months ago

@djarecka - I tried running the tests after commenting the test test_wf_3nd_ndst_1 in test_workflow.py and no other test is either failing or getting stuck. So the only test which gets stuck/hangs is test_wf_3nd_ndst_1 in test_workflow.py.

djarecka commented 11 months ago

we still have to understand why this specific test is failing

adi611 commented 11 months ago

The tests test_duplicate_input_on_split_wf and test_inner_outer_wf_duplicate in test_workflow.py are failing after reaching the timeout for Python versions 3.9 and 3.10. They also fail for Python version 3.8, but as it is an older version, I am currently disregarding it. These tests are not Dask tests; instead, they run on the cf plugin. It seems to be a GA specific issue since the tests run successfully locally and on Google Colab. Please let me know if I should consider increasing the timeout limit for these tests or if there are any other possible solutions.

This also indicates that all the Dask tests pass for Python versions 3.9 and 3.10.

djarecka commented 11 months ago

for some reason i'm having problems with seeing reports from the tests right now..:( What do you mean this tests are not dask tests, I'm guessing that since it only happens with the "dask" run, some tests are stuck when running the dask plugin, am I right?

Do they have a reasonable running time on google collab?

adi611 commented 11 months ago

Sorry for the confusion. The tests test_duplicate_input_on_split_wf and test_inner_outer_wf_duplicate in test_workflow.py are the only failing tests for Python 3.9 and 3.10. These tests run on cf only, using plugin="cf". The tests which run on both dask and cf have plugin=plugin_dask_opt which is not the case here. So what I meant was the reason these tests are failing may not be a dask related one.

Moreover these tests are passing on Google Colab, as can be seen here: github gist for python 3.10, github gist for python 3.9.

djarecka commented 11 months ago

@adi611 - it looks like the change in timeout didn't help, and I actually check one of the GA report (for 3.9), and I can see that there is an issue with another test, with test_wf_3nd_st_1 that doesn't have timeout, but it fails (even with rerun). It also fails on my laptop...

Are you really able to run without any issues this test locally?

adi611 commented 11 months ago

@djarecka - The test test_wf_3nd_st_1 was not failing before increasing the timeout. In fact, for my fork of the repo the tests actually pass for Python 3.9, and while they do fail for Python 3.10 , it is only for the tests test_duplicate_input_on_split_wf and test_inner_outer_wf_duplicate. It can be seen here.

Even though the results appear to be inconsistent across platforms, running the test files individually does not result to any error in any platform. I have created a new GA workflow to verify the same. This will be helpful in understanding the cause of the problem.

djarecka commented 11 months ago

ok, thank you. I've approved the workflows, so should run now

ghisvail commented 10 months ago

Superseded by #686