pyiron / pyiron_workflow

Graph-and-node based workflows
BSD 3-Clause "New" or "Revised" License
10 stars 1 forks source link

JNmpi patch main head #240

Closed liamhuber closed 2 months ago

review-notebook-app[bot] commented 3 months ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

github-actions[bot] commented 3 months ago

Binder :point_left: Launch a binder notebook on branch _pyiron/pyiron_workflow/JNmpi_patch_mainhead

codacy-production[bot] commented 3 months ago

Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation Diff coverage
Report missing for 9d338153c8f9cafac2a93f2ac01f61c6bc26a7bb[^1] :white_check_mark: 88.44%
Coverage variation details | | Coverable lines | Covered lines | Coverage | | ------------- | ------------- | ------------- | ------------- | | Common ancestor commit (9d338153c8f9cafac2a93f2ac01f61c6bc26a7bb) | Report Missing | Report Missing | Report Missing | | | Head commit (04536bee844f9a88fb51202a4d6455336a751e77) | 4367 | 2975 | 68.12% | **Coverage variation** is the difference between the coverage for the head and common ancestor commits of the pull request branch: ` - `
Diff coverage details | | Coverable lines | Covered lines | Diff coverage | | ------------- | ------------- | ------------- | ------------- | | Pull request (#240) | 1116 | 987 | **88.44%** | **Diff coverage** is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: `/ * 100%`

See your quality gate settings    Change summary preferences

You may notice some variations in coverage metrics with the latest Coverage engine update. For more details, visit the documentation [^1]: Codacy didn't receive coverage data for the commit, or there was an error processing the received data. Check your integration for errors and validate that your coverage setup is correct.

liamhuber commented 3 months ago

Node.iter is not playing nicely with Node.__post__. As far as I can tell, iter is shipping things off to an executor which is triggering the post call multiple times for the same node; if I understanding this is giving a race condition for the different threads that results in an error if one of them manages to call the final self.graph_root.tidy_working_directory() line of __post__ before they've all finished the earlier self.storage.has_contents check.

There might be some workaround, but I guess this is a fundamental incompatibility between the assumptions of the storage branch -- i.e., that each node is unique and has full sovereignty over its file and semantic path -- and iter -- which is re-instantiating new nodes with all the same kwargs and running them in parallel. Naturally the multiple copies fight with each other over the file path, since they all have the same name.

liamhuber commented 2 months ago

I'll introduce compatible for-loop and a dataclass node before trying to merge these.