SrinivasMushnoori / repex

An implementation of the RepEx package as an application written in the EnTK API
MIT License

Asynchronous exchange: Wait until replica_waiting_list is large enough? #26

Closed SrinivasMushnoori closed 5 years ago

SrinivasMushnoori commented 5 years ago

This line contains a comment that describes the issue.

Instead of waiting only for the exchange, the sleep call puts the entire program to sleep, so replica_waiting_list is never actually populated. This issue has been blocking progress for about 3 days.

@bennybp @doaa-altarawy Assistance is appreciated. Thanks. Happy to provide clarifications if needed.
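For illustration, here is a minimal sketch of the difference between sleeping and waiting, written with plain Python threads rather than the EnTK API. The class `ReplicaWaitingList`, its methods, and `run_demo` are hypothetical names made up for this example. The idea is that only the thread deciding on the exchange blocks on a condition variable, while the replica threads keep running and populate the list.

```python
import threading


class ReplicaWaitingList:
    """Hypothetical thread-safe list that a caller can wait on until it is large enough."""

    def __init__(self):
        self._replicas = []
        self._cond = threading.Condition()

    def add(self, replica_id):
        # Called by a replica once its MD phase has finished.
        with self._cond:
            self._replicas.append(replica_id)
            self._cond.notify_all()

    def wait_for(self, min_size, timeout=None):
        """Block only the calling thread until min_size replicas are waiting.

        Returns the first min_size replicas, or an empty list on timeout.
        """
        with self._cond:
            ready = self._cond.wait_for(
                lambda: len(self._replicas) >= min_size, timeout=timeout
            )
            if not ready:
                return []
            batch = self._replicas[:min_size]
            self._replicas = self._replicas[min_size:]
            return batch


def run_demo(n_replicas=4, exchange_size=2):
    waiting = ReplicaWaitingList()
    # Each thread stands in for one replica finishing its MD phase.
    workers = [
        threading.Thread(target=waiting.add, args=(i,)) for i in range(n_replicas)
    ]
    for worker in workers:
        worker.start()
    # The condition releases its lock while waiting, so replicas can still add themselves.
    batch = waiting.wait_for(exchange_size, timeout=5.0)
    for worker in workers:
        worker.join()
    return batch
```

Whether this pattern can be applied directly depends on how EnTK schedules its callbacks, which this sketch does not model.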

SrinivasMushnoori commented 5 years ago

@vivek-bala I think we might have discussed this way of triggering adaptivity in each replica pipeline before. Any comments?

SrinivasMushnoori commented 5 years ago

After discussing this with @vivek-bala, we have concluded that some changes may be needed in how the adaptivity is set up. This is currently being worked on.

SrinivasMushnoori commented 5 years ago

Update: Feature request in place. https://github.com/radical-cybertools/radical.entk/blob/feature/repex_async/examples/async_repex.py

@vivek-bala correct me if I'm wrong: we need a way to stall the pipeline without stalling the master function. The attempt being made here is a pipeline.rerun() method that does not "stall" the pipeline but stops it entirely while, in a sense, checkpointing it. That way, when the replica pipeline receives new tasks (either MD or exchange), it "restarts" but executes only the tasks it received after the checkpoint.

In other words, we are not stalling anything; we are just ending pipelines and restarting them from where they left off.
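As a rough sketch of the checkpoint-and-restart idea, assuming nothing about how EnTK implements it: the `CheckpointedPipeline` class below is a hypothetical toy written in plain Python, and its `run` and `rerun` methods are illustrative only, not the EnTK pipeline API. It keeps an index of the first task that has not yet run, so a rerun executes only tasks added after the previous run ended.

```python
class CheckpointedPipeline:
    """Toy pipeline (hypothetical): it ends when out of tasks and can be rerun later."""

    def __init__(self, name):
        self.name = name
        self.tasks = []      # every task ever assigned, in order
        self.checkpoint = 0  # index of the first task not yet executed
        self.log = []        # results of executed tasks, in execution order

    def add_tasks(self, *tasks):
        self.tasks.extend(tasks)

    def run(self):
        """Execute all pending tasks, then return; the pipeline ends rather than stalls."""
        pending = self.tasks[self.checkpoint:]
        for task in pending:
            self.log.append(task())
        self.checkpoint = len(self.tasks)
        return len(pending)

    def rerun(self):
        """Restart from the checkpoint; tasks executed earlier are not repeated."""
        return self.run()


def make_task(label):
    # Stand-in for an MD or exchange task; it just reports its own label.
    return lambda: label


def demo():
    pipeline = CheckpointedPipeline("replica_0")
    pipeline.add_tasks(make_task("md_0"))
    pipeline.run()  # executes md_0, then the pipeline ends
    pipeline.add_tasks(make_task("exchange_0"), make_task("md_1"))
    pipeline.rerun()  # executes only exchange_0 and md_1
    return pipeline.log
```

The master function never sleeps in this scheme: it calls `rerun()` only when it has new tasks to hand to a replica pipeline.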

SrinivasMushnoori commented 5 years ago

Code for the current async implementation: https://github.com/SrinivasMushnoori/repex/blob/devel/misc/experimental_async/experimental_async.py

SrinivasMushnoori commented 5 years ago

"Sliding Window" implemented here.

This seems to set up the replica pipelines fine, but for some reason it does not spawn the exchange task. Still digging into it. It may be caused by an EnTK issue.

SrinivasMushnoori commented 5 years ago

Closing this ticket. The answer to the original question is "yes." Implementation issues are being discussed in ticket #29.