There are a couple of things you can look at in the logs to see whether Parsl thinks it is running things concurrently:
i) In runinfo/NNN/parsl.log you should see lines like this:
```
1713422707.218805 2024-04-18 06:45:07 MainProcess-7339 MainThread-127293881239360 parsl.dataflow.dflow:743 launch_task INFO: Parsl task 0 try 0 launched on executor htex_local with executor id 1
```
and
```
1713422713.432368 2024-04-18 06:45:13 MainProcess-7339 HTEX-Queue-Management-Thread-127293158581952 parsl.dataflow.dflow:573 _complete_task INFO: Task 0 completed (launched -> exec_done)
```
You should see lots of tasks launched before they are completed (i.e. many tasks in the launched state at the same time). If your workflow code is structured so that only one task is ever in the launched state, with the next task not being launched until the previous one completes, then there is probably a concurrency problem in your workflow code. That's often because of using .result() inside a loop, which blocks the loop from submitting any more tasks.
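To illustrate that difference, here is a minimal sketch (my_app is a hypothetical app standing in for whatever the workflow submits; the thread-pool config is only there to make the example self-contained):

```python
import parsl
from parsl import python_app
from parsl.configs.local_threads import config  # any working config will do

parsl.load(config)

@python_app
def my_app(i):
    import time
    time.sleep(1)
    return i * i

# Anti-pattern: calling .result() inside the loop forces each task to
# finish before the next one is submitted, so nothing runs concurrently.
serial_results = [my_app(i).result() for i in range(8)]

# Better: submit every task first, then collect results afterwards,
# so Parsl can run as many tasks at once as there are workers.
futures = [my_app(i) for i in range(8)]
parallel_results = [f.result() for f in futures]
```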
ii) You can also check that there are as many workers as you expect - each worker can run one task at a time, so on my laptop, where I want 8x concurrency, I see this:
```
$ less runinfo/000/htex_local/block-0/881e5dba1686/
manager.log  worker_0.log  worker_1.log  worker_2.log  worker_3.log  worker_4.log  worker_5.log  worker_6.log  worker_7.log
```
and inside those logs I can see those workers are each processing tasks, like this:
```
$ head runinfo/000/htex_local/block-0/881e5dba1686/worker_3.log
2024-04-18 06:45:13.452 worker_log:673 7387 MainThread [INFO] Worker 3 started
2024-04-18 06:45:13.452 worker_log:675 7387 MainThread [DEBUG] Debug logging enabled
2024-04-18 06:45:13.566 worker_log:758 7387 MainThread [INFO] Received executor task 7
2024-04-18 06:45:13.571 worker_log:774 7387 MainThread [INFO] Completed executor task 7
2024-04-18 06:45:13.572 worker_log:785 7387 MainThread [INFO] All processing finished for executor task 7
2024-04-18 06:45:14.028 worker_log:758 7387 MainThread [INFO] Received executor task 15
2024-04-18 06:45:14.085 worker_log:774 7387 MainThread [INFO] Completed executor task 15
2024-04-18 06:45:14.085 worker_log:785 7387 MainThread [INFO] All processing finished for executor task 15
```
If you don't see as many workers as the concurrency you are expecting (e.g. as many workers as you have cores in your runtime environment, if that's what you want), then that is possibly a configuration problem, with one of the configuration options limiting the number of workers.
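For comparison, a local configuration aiming at 8 workers might look roughly like the sketch below - the label and numbers are placeholders rather than the exact settings from this issue, and newer Parsl releases spell max_workers as max_workers_per_node:

```python
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import LocalProvider

config = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_local",
            max_workers=8,      # one worker per core for 8x concurrency
            provider=LocalProvider(
                init_blocks=1,  # a single block on the local machine
                max_blocks=1,
            ),
        )
    ],
)
```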
Thanks so much @benclifford! This ended up being caused by using .result() within loops when instantiating the tasks. After moving away from this pattern I found that Parsl was able to parallelize as per the options from Config. Please feel free to close when it makes sense!
I've been experimenting with various Parsl configurations involving HighThroughputExecutor and LocalProvider executed on a laptop, and finding that I'm unable to achieve parallel output. When I observe tests I often look for multiple files being written at once from multiple processes to determine whether things are running in parallel. I'm finding that files are written sequentially, seemingly without any parallelism. Generally I've tried to follow the documentation found under https://parsl.readthedocs.io/en/stable/userguide/execution.html#configuration . I think I might be doing something wrong or misinterpreting the results and was hoping to gain more insights by opening this issue.

This is a link to a simple example which involves exporting data to Parquet from a SQLite file: Google Colab, along with a backup gist of the same.
This ran within Google Colab's free environment, and I've also tried running the same with similar results on macOS with an M1 Mac that includes 8 available cores. Am I configuring HighThroughputExecutor and LocalProvider appropriately to achieve parallel performance under these conditions? If not, is there something I can do to help increase the chances that parallel execution takes place?

Thanks for any guidance you can provide! Please don't hesitate to let me know if I may better clarify.
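For reference, the kind of check described above boils down to something like this sketch (the app, sleep duration, and output paths are simplified placeholders rather than the code from the linked notebook, and parsl.load(...) is assumed to have been called with the configuration under test):

```python
import parsl
from parsl import python_app

# Assumes parsl.load(...) has already been called with the
# HighThroughputExecutor / LocalProvider configuration being tested.

@python_app
def write_file(path, delay=2.0):
    # Imports live inside the app body so they are available on the worker.
    import time
    start = time.time()
    time.sleep(delay)
    with open(path, "w") as f:
        f.write(f"started={start} finished={time.time()}\n")
    return path

# Submit all tasks up front (no .result() inside the loop), then wait.
futures = [write_file(f"output_{i}.txt") for i in range(8)]
print([f.result() for f in futures])
# With working parallelism the recorded start times overlap; when execution
# is sequential, each start follows the previous task's finish.
```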