cooperative-computing-lab / cctools

The Cooperative Computing Tools (cctools) enable large scale distributed computations to harness hundreds to thousands of machines from clusters, clouds, and grids.
http://ccl.cse.nd.edu

vine_plot_txn_log: incorrect cachetime for tasks #3165

Open David-Simonetti-ND opened 1 year ago

David-Simonetti-ND commented 1 year ago

When using vine_plot_txn_log, all tasks/duties started on one worker share the same input transfer time. I discussed this with Colin last week, and the behavior was intended for workflows in which each worker is given only one task. In my case, only the duties have input transfers, but every task shows up as having one.

btovar commented 1 year ago

@David-Simonetti-ND, even functions have input transfers, right? (E.g. the file that encodes the arguments to the function.)

David-Simonetti-ND commented 1 year ago

They have input transfers but no cache updates, since the file used to send the arguments isn't put in the cache. I think the plotting tool only looks at cache updates to plot the input transfers.

dthain commented 1 year ago

Hmm, I think we have several things that need to be described here:

1 - Input transfers from the manager, which may or may not be cached. Those that are not cached still go into the cache directory, but are deleted by the manager when the task is done.
2 - Transfers performed asynchronously by the worker.
3 - Mini-tasks performed asynchronously by the worker, which might also be transfers.
4 - Output files created by completed tasks.

Now, 1 does not result in a cache-update message, because the manager knows exactly what's going on. But 2-4 do result in a cache-update message, because the manager needs to be told when the worker has done something independently.

Perhaps the problem here is that we are assuming a direct relationship between cache-update messages and transaction log entries. Instead, we could have different log entries for manager-worker transfers, worker-worker transfers, etc.
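To make the four cases above concrete, here is a minimal sketch of how they map onto cache-update messages. The enum and function names are hypothetical and exist only for illustration; they are not part of cctools or the transaction log format.

```python
from enum import Enum


class TransferKind(Enum):
    """Hypothetical labels for the four cases listed above."""
    MANAGER_INPUT = 1   # case 1: input sent directly by the manager
    WORKER_ASYNC = 2    # case 2: transfer performed asynchronously by the worker
    MINI_TASK = 3       # case 3: mini-task run asynchronously by the worker
    TASK_OUTPUT = 4     # case 4: output file created by a completed task


def sends_cache_update(kind):
    """Only case 1 is silent: the manager already knows what it sent.

    Cases 2-4 happen independently on the worker, so the worker must
    report them back with a cache-update message.
    """
    return kind is not TransferKind.MANAGER_INPUT


for kind in TransferKind:
    print(kind.name, sends_cache_update(kind))
```

A plotting tool that relies only on cache updates would therefore never see case 1, which is why a separate log entry per transfer kind might help.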

btovar commented 1 year ago

The log does not relate cache updates to tasks. A cache update appears in the log as a worker-manager transaction, and the updates to the plotting tool in #3175 graph it as such. With #3176, we should be able to plot task inputs per task only when bytes_sent > 0.
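As a rough illustration of the bytes_sent > 0 rule, the sketch below filters a list of task records before plotting. The dictionary layout and the task_id field are assumptions made for this example; they do not reflect the actual transaction log format or the code in #3176.

```python
def tasks_with_input_transfers(tasks):
    """Keep only the tasks that actually sent input bytes.

    Each task is a hypothetical dict; a missing bytes_sent is treated as 0.
    """
    return [t for t in tasks if t.get("bytes_sent", 0) > 0]


# Hypothetical records: task 2 had no input transfer.
tasks = [
    {"task_id": 1, "bytes_sent": 4096},
    {"task_id": 2, "bytes_sent": 0},
    {"task_id": 3, "bytes_sent": 128},
]

plotted = tasks_with_input_transfers(tasks)
print([t["task_id"] for t in plotted])  # prints [1, 3]
```

With a filter like this, tasks without their own input transfer would no longer inherit a transfer bar from another task on the same worker.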