Dependencies for later tasks are tracked as a single `tensor_data_task`. We should differentiate between write hazards and read hazards, since tracking them as one item prevents two operations that only need to read tensor data from running simultaneously (e.g. computing hashes while also writing data to disk, when encryption isn't active). This is less harmful to performance when operations are queued in batch order, as they are now, rather than per-tensor, but whenever there are enough threads (or few enough tensors) to handle tasks from multiple stages at once, separate tracking would unblock the later stages sooner.
From @Eta0 in https://github.com/coreweave/tensorizer/pull/127#pullrequestreview-2133569874
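
As a rough illustration of the proposed split, here is a minimal sketch built only on `concurrent.futures`. The names `TensorHazards`, `submit_read`, `submit_write`, and `_run_after` are hypothetical and are not part of tensorizer; the sketch also assumes tasks are submitted in dependency order to a FIFO thread pool, so a task that blocks on its dependencies never waits on something queued behind it. Reads depend only on the last write, while a write depends on the last write and on every read issued since.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, List, Optional


def _run_after(deps: List[Future], fn: Callable[..., Any], *args: Any) -> Any:
    """Block until every dependency finishes, then run fn."""
    for dep in deps:
        dep.result()  # also re-raises any exception from the dependency
    return fn(*args)


class TensorHazards:
    """Hypothetical per-tensor record that separates read and write hazards."""

    def __init__(self) -> None:
        self.last_write: Optional[Future] = None
        self.reads_since_write: List[Future] = []

    def submit_read(self, pool: ThreadPoolExecutor,
                    fn: Callable[..., Any], *args: Any) -> Future:
        # A read only has to wait for the most recent write (read-after-write).
        deps = [self.last_write] if self.last_write is not None else []
        fut = pool.submit(_run_after, deps, fn, *args)
        self.reads_since_write.append(fut)
        return fut

    def submit_write(self, pool: ThreadPoolExecutor,
                     fn: Callable[..., Any], *args: Any) -> Future:
        # A write waits for the previous write and for all reads since then
        # (write-after-write and write-after-read).
        deps = list(self.reads_since_write)
        if self.last_write is not None:
            deps.append(self.last_write)
        fut = pool.submit(_run_after, deps, fn, *args)
        self.last_write = fut
        self.reads_since_write = []
        return fut


def demo() -> List[str]:
    """Run one write, two concurrent reads, then another write."""
    log: List[str] = []
    # The barrier only releases if both readers are running at the same time.
    both_readers_running = threading.Barrier(2)

    def writer(name: str) -> None:
        log.append(name)

    def reader(name: str) -> None:
        both_readers_running.wait(timeout=5)
        log.append(name)

    hazards = TensorHazards()
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            hazards.submit_write(pool, writer, "fill"),
            hazards.submit_read(pool, reader, "hash"),
            hazards.submit_read(pool, reader, "dump"),
            hazards.submit_write(pool, writer, "release"),
        ]
        for fut in futures:
            fut.result()  # surface any failure, including a barrier timeout
    return log


if __name__ == "__main__":
    print(demo())
```

In the demo, the two readers stand in for hashing and writing to disk: both pass the barrier only because neither waits on the other, and the final write still runs after both. With a single combined dependency, the second reader would wait on the first and the barrier would time out.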