Open Oufattole opened 2 months ago
Currently, a significant portion of the data processing logic resides in the collate function. To improve efficiency, I should move this logic into the workers' code by running the process functions (like process_triplet) once per sample in the get_item function, rather than sequentially over the whole batch in the collate function. This should distribute the workload more evenly across workers and potentially improve performance.
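To make the proposal concrete, here is a minimal, dependency-free sketch of the intended split. Everything in it is an assumption for illustration: `TripletDataset`, the `(time, code, value)` row layout, the `collate` signature, and the body of `process_triplet` are hypothetical stand-ins, not the repo's actual code. The point is only that per-sample work happens in `__getitem__`, and collate is reduced to padding.

```python
from typing import Dict, List, Sequence, Tuple

Row = Tuple[int, int, float]  # hypothetical (time, code, value) layout


def process_triplet(raw: Sequence[Row]) -> Dict[str, list]:
    """Hypothetical stand-in for the repo's process_triplet: rows -> columns."""
    times, codes, values = zip(*raw) if raw else ((), (), ())
    return {"time": list(times), "code": list(codes), "value": list(values)}


class TripletDataset:
    """Map-style dataset that does the per-sample processing itself."""

    def __init__(self, raw_samples: List[Sequence[Row]]):
        self.raw_samples = raw_samples

    def __len__(self) -> int:
        return len(self.raw_samples)

    def __getitem__(self, idx: int) -> Dict[str, list]:
        # The expensive per-sample step lives here, not in collate.
        return process_triplet(self.raw_samples[idx])


def collate(batch: List[Dict[str, list]], pad_value=0) -> Dict[str, list]:
    """Only pads already-processed samples to a common length and adds a mask."""
    max_len = max(len(item["code"]) for item in batch)
    out = {
        key: [item[key] + [pad_value] * (max_len - len(item[key])) for item in batch]
        for key in batch[0]
    }
    out["mask"] = [
        [True] * len(item["code"]) + [False] * (max_len - len(item["code"]))
        for item in batch
    ]
    return out
```

With this shape, `collate([ds[0], ds[1]])` only touches list padding, so any remaining cost in collate is easy to measure separately from the per-sample processing.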
The current implementation is also noticeably slow. I should investigate the root cause of this bottleneck and implement optimizations to improve overall speed. This can likely be improved by coordinating with the NRT repo.
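As a starting point for finding the root cause, something like the following could be wrapped around the collate function or a single get_item call. This is a rough sketch using only the standard library's `cProfile` and `pstats`; the helper name `profile_call` and its parameters are made up for this example.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top: int = 10, **kwargs):
    """Run fn once under cProfile; return (result, text report).

    The report lists the `top` entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()
```

For example, `_, report = profile_call(collate_fn, batch)` followed by `print(report)` would show whether the time goes into the process functions or elsewhere, where `collate_fn` and `batch` stand for whatever the repo actually uses.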
Should get input from @mmcdermott