When we read from many different h5 datasets (not just X and y, but also transitions, coverage, spliced coverage, coverage score, someday phase...), data loading can become the bottleneck, particularly for smaller networks.
Check whether the H5 files can be restructured to reduce random reads, i.e. lay out data that lives in different datasets but shares an index so that everything relevant for that index is easy to fetch together. Then weigh the pros and cons and decide whether the change is worth the effort and repercussions.
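
One option worth evaluating, sketched here only as a starting point: keep the datasets separate but give all of them the same chunk size along the sample axis, so that a batch read touches exactly one chunk per dataset. The dataset names, shapes, batch size, and the helpers `write_aligned` and `read_batch` below are assumptions for illustration, not the current file layout or loader.

```python
import h5py
import numpy as np

# Hypothetical dataset names and per-sample shapes, for illustration only.
N_SAMPLES, SEQ_LEN, BATCH = 64, 100, 16
SHAPES = {
    "X": (SEQ_LEN, 4),
    "y": (SEQ_LEN, 4),
    "coverage": (SEQ_LEN,),
    "spliced_coverage": (SEQ_LEN,),
}


def write_aligned(h5, arrays, chunk_rows):
    """Write each array with chunks that span `chunk_rows` whole samples."""
    for name, arr in arrays.items():
        h5.create_dataset(name, data=arr, chunks=(chunk_rows,) + arr.shape[1:])


def read_batch(h5, names, start, stop):
    """Read the same contiguous index range from every named dataset."""
    return {name: h5[name][start:stop] for name in names}


# Random stand-in data; real files would hold the actual features and labels.
rng = np.random.default_rng(0)
arrays = {
    name: rng.random((N_SAMPLES,) + shape, dtype=np.float32)
    for name, shape in SHAPES.items()
}

# In-memory file (core driver, no backing store) so nothing is left on disk.
with h5py.File("aligned_sketch.h5", "w", driver="core", backing_store=False) as h5:
    write_aligned(h5, arrays, chunk_rows=BATCH)
    # Batch boundaries line up with chunk boundaries: one chunk per dataset.
    batch = read_batch(h5, SHAPES, start=BATCH, stop=2 * BATCH)
    chunk_shapes = {name: h5[name].chunks for name in SHAPES}
```

Whether this helps depends on how indices are sampled: it favors contiguous, chunk-aligned batches, so shuffling would have to happen at the batch or chunk level rather than per sample. That trade-off belongs in the pro/con list.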