One large table with project_guids kept as is and project_stats migrated from an array of arrays to an array of structs, where each struct holds the summed ref_samples, het_samples, and hom_samples for one project.
One lookup table per project, with a family_guids global and a project_stats array of structs in which each index corresponds to one family.
Pipeline logic to update both the project lookup table and the large lookup table whenever a project is loaded.
Airflow logic to make sure that the large lookup table, but not the per-project lookup tables, is copied to SSD.
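A rough sketch of the proposed layout and update step for a single variant row, in plain Python rather than the pipeline's actual table code. It assumes the current inner arrays of project_stats hold one struct per family. `Stats`, `sum_families`, and `update_on_load` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Stats:
    """Sample counts for one variant; field names follow the description above."""
    ref_samples: int = 0
    het_samples: int = 0
    hom_samples: int = 0

    def __add__(self, other: "Stats") -> "Stats":
        return Stats(
            self.ref_samples + other.ref_samples,
            self.het_samples + other.het_samples,
            self.hom_samples + other.hom_samples,
        )


def sum_families(family_stats: List[Stats]) -> Stats:
    """Collapse one project's per-family structs into a single summed struct."""
    total = Stats()
    for stats in family_stats:
        total = total + stats
    return total


def update_on_load(
    project_guids: List[str],
    large_row: List[Stats],
    project_tables: Dict[str, List[Stats]],
    project_guid: str,
    family_stats: List[Stats],
) -> None:
    """Hypothetical update step: refresh both tables when a project is loaded.

    project_guids and large_row stand in for the large table's global and one
    row's project_stats; project_tables stands in for the per-project tables.
    """
    # The per-project table keeps family-level detail (index i <-> family i).
    project_tables[project_guid] = list(family_stats)

    # The large table keeps only the per-project sum, at the project's index.
    summed = sum_families(family_stats)
    if project_guid in project_guids:
        large_row[project_guids.index(project_guid)] = summed
    else:
        project_guids.append(project_guid)
        large_row.append(summed)


# Example: the large table already holds project "P1"; project "P2" is loaded
# with two families.
project_guids = ["P1"]
large_row = [Stats(5, 1, 0)]
project_tables: Dict[str, List[Stats]] = {}
update_on_load(
    project_guids, large_row, project_tables,
    "P2", [Stats(2, 1, 0), Stats(3, 0, 1)],
)
```

After the call, the large row holds one summed struct per project, while the family-level breakdown for "P2" lives only in its own per-project table. That split is what would let the large table alone be copied to SSD.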
Together, these changes re-architect the lookup table.