When we calculate multiple PGS, many of the scoring-file rows are redundant: the same variant often appears in several scores.
| EFO trait  | N scores | N variants | N unique (chr/pos/eff/oth) | Fraction unique |
|------------|----------|------------|----------------------------|-----------------|
| Autoimmune | 60       | 12,949,261 | 8,494,054                  | 0.66            |
| CVD        | 125      | 89,802,593 | 16,654,741                 | 0.19            |
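The fraction-unique figures above can be reproduced from a long-format scoring file by deduplicating on the matching key. A minimal sketch with a toy DataFrame (the column names `PGS`, `chr`, `pos`, `eff`, `oth` are assumptions about the scoring-file layout):

```python
import pandas as pd

# Toy scoring file: one row per (score, variant) pair.
# PGS1 and PGS2 share the variant at chr1:100, so it is redundant.
scores = pd.DataFrame({
    "PGS": ["PGS1", "PGS1", "PGS2", "PGS2"],
    "chr": [1, 1, 1, 2],
    "pos": [100, 200, 100, 300],
    "eff": ["A", "C", "A", "G"],
    "oth": ["G", "T", "G", "A"],
})

# Matching key: [chr, pos, eff, oth]
key = ["chr", "pos", "eff", "oth"]

# Fraction of rows that are unique variants (3 unique / 4 total here).
pct_unique = scores[key].drop_duplicates().shape[0] / len(scores)
```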
Does it speed things up if we:
- Subset to unique variants ([chr, pos, eff, oth]) for matching
- Rejoin the matches with the original scoring file
- Run variant labelling (which requires the scoring-file info)
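The subset-then-rejoin idea can be sketched with pandas merges. This is only an illustration under assumed column names; the real `match_variants` logic (allele flipping, ambiguity handling, etc.) is not shown:

```python
import pandas as pd

# Toy scoring file: two scores sharing one variant.
scoring = pd.DataFrame({
    "PGS": ["PGS1", "PGS2", "PGS2"],
    "chr": [1, 1, 2],
    "pos": [100, 100, 300],
    "eff": ["A", "A", "G"],
    "oth": ["G", "G", "A"],
    "weight": [0.1, 0.2, 0.3],
})

# Toy target genotype variants to match against.
target = pd.DataFrame({
    "chr": [1, 2],
    "pos": [100, 300],
    "eff": ["A", "G"],
    "oth": ["G", "A"],
    "ID": ["rs1", "rs2"],
})

key = ["chr", "pos", "eff", "oth"]

# 1. Match only the deduplicated variants (smaller join).
unique_variants = scoring[key].drop_duplicates()
matched = unique_variants.merge(target, on=key, how="inner")

# 2. Rejoin with the full scoring file to recover per-score rows
#    (weights etc.) for labelling.
rejoined = scoring.merge(matched, on=key, how="inner")
```

The expensive match runs over the ~0.19-0.66 unique fraction of rows, and the cheap rejoin fans the result back out to every score.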
Another way to do this may be to make a wide DF (e.g. each PGS is a column), do the matching, then the labelling, then split back to long format?
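The wide-DF variant of the idea might look like this pivot/melt round trip (again with hypothetical column names; `pivot_table` leaves a NaN where a score does not include a variant, which the final `dropna` removes on the way back to long format):

```python
import pandas as pd

# Toy long-format scoring file.
scoring = pd.DataFrame({
    "PGS": ["PGS1", "PGS2", "PGS2"],
    "chr": [1, 1, 2],
    "pos": [100, 100, 300],
    "eff": ["A", "A", "G"],
    "oth": ["G", "G", "A"],
    "weight": [0.1, 0.2, 0.3],
})
key = ["chr", "pos", "eff", "oth"]

# Pivot: one row per unique variant, one weight column per score.
wide = scoring.pivot_table(index=key, columns="PGS", values="weight").reset_index()

# ... matching and labelling would run once over `wide` here ...

# Split back to long format, dropping variants absent from each score.
long = (wide.melt(id_vars=key, var_name="PGS", value_name="weight")
            .dropna(subset=["weight"]))
```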
Running the matching in series (e.g. when genotyping data is split by chromosome) is slow and makes the pipeline's wall time very long. We should spawn parallel match_variants jobs and then aggregate the logs. [This will save memory and wall time; the implementation will partially live within pgsc_calc]