chrchang / plink-ng

A comprehensive update to the PLINK association analysis toolset. Beta testing of the first new version (1.90), focused on speed and memory efficiency improvements, is finishing up. Development is now focused on building out support for multiallelic, phased, and dosage data in PLINK 2.0.
https://www.cog-genomics.org/plink/2.0/

pgenlibr: multi-threading? #249

Open agilly-regn opened 1 year ago

agilly-regn commented 1 year ago

Hi,

Does pgenlibr use multithreading by default, the way plink2 does? We are comparing performance between pgenlibr and wrapped plink2 calls in R, and pgenlibr is slower. If pgenlibr is single-threaded by default, how can we enable multithreading?

Thanks!

chrchang commented 1 year ago

The C++ pgenlib code does not contain any multithreading of its own (unless you count the isolated .pvar loader); it only includes some low-level constructs which are practical for plink2 to build multithreading logic on top of. The pgenlibr R package does not currently contain this sort of multithreading logic.

It is possible to modify pgenlibr's large-matrix-filling functions to imitate what plink2 does, though that would take a significant amount of work.
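
For anyone wondering what that kind of logic looks like in general, here is a minimal sketch of the usual pattern: split the variant range into contiguous chunks, give each worker thread its own reader state, and have each worker fill a disjoint slice of the output. This is not plink2's or pgenlibr's actual code. `ToyReader`, `chunk_ranges`, and `fill_matrix` are hypothetical names made up for illustration, and the "file" is just an in-memory list. It is written in Python for readability, and Python threads will not speed up CPU-bound decoding, so it only shows the partitioning, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyReader:
    """Hypothetical stand-in for per-thread reader state; NOT the pgenlib API."""

    def __init__(self, genotypes):
        self._genotypes = genotypes  # one list of genotype values per variant

    def read(self, variant_idx):
        # Return a copy of the genotype values for a single variant.
        return list(self._genotypes[variant_idx])


def chunk_ranges(n_items, n_chunks):
    """Split range(n_items) into at most n_chunks contiguous (start, end) pairs."""
    base, extra = divmod(n_items, n_chunks)
    ranges, start = [], 0
    for i in range(n_chunks):
        end = start + base + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when n_items < n_chunks
            ranges.append((start, end))
        start = end
    return ranges


def fill_matrix(genotypes, n_threads=4):
    """Fill a variant-major matrix, one contiguous variant chunk per worker."""
    n_variants = len(genotypes)
    out = [None] * n_variants  # workers write to disjoint slots, so no locking

    def worker(bounds):
        reader = ToyReader(genotypes)  # each worker owns its own reader
        for v in range(*bounds):
            out[v] = reader.read(v)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(worker, chunk_ranges(n_variants, n_threads)))
    return out
```

The significant amount of work would be in doing the same thing safely in C++ inside the R package: per-thread reader state and buffers, and keeping all R API calls on the main thread.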