Closed by dvg-p4 4 months ago
Same issue if the input is a plink2 fileset:
$ plink2 --bfile input --make-pgen --out p2_input
[...]
Writing p2_input.psam ... done.
Writing p2_input.pvar ... done.
Writing p2_input.pgen ... done.
End time: Thu Jul 18 21:06:36 2024
$ plink2 --pfile p2_input --keep ids.txt --make-pgen --out output
PLINK v2.00a5.12LM AVX2 Intel (25 Jun 2024) www.cog-genomics.org/plink/2.0/
(C) 2005-2024 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to output.log.
Options in effect:
--keep ids.txt
--make-pgen
--out output
--pfile p2_input
Start time: Thu Jul 18 21:07:04 2024
380297 MiB RAM detected, ~273639 available; reserving 190148 MiB for main
workspace.
Using up to 96 threads (change this with --threads).
10 samples (10 females, 0 males; 10 founders) loaded from p2_input.psam.
10 variants loaded from p2_input.pvar.
1 binary phenotype loaded (5 cases, 5 controls).
--keep: 0 samples remaining.
Error: No samples remaining after main filters.
End time: Thu Jul 18 21:07:04 2024
...it DOES work if there is no FID column in the main dataset input, though:
$ cp p2_input.pvar no_FID.pvar
$ cp p2_input.pgen no_FID.pgen
$ awk 'BEGIN {FS = "\t"; OFS = "\t"; printf "#"}; {print $2, $3, $4}' p2_input.psam > no_FID.psam
$ head -n3 no_FID.psam
#IID SEX PHENO1
per0 2 1
per1 2 2
$ plink2 --pfile no_FID --keep ids.txt --make-pgen --out output
[...]
--keep: 3 samples remaining.
3 samples (3 females, 0 males; 3 founders) remaining after main filters.
1 case and 2 controls remaining after main filters.
Writing output.psam ... done.
Writing output.pvar ... done.
Writing output.pgen ... done.
https://www.cog-genomics.org/plink/2.0/input#sample_id_convert
IID-only means FID is treated as 0.
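Given that, one possible workaround is to give --keep both FID and IID by looking up each IID in the .psam first. The sketch below is untested against plink2 itself. It assumes the .psam is tab-delimited with a single "#FID IID ..." header line, and that --keep accepts a two-column file with a "#FID IID" header. The FID values (fam0, fam1, fam2) and the name ids_with_fid.txt are made up for illustration.

```shell
# Toy inputs in a scratch directory; the FID values here are hypothetical.
workdir=$(mktemp -d)
printf '#FID\tIID\tSEX\tPHENO1\nfam0\tper0\t2\t1\nfam1\tper1\t2\t2\nfam2\tper2\t2\t1\n' > "$workdir/p2_input.psam"
printf 'per0\nper2\n' > "$workdir/ids.txt"

# First file (NR == FNR): remember the requested IIDs, skipping an optional #IID header.
# Second file: replace the .psam header with "#FID IID", then print FID and IID
# for every sample whose IID was requested.
awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR { if ($1 != "#IID") want[$1] = 1; next }
     FNR == 1  { print "#FID", "IID"; next }
     ($2 in want) { print $1, $2 }' \
    "$workdir/ids.txt" "$workdir/p2_input.psam" > "$workdir/ids_with_fid.txt"

cat "$workdir/ids_with_fid.txt"
```

The resulting file could then presumably be passed as `plink2 --pfile p2_input --keep ids_with_fid.txt --make-pgen --out output`. Note that if the same IID appears under several FIDs, all of those samples end up in the keep list.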
Conditions
With a main dataset that contains FID information, do a --keep ids.txt operation, where the ids.txt file contains a single column of individual IDs, optionally with a #IID header.
Expected behavior
The dataset will be filtered to only those IIDs, without regard to FID.
Observed behavior
No samples are matched, and plink2 errors out with "Error: No samples remaining after main filters."
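To tell in advance whether a given fileset falls under the conditions above, a quick check of the .psam header may help. This is only a sketch: has_fid_column is a hypothetical helper name, and it assumes the header is the first line of the file, as in the files shown in this thread.

```shell
# Hypothetical helper: succeeds if the first line of a .psam starts with "#FID".
has_fid_column() {
  case "$(head -n 1 "$1")" in
    '#FID'*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Toy files for illustration; the FID value fam0 is made up.
tmp=$(mktemp -d)
printf '#FID\tIID\tSEX\tPHENO1\nfam0\tper0\t2\t1\n' > "$tmp/with_fid.psam"
printf '#IID\tSEX\tPHENO1\nper0\t2\t1\n' > "$tmp/no_fid.psam"

for f in "$tmp/with_fid.psam" "$tmp/no_fid.psam"; do
  if has_fid_column "$f"; then
    echo "$f: has an FID column, so an IID-only --keep list may match nothing"
  else
    echo "$f: no FID column"
  fi
done
```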
Full reprex