The program is being run with an --sm-list whose samples are all present in the evec table (4789 samples). The VCF has slightly more samples (4815), and the run ends with a FATAL error. I was expecting the subsetting to happen first, followed by a check that the remaining VCF samples have PCs. This is easy enough to work around by subsetting beforehand with bcftools view -S gcad_samples_4789.txt, but it would be nice to be able to select the subset with ruth.
ruth --vcf adsp5k.manta.indels.norm.vcf.gz --evec adsp5k.evec --field PL --out adsp5k.manta.indels.norm.ruth.4789.vcf.gz --sm-list gcad_samples_4789.txt
Available Options
The following parameters are available. Ones with "[]" are in effect:
Input Options : --evec [adsp5k.evec],
--vcf [adsp5k.manta.indels.norm.vcf.gz],
--thin [1.00], --seed,
--num-pc [4], --field [PL],
--gt-error [5.0e-03],
--lambda [1.00]
Output Options : --out [adsp5k.manta.indels.norm.ruth.4789.vcf.gz],
--skip-if, --skip-info,
--site-only, --nelder-mead,
--lrt-test, --lrt-em
Samples to focus on : --sm-list [gcad_samples_4789.txt]
Parameters for sex chromosomes : --sex-map, --x-label [X],
--y-label [Y], --mt-label [MT],
--x-start [2699520],
--x-stop [154931044]
Options to specify when chunking is used : --ref, --unit [2147483647],
--interval, --region
Run with --help for more detailed help messages of each argument.
NOTICE [2019/11/05 21:34:32] - Analysis Started
NOTICE [2019/11/05 21:34:32] - Reading sample eigenvectors
NOTICE [2019/11/05 21:34:32] - Identifying sample columns to extract..
NOTICE [2019/11/05 21:34:32] - Reading in BCFs...
NOTICE [2019/11/05 21:34:32] - Finished identifying 4789 samples to load from VCF/BCF
FATAL ERROR -
[E:/share/pkg.7/ruth/git778d784/src/ruth/frequency_estimator.cpp:121 bool frequency_estimator::set_variant(bcf1_t*, int8_t*, int32_t*)] nsamples 4815 != 4789 in the EigenVector
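To make the expected order of operations concrete, here is a minimal Python sketch of "subset first, then check PCs". It is not ruth's actual code: the function name select_samples and the toy sample IDs are hypothetical, and the only facts taken from the report are that the --sm-list and the evec table cover the same samples while the VCF carries extra ones.

```python
def select_samples(vcf_samples, sm_list, evec_samples):
    """Hypothetical illustration: subset the VCF samples first, then validate PCs."""
    keep = set(sm_list)

    # Step 1: restrict the VCF columns to the requested samples,
    # preserving the order in which they appear in the VCF header.
    selected = [s for s in vcf_samples if s in keep]

    # Requested samples that the VCF does not contain at all.
    missing_from_vcf = sorted(keep - set(vcf_samples))
    if missing_from_vcf:
        raise ValueError(f"samples not in VCF: {missing_from_vcf}")

    # Step 2: only now require that every *selected* sample has a PC row.
    # Extra VCF samples outside the sm-list are never compared to the evec table.
    evec = set(evec_samples)
    missing_pcs = [s for s in selected if s not in evec]
    if missing_pcs:
        raise ValueError(f"samples without PCs: {missing_pcs}")

    return selected


if __name__ == "__main__":
    # Toy data: the VCF has two samples that are in neither the sm-list nor the evec table.
    vcf = ["S1", "S2", "S3", "S4", "S5", "S6"]
    wanted = ["S1", "S2", "S3", "S4"]
    pcs = ["S1", "S2", "S3", "S4"]
    print(select_samples(vcf, wanted, pcs))
```

With this ordering, the sample count compared against the eigenvector table is the size of the subset rather than the full VCF sample count, which is the behavior the report was expecting from --sm-list.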