Closed Fnyasimi closed 2 years ago
Yes, same order as in df_beta.
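Since the betas come in the same order as df_beta, pulling out the non-zero effects together with their variant information could look roughly like the sketch below. It uses toy data; the column names (rsid, chr, pos, a0, a1) and the vector form of best_grid_lassosum2 are assumptions made for illustration, so adapt them to your own objects.

```r
# Toy stand-ins (hypothetical): df_beta holds the matched variants and
# best_grid_lassosum2 holds one effect size per row of df_beta, same order.
df_beta <- data.frame(
  rsid = c("rs1", "rs2", "rs3", "rs4"),
  chr  = c(1L, 1L, 2L, 2L),
  pos  = c(1000L, 2000L, 1500L, 3000L),
  a0   = c("A", "C", "G", "T"),
  a1   = c("G", "T", "A", "C"),
  stringsAsFactors = FALSE
)
best_grid_lassosum2 <- c(0.02, 0, -0.01, 0)

# The row-wise correspondence only holds if the lengths agree.
stopifnot(length(best_grid_lassosum2) == nrow(df_beta))

# Keep variants with a non-zero effect (which() also drops any NA).
ind_nonzero <- which(best_grid_lassosum2 != 0)
nonzero_betas <- cbind(
  df_beta[ind_nonzero, ],
  beta_lassosum2 = best_grid_lassosum2[ind_nonzero]
)
print(nonzero_betas)
```

The resulting data frame can then be written out with write.table() for downstream tools.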
No, pseudovalidation has not been reimplemented since I have not found it to be robust enough (cf. preprint).
Thank you for the response.
I would like to use the precomputed LD reference and am not sure whether I should match the summary statistics to the LD reference, to the validation set, or to both. Do I use the approach described here when using the precomputed LD reference?
Yes, you should match both to the LD reference.
You can do a quick filter to keep only the variants that you also have in the validation/test set (in_test) for deriving the PGS, and then do the second snp_match() later.
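A minimal sketch of that quick filter, in base R and on toy data, is given here. The names df_beta (summary statistics already matched to the LD reference) and map_test (variant map of the test set) are assumptions, as is the choice to compare on chr and pos only; alleles are left to the later snp_match() call.

```r
# Hypothetical toy data: df_beta is already matched to the LD reference,
# map_test describes the variants available in the validation/test set.
df_beta <- data.frame(
  chr  = c(1L, 1L, 2L, 2L),
  pos  = c(100L, 200L, 100L, 300L),
  a0   = c("A", "C", "G", "T"),
  a1   = c("G", "T", "A", "C"),
  beta = c(0.01, -0.02, 0.03, 0.005),
  stringsAsFactors = FALSE
)
map_test <- data.frame(
  chr = c(1L, 2L, 2L),
  pos = c(200L, 100L, 300L),
  a0  = c("C", "G", "T"),
  a1  = c("T", "A", "C"),
  stringsAsFactors = FALSE
)

# Flag the variants of df_beta that are also present in the test set.
in_test <- paste(df_beta$chr, df_beta$pos) %in%
  paste(map_test$chr, map_test$pos)

# Derive the PGS from this subset only; the second snp_match()
# against the test set would then be run on the resulting effects.
df_beta_sub <- df_beta[in_test, ]
print(df_beta_sub)
```

Whatever indices are used to subset df_beta also need to be applied to the LD matrix, so that rows of the summary statistics and of the correlation matrix stay aligned.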
Matching has been an issue for me, and I am not sure what I am doing wrong. My GWAS summary statistics, LD reference and test set contain an overlapping set of SNPs, but when I try to get pred_grid I end up with NAs. I am not sure if this results from a mismatch in the _NUM_ID_ between the LD reference and the test set, or from a bug. I have also tried the approach explained in #318 but am still getting NAs. What could be the issue?
NAs in the effect sizes produced by lassosum2 and LDpred2-grid correspond to models that completely diverged.
But if you have NAs in the predictions you get with LDpred2-auto, it must be that you have some NAs in the genotype matrix of your test set.
Thanks for the feedback. I have done a few QC steps.
I imputed the genotypes using this function: G2 <- snp_fastImputeSimple(G, method = "mean2", ncores = nb_cores())
I checked the betas for lassosum2 and LDpred2-grid; they don't contain NAs, though some columns have only 0s.
When I run the prediction using the big_prodMat function, I end up with NAs in my predictions. But when I run the same analysis limiting the input data to chromosomes 1 and 2, I get results. I am not sure what happens when I scale it up to all chromosomes. Any ideas on things I could look out for?
Also, a quick question: what mapping do you use to convert the genomic coordinates from pos to cM?
Which version of {bigsnpr} are you using to get columns with only 0s?
Are you sure you're using G2 in big_prodMat()?
Look at the function snp_asGeneticPos().
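For intuition on the pos to cM conversion, one common approach is linear interpolation against a genetic map. The base R sketch below illustrates that idea only; the toy map is made up, and this is not a description of what snp_asGeneticPos() does internally.

```r
# Hypothetical genetic map for one chromosome: physical position (bp)
# and the corresponding genetic position (cM).
map_chr1 <- data.frame(
  pos = c(1000, 2000, 4000),
  cM  = c(0.0, 0.1, 0.5)
)

# Physical positions of the variants to convert.
pos <- c(1000, 1500, 3000, 4000)

# Linear interpolation between map points; rule = 2 reuses the nearest
# map value for positions falling outside the range of the map.
pos_cM <- approx(x = map_chr1$pos, y = map_chr1$cM, xout = pos, rule = 2)$y
print(pos_cM)
```

In a real analysis this would be done per chromosome, which the bigsnpr function handles from the chromosome and position vectors.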
I am using bigsnpr v1.9.11
Yes, I am using G2 in big_prodMat().
If you have NAs out, you must have NAs in; I don't see any other explanation.
You can check anyNA(beta) and counts <- big_counts(G2); sum(counts[4, ]).
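The "NAs in, NAs out" point can be reproduced on a small example. The sketch below uses plain base R matrices as stand-ins for the genotype matrix and the effect sizes (an assumption for illustration; the real objects would be an FBM and the output of the grid), and a simple column-mean fill rather than snp_fastImputeSimple().

```r
# Toy genotype matrix (3 individuals x 3 variants) with one missing value.
G <- matrix(c(0, 1,  2,
              1, NA, 0,
              2, 2,  1), nrow = 3, byrow = TRUE)

# Toy effect sizes for two models (3 variants x 2 models), no NA.
beta <- matrix(c(0.1, 0, -0.2,
                 0.3, 0.1, 0), ncol = 2)

# A single missing genotype makes that individual's scores missing,
# even when the matching effect size is 0 (NA * 0 is still NA).
pred <- G %*% beta
print(anyNA(beta))                   # checks the effect sizes
print(which(rowSums(is.na(G)) > 0))  # individuals with missing genotypes
print(anyNA(pred))                   # checks the predictions

# Stand-in imputation: replace each NA by the mean of its column.
col_means <- colMeans(G, na.rm = TRUE)
G2 <- G
na_idx <- which(is.na(G2), arr.ind = TRUE)
G2[na_idx] <- col_means[na_idx[, "col"]]

pred2 <- G2 %*% beta
print(anyNA(pred2))
```

If the predictions still contain NAs after imputation, it is worth confirming that the imputed object, and not the original one, is the one passed to the product.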
Any update on this?
No updates so far. I will get back to it later on and try to find the problem. Feel free to close the issue; if I get an update, I can comment on it. Thanks!
Hi @privefl, thank you for providing the lassosum2 reimplementation. I have been using lassosum to run my analyses and would like to try lassosum2 to check for improved performance.
I have looked at this tutorial and I would like to confirm:
Does best_grid_lassosum2 contain the betas for all the SNPs in df_beta, and in the same order? I would love to extract all the non-zero betas and their rsid or chr:pos and allele info for further use downstream.
Is there an equivalent of the pseudovalidate method reimplemented in lassosum2 for scenarios where we don't have the observed phenotype?