Closed Jingning-Zhang closed 1 year ago
Hi @Jingning-Zhang, thank you for reaching out, and apologies for the long delay in responding. Unfortunately, I wasn't able to reproduce your error with our test dataset.
If you are still having trouble with this and would like assistance, I have some follow-up questions.
Additionally, we have updated that function since the blog post went out, and this issue helped us notice that the current blog post code will not work with the updates. That doesn't look like the cause here, though, because the line indicated in your error:
num_out_cols = min([len(x) for x in pop_pc_pd["pca_scores"].values.tolist()])
no longer exists in the new code, so your error message would be different. I have put in a PR to modify the new code so the blog post example can work.
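For illustration only, here is a minimal sketch of how a line like the one above can raise "object of type 'NoneType' has no len()". It assumes that at least one sample has a missing (None) pca_scores value, which is a guess and not something confirmed for your dataset. The data and both helper functions are hypothetical and are not part of gnomad_methods.

```python
# Hypothetical data: each entry stands in for one sample's "pca_scores" value.
# The None entry is an assumed cause of the error, not a confirmed one.
pca_scores = [[0.1, 0.2, 0.3], None, [0.4, 0.5, 0.6]]


def min_score_length(scores):
    """Mimic the old line: smallest number of PC scores across samples."""
    return min([len(x) for x in scores])


def missing_score_rows(scores):
    """Return the positions of samples whose scores are missing (None)."""
    return [i for i, x in enumerate(scores) if x is None]


message = None
try:
    min_score_length(pca_scores)
except TypeError as err:
    # len(None) is what raises the TypeError seen in the report.
    message = str(err)
    print(message)

# Listing the offending rows first makes the failure easier to diagnose.
print(missing_score_rows(pca_scores))
```

If a check like this finds missing scores in your table, filtering those samples out before classification would be a reasonable first thing to try.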
If you try this again with the newest version of gnomad_methods (v0.6.3), you will need to modify the code in the following way:
This:
htt, rf_model = assign_population_pcs(
ht,
pc_cols=ht.scores,
fit=fit,
)
should change to:
htt, rf_model = assign_population_pcs(
ht,
pc_cols=range(num_pcs),
fit=fit,
)
This should be fixed in the next release, so that the original blog post code works as expected.
Thanks, Julia
The newest release, v0.6.4, has fixed the assign_population_pcs function so that it still works with the code in the blog post. Closing the issue since I think this is now fixed, but please reach out if it's still a problem.
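To decide whether the workaround above is still needed, you can compare the installed version against v0.6.4. This is a rough sketch: parse_version and has_fix are hypothetical helpers, and the parser assumes plain dotted numeric versions such as "0.6.3" or "v0.6.4".

```python
def parse_version(version):
    """Turn a string like 'v0.6.4' into a comparable tuple such as (0, 6, 4).

    Assumes plain dotted numeric versions; suffixes like 'rc1' are not handled.
    """
    return tuple(int(part) for part in version.lstrip("v").split("."))


def has_fix(installed, fixed="0.6.4"):
    """Return True if the installed version is at or past the fixed release."""
    return parse_version(installed) >= parse_version(fixed)


print(has_fix("0.6.3"))   # older release: the pc_cols workaround is needed
print(has_fix("v0.6.4"))  # release with the fix
```

The installed version string can usually be read with importlib.metadata.version("gnomad"), assuming the package was installed under the distribution name gnomad.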
What you did:
We want to use the ancestry classification tool to predict genetic ancestry of UKBB samples. (https://gnomad.broadinstitute.org/news/2021-09-using-the-gnomad-ancestry-principal-components-analysis-loadings-and-random-forest-classifier-on-your-dataset/)
What happened:
I get the error "TypeError: object of type 'NoneType' has no len()" at the final prediction step.
Below I attach all my code and its output.