AI-sandbox / gnomix

A fast, scalable, and accurate local ancestry method.

Can pre-trained models be used on GRCh38 data? #43

Open mojoman666 opened 12 months ago

mojoman666 commented 12 months ago

I am interested in using GNomix for local ancestry inference on my phased WES data. I understand that the pre-trained models were trained on GRCh37 data, but my data is in GRCh38. Does GNomix use positional information for its predictions in a way that makes it dependent on the genome build? Do I need to retrain the model from scratch with a GRCh38 reference, or can I use the pre-trained models on my GRCh38 data out of the box? Any advice is highly appreciated.

dralhindi commented 4 months ago

I've used gnomix on hg38-aligned data before and have not had a problem. I constructed a reference panel and trained the models on it, then used those models for local ancestry inference on the query. I hope that helps!
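
For anyone following the reference-panel route described above, one quick sanity check before training is to measure how many query sites appear in the reference panel at the same coordinates; a very low fraction can be a sign that the two files are on different builds. The following is a minimal sketch, assuming both files are VCFs (plain or gzipped). `read_sites` and `site_overlap` are hypothetical helper names written for this illustration and are not part of gnomix.

```python
import gzip


def read_sites(vcf_path):
    """Return the set of (chrom, pos, ref, alt) sites in a VCF (plain or .gz)."""
    opener = gzip.open if vcf_path.endswith(".gz") else open
    sites = set()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t", 5)[:5]
            # Normalise "chr1" vs "1" so naming style alone does not hide a match.
            if chrom.startswith("chr"):
                chrom = chrom[3:]
            sites.add((chrom, int(pos), ref, alt))
    return sites


def site_overlap(query_sites, reference_sites):
    """Fraction of query sites found in the reference at the same coordinates."""
    if not query_sites:
        return 0.0
    return len(query_sites & reference_sites) / len(query_sites)
```

Usage would look like `site_overlap(read_sites("query.vcf.gz"), read_sites("reference.vcf.gz"))`, where both file names are placeholders. A high fraction does not prove the builds match, but a fraction near zero is worth investigating before running inference.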