google / deepvariant

DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
BSD 3-Clause "New" or "Revised" License

Adding the non human model? #204

Closed by ghost 5 years ago

ghost commented 5 years ago

Hello,

I apologise if my question is naive; I am a beginner with neural networks. Do you plan to release models trained on non-human data, like the mosquito analysis published on your blog? And would it make sense to have a kind of "universal model"?

thank you

pichuan commented 5 years ago

Hi, thanks for your question! In the blog post you mentioned, we showed an example of further training a model to perform better on mosquito data, which has a higher variant density than human data.

Since then, internally we have continued to investigate what properties of the human genome and population structure DeepVariant learns during training. We’re hoping to come up with a suggested non-human model soon, but do not yet have a specific timeframe for releasing such a model.

As for a "universal" model, that is a good question too! We are currently investigating whether we can reduce the number of models we release while maintaining high accuracy. Ideally, if we can train one model that works well for all scenarios, we will certainly do that. For now, we’re optimizing model accuracy for each common application while keeping the number of released models as low as we can.