kheyer / Genomic-ULMFiT

ULMFiT for Genomic Sequence Data

State of the art? #4

Open lucidrains opened 4 years ago

lucidrains commented 4 years ago

Hi! Thanks for your great work. I was wondering if this pre-trained network is state of the art or close? I am interested in doing some pre-training with biological sequences, and could use some guidance. Thank you!

djinnome commented 4 years ago

Hi @lucidrains, Genomic-ULMFiT is a good guide for pre-training with genomic sequences. If you are interested in guidance on pre-training with protein sequences, I would recommend "Unified rational protein engineering with sequence-based deep representation learning". You can download their pre-trained models here.

lucidrains commented 4 years ago

@djinnome Thanks for the response! I'm actually interested in both, and I'm aware of both UniRep and Berkeley's TAPE for protein sequences.