josiahseaman / FluentDNA

FluentDNA allows you to browse sequence data of any size using a zooming visualization similar to Google Maps. You can use FluentDNA as a standalone program or as a Python module for your own bioinformatics projects.

Machine learning backend #67

Open photomedia opened 6 years ago

photomedia commented 6 years ago

A machine learning backend was mentioned in https://github.com/josiahseaman/FluentDNA/issues/63

This is something that I have thought about a lot. It is a very interesting direction to pursue with the software. In fact, I think we may want to put in a Watson & Crick style sentence somewhere in our submission, something like:

"It has not escaped our notice that the image files generated by our software immediately suggest a possibility of using machine learning to characterize the sequence data as images."

josiahseaman commented 6 years ago

Definitely, I have put a fair amount of thought into this as well. I have a sequence categorization prototype that could be used to expand an annotation.

DeepVariant (https://www.biorxiv.org/content/early/2018/03/20/092890) is a clear demonstration of how deep learning image classification can be applied to sequences. Of course, they're not exactly the same kind of images.
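To make the "sequence data as images" idea concrete, here is a minimal sketch of turning a nucleotide string into a pixel grid that an image classifier could consume. It does not use FluentDNA's API or its actual layout and palette: `sequence_to_pixels`, `BASE_COLORS`, and `UNKNOWN` are hypothetical names, and the colors and simple row-by-row layout are assumptions chosen only for illustration.

```python
# Hypothetical color map for illustration; NOT FluentDNA's actual palette.
BASE_COLORS = {
    "A": (255, 0, 0),
    "C": (0, 255, 0),
    "G": (0, 0, 255),
    "T": (255, 255, 0),
}
UNKNOWN = (128, 128, 128)  # used for N, other symbols, and padding


def sequence_to_pixels(sequence, width):
    """Lay a sequence out row by row as a grid of RGB tuples.

    The last row is padded with UNKNOWN so every row has `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = [BASE_COLORS.get(base, UNKNOWN) for base in sequence.upper()]
    remainder = len(pixels) % width
    if remainder:
        pixels.extend([UNKNOWN] * (width - remainder))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# Eight bases at width 4 give a 2 x 4 grid of RGB tuples.
grid = sequence_to_pixels("ACGTNACG", width=4)
```

A grid like this could be converted to an array and fed to any standard image classification model. Whether a plain row-by-row layout preserves enough structure for learning is an open question for this thread.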