Open varchanaiyer opened 6 years ago
Hi @Dawny33 @rishikksh20 - can you please review this? Thanks!
@ArchanaIyer1996 The topic sounds very promising.
However, except for point-1, the others don't seem to have a link with cancer detection. Possible to elaborate how you are planning to link the topics with the central theme?
@Dawny33 Hi, thanks a lot. The first few slides will define what sort of data we are dealing with. In my case, I had a very large dataset (over 200 GB) of CpG sites and their methylation percentages. After that, I plan to explain how to tackle such a large dataset. The purpose of using machine learning for this problem will be covered in the first half of the talk.
You can look at slides here
@ArchanaIyer1996 The slides look quite comprehensive. ML in healthcare is very rarely talked about, so this talk would hold a lot of value. Good luck!
Thanks a lot! I really hope it makes an impact! @Dawny33
@MSanKeys963 what's the update on the next meetup?
Hi @ArchanaIyer1996 . Can you deliver this at our upcoming meetup?
Hi @MSanKeys963 I will be available in Delhi from the 28th of October to the 13th of November. Let me know if you have any meetups coming up, will try for sure to make it!
I'll be sure to notify you about upcoming meetups during that time.
Abstract (2-3 lines) There is a strong correlation between a person's DNA and the diseases they might have. This becomes especially important in the case of cancer, where traditional detection methods can take days and may not be accurate. In this talk, I will show how you can train a CNN to detect cancer using genetic data. Since DNA data can be huge, this talk will also cover how you can optimize your code to handle large datasets (>200 GB) so that data handling does not become a bottleneck during training.
Brief Description and Contents to be covered
Part 1 - What DNA data looks like. How DNA is related to cancer. How to obtain open-source DNA data.
Part 2 - How you go about starting an AI project: obtaining data, preprocessing data, training a model, improving the model, testing and inference.
Part 3 - How to make data preprocessing faster using numpy, dask and numba. How to deal with large datasets and how to store them in memory.
Part 4 - Demo of the techniques.
Part 5 - Implementing papers and creating your own models using TensorFlow.
Part 6 - Demo of the model and implementation.
Part 7 - Concluding with possible future work and resources to get started.
Part 8 - Q/A
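As a rough illustration of the Part 3 idea of working with data that does not fit in memory, here is a minimal sketch using only NumPy. It is not code from the talk or the paper: the file layout (rows as samples, columns as CpG sites, stored as raw float32), the function name `chunked_column_mean`, and the tiny synthetic file are all assumptions made for the example. It memory-maps a matrix on disk and computes per-column means one block of rows at a time, so only one block is held in RAM at once.

```python
import os
import tempfile

import numpy as np


def chunked_column_mean(path, n_rows, n_cols, chunk_rows=1024):
    """Per-column mean of a float32 matrix on disk, read one row block at a time."""
    # np.memmap maps the file into virtual memory without loading it all.
    data = np.memmap(path, dtype=np.float32, mode="r", shape=(n_rows, n_cols))
    total = np.zeros(n_cols, dtype=np.float64)
    for start in range(0, n_rows, chunk_rows):
        # Only this slice is materialised in RAM; accumulate in float64.
        block = np.asarray(data[start:start + chunk_rows], dtype=np.float64)
        total += block.sum(axis=0)
    del data  # release the mapping before the file is removed
    return total / n_rows


# Hypothetical stand-in for a methylation matrix (percentages in 0..100).
rng = np.random.default_rng(0)
n_rows, n_cols = 10_000, 8
values = rng.uniform(0, 100, size=(n_rows, n_cols)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "methylation.f32")
    values.tofile(path)  # raw row-major float32, matching the memmap shape
    means = chunked_column_mean(path, n_rows, n_cols, chunk_rows=1500)

print(means.shape)
```

The same block-wise pattern is what libraries such as dask automate across many chunks and cores; the explicit loop here is only meant to show the underlying idea.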
Pre-requisites for the talk: 1) Elementary knowledge of Deep Learning 2) Python and TensorFlow
The time required for the talk: The talk will take 30 minutes in total, including 5 minutes of Q/A at the end. The parts are as follows:
1) DNA Data and Cancer Introduction - 1 min
2) The process of working on an ML project - 3 mins
3) Data, Data Preprocessing and Dealing with large datasets - 10 mins
4) Model Architecture and Demo - 10 mins
5) Conclusion - 1 min
6) Q/A - 5 mins
Link to slides https://slides.com/archanaiyer
Will you be doing a hands-on demo as well? Yes, I will be showing a demo of tricks to deal with large datasets and a demo of the model.
Link to ipython notebook (if any) You can view my paper submitted to arXiv
About yourself I am a fresher from SRM Institute of Science and Technology. I understand that engineering is not everyone’s cup of tea and that everyone has a different perception of it. During my second year of study, I realized that, for me, education extended beyond books into practical applications. So I teamed up with a few college mates and started the Next Tech Lab, which works on cutting-edge innovation and novel research ideas.
A few of the achievements the lab helped me reach include winning first prize at the Smart India Hackathon 2017 under the Ministry of Steel for using machine learning to detect power theft in India. Recently I was invited to the WiPDA conference in Xi’an, China to present my work on GaN device modeling using machine learning, a collaboration with the University of Cambridge. I have around 3 IEEE Xplore papers and 1 Elsevier paper for my contributions to the electrical and machine learning fields.
As a lab, we have also worked to promote gender diversity, maintaining a 50:50 ratio among our 200 members. News 18 featured our accomplishments in a short video.
Over the past 6 months, I have had the opportunity to work and intern at Saama Technologies, where I research machine learning to accelerate clinical trials. Part of this work has shown me how necessary machine learning models are in various genomics fields.
You can view my Google Scholar citations here
You can view my blogs
You can view my various slides here
Are you comfortable if the talk is recorded and uploaded to PyData Delhi's YouTube channel? Yes
Any query?