Closed: cannin closed this issue 1 year ago
Hi @cannin, I am Kartik Kumar Pawar, a CSE sophomore at BITS Pilani. I have about 6 years of experience with Python. I am also adept in Java, with knowledge of OOP and basic design patterns, and I have worked with both SQL and NoSQL database systems. I am familiar with JavaScript and have worked with React and Node.js as well. I am really excited to learn more about this project and contribute to it, with the aim of becoming a GSoC 2022 contributor. I kindly request your guidance so I can start as soon as possible.
Hi @cannin! I am interested in this project, having used PyTorch and TensorFlow extensively for my projects. I also have strong knowledge of Python, as I have done a research internship involving the use of Python for data analysis. I have made websites with Django as well, and know the basics of Java with good knowledge of OOP.
Should we directly submit a proposal using the Google Doc template or do we have to pass evaluation tests as well?
NRNB has officially been accepted as a mentoring organization for GSoC 2022! Here are some useful links:
@GunjanDhanuka @Kartikkp07 Apologies for the delay. If you are still interested, the next step is to work on a proposal. The proposal should be as detailed as possible: what data transformations will be needed, example code for using the transformer, plus the various points listed under the "Goal" section.
Sure! Thanks for the update, I will start working on it ASAP.
My name is Kaitlyn and I am a first-year undergraduate student majoring in Computer Science and Mathematics at the University of Georgia. I am very interested in joining NRNB for the summer and making contributions to the code. In terms of programming experience, I am familiar with Java, Python, and JavaScript. For bioinformatics data analysis, my skills/techniques include data cleaning, data visualization, NumPy, Pandas, scikit-learn, PyTorch, and TensorFlow. I have experience working in a research lab, where I analyze genome data in crop plants and build a classifier for single-cell genomes for better gene expression regulation using scikit-learn. I am interested in helping utilize the T5 Text-to-Text Transformer to generate descriptions of interactions from Pathway Commons, as I think my skill set would fit this perfectly. I am very excited about the applications of computer science in biology, and I am looking forward to being a part of your team for the summer. Please let me know if I can work with you!
A reminder that the application period opens on Monday April 4. Proposals to NRNB must be submitted on the official GSoC Site (https://summerofcode.withgoogle.com/) before April 19, 18:00 UTC to be considered, and contributors are encouraged to submit proposals in draft format early, so that mentors can give feedback directly at the GSoC site.
@khanhcodes and anyone else: if you are interested and have worked on a proposal for this project, I'm willing to give comments before the deadline. Create a Google Doc and send it via email (see the Potential Mentors section).
IMPORTANT REMINDER: GSoC 2022 is for new “beginners” to open source.
Applicants are expected to review eligibility requirements prior to applying. We cannot accept applications from contributors with prior open source development experience. From the GSoC FAQ (https://developers.google.com/open-source/gsoc/faq):
Can someone already participating in open source be a GSoC Contributor?
The goal of GSoC is to bring new contributors into open source organizations. GSoC can also help beginner contributors learn the ins and outs of open source while being mentored by experienced community members. GSoC is for new and beginner contributors to open source, it is not for experienced contributors to open source.
Closing in preparation for GSoC 2023.
Background
Data structured in databases is very useful for many applications, but text is a primary way that humans communicate on the internet. Being able to present query results as sentences can help humans understand rich datasets.
Data
Pathway Commons (http://pathwaycommons.org/) is an aggregated database of millions of molecular interactions. The data is aggregated from a collection of approximately 20 databases and is stored in the XML-based BioPAX (http://biopax.org/) format. Data from Pathway Commons is accessible via the Java Paxtools (https://biopax.github.io/Paxtools/) library.
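As a rough illustration of the kind of data transformation a proposal might describe, the sketch below reads interactions as simple source/relation/target triples. It assumes the interactions have already been exported as tab-separated rows in the style of the Simple Interaction Format (SIF); the `Interaction` type, the `parse_sif` helper, and the sample rows are hypothetical and made up for illustration, not part of Paxtools or of this project.

```python
from typing import List, NamedTuple


class Interaction(NamedTuple):
    """One binary interaction: source entity, relation type, target entity."""
    source: str
    relation: str
    target: str


def parse_sif(text: str) -> List[Interaction]:
    """Parse tab-separated 'source<TAB>relation<TAB>target' rows (hypothetical helper)."""
    interactions = []
    for line in text.splitlines():
        fields = line.strip().split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        interactions.append(Interaction(fields[0], fields[1], fields[2]))
    return interactions


# Illustrative sample rows (assumed, not taken from a real export).
sample = (
    "MDM2\tcontrols-state-change-of\tTP53\n"
    "TP53\tcontrols-expression-of\tCDKN1A\n"
)
interactions = parse_sif(sample)
for item in interactions:
    print(item.source, item.relation, item.target)
```

A flat triple representation like this is only one possible starting point; richer BioPAX details (complexes, modifications, cellular locations) would need additional fields.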
Algorithm
Neural network transformers such as T5 (https://huggingface.co/docs/transformers/model_doc/t5) are able to convert input text-based content into other forms, such as sentences. Challenges such as WebNLG (https://webnlg-challenge.loria.fr/) have sought to standardize input representations of text content for use with transformers for the generation of text.
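To make the idea of a standardized input representation concrete, here is a minimal sketch of linearizing triples into a single string that a text-to-text model could consume. The `<H>`/`<R>`/`<T>` markers and the task prefix are assumptions borrowed from conventions seen in some graph-to-text work on WebNLG; the `linearize` helper and the example triple are hypothetical and not prescribed by this project.

```python
from typing import Iterable, Tuple

# A triple is (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def linearize(
    triples: Iterable[Triple],
    prefix: str = "translate Graph to English: ",  # assumed task prefix
) -> str:
    """Flatten triples into one marked-up string (hypothetical helper)."""
    parts = []
    for head, relation, tail in triples:
        # Turn hyphenated relation names into plain words for the model.
        readable = relation.replace("-", " ")
        parts.append(f"<H> {head} <R> {readable} <T> {tail}")
    return prefix + " ".join(parts)


model_input = linearize([("MDM2", "controls-state-change-of", "TP53")])
print(model_input)
```

A string like this could then be tokenized and passed to a T5 model that has been fine-tuned on paired triples and sentences; a pretrained checkpoint without such fine-tuning would not be expected to produce meaningful descriptions from this format.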
Goal
The goal will be to generate code to:
Difficulty Level: Medium
Size and Length of Project
Size: 175 hours
Length: 12 weeks
Skills
Essential skills: Python, PyTorch
Nice to have skills: Java (basics)
Public Repository
Potential Mentors
Augustin Luna