Intelligent text-based search for Galaxy using machine learning
Supervisor: Björn Grüning (@bgruening) / Anup Kumar (@anuprulez)
For degree: Master (Project)
Status: Open
Keywords: Galaxy, Text-based search, Natural language processing, Machine learning, Search-engines
Global Biological/Research context
Galaxy is an open-source, web-based platform for biological data processing. It offers thousands of tools, which can be combined into data-processing pipelines (workflows). Galaxy stores a large collection of tools, workflows and datasets (both raw and processed). Finding relevant items quickly in such a large collection requires an efficient, intelligent search. To build such a search feature, a few ideas from natural language processing and machine learning can be explored, implemented and compared.
Objectives of the project
1. Understand how Galaxy works: what tools, workflows and datasets are.
2. Explore relevant literature on search frameworks, natural language processing and machine learning.
2.1. Apache Solr
2.2. Elasticsearch
2.3. ....
3. Build a program which shows relevant search results (tools/workflows/datasets) based on a user query.
4. Analyze the results.
5. Write a report.
6. Integrate the program into Galaxy (depending on the time taken to finish).
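As a starting point for the objective of building a program that returns relevant results for a user query, the following is a minimal sketch of one possible baseline: TF-IDF weighting with cosine similarity, written with the Python standard library only. It is not the method prescribed by the project. The item names and descriptions in `ITEMS` are hypothetical placeholders rather than real Galaxy metadata, and the function names (`tokenize`, `build_index`, `cosine`, `search`) are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus: item name -> short description.
# These entries are illustrative placeholders, not real Galaxy metadata.
ITEMS = {
    "bwa_mem": "map sequencing reads against a reference genome",
    "fastqc": "quality control reports for raw sequencing reads",
    "text_cut": "cut columns from a tabular text file",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(items):
    """Return (idf, vectors): IDF weights and one TF-IDF vector per item."""
    docs = {name: Counter(tokenize(desc)) for name, desc in items.items()}
    n_docs = len(docs)
    # Document frequency: in how many descriptions each term occurs.
    doc_freq = Counter(term for counts in docs.values() for term in counts)
    # Smoothed IDF, so terms found in every description keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in counts.items()}
        for name, counts in docs.items()
    }
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, idf, vectors, top_k=3):
    """Rank items by similarity to the query; items scoring zero are dropped."""
    counts = Counter(tokenize(query))
    # Query terms that never occur in the corpus are ignored.
    q_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scored = [(cosine(q_vec, vec), name) for name, vec in vectors.items()]
    scored = [(score, name) for score, name in scored if score > 0.0]
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, round(score, 3)) for score, name in scored[:top_k]]


if __name__ == "__main__":
    idf, vectors = build_index(ITEMS)
    print(search("quality of sequencing reads", idf, vectors))
```

With this toy corpus, the query "quality of sequencing reads" ranks `fastqc` above `bwa_mem` and omits `text_cut`, because the latter shares no terms with the query. A baseline of this kind could serve as a reference point when comparing against search frameworks such as Apache Solr or Elasticsearch, or against learned text representations.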
Prerequisites
Basic knowledge of:
Further reading and useful links