Update: Please check out our new biomedical QA system for the BioASQ challenge.
The Biomedical Question Answering (BioQA) Framework provides an effective open-source solution for automatically finding the optimal combination of components and their configurations (the configuration space exploration problem, or CSE problem) when building a biomedical question answering system (e.g., one that responds to a question from the TREC Genomics Track such as "What is the role of PrnP in mad cow disease?").
The BioQA framework is not just one particular QA system; it represents an infinite number of possible QA solutions by integrating various related toolkits, algorithms, knowledge bases, and other resources defined in a BioQA configuration space.
The framework employs the topic set and benchmarks from the question answering task of the TREC Genomics Track, as well as commonly used tools, resources, and algorithms cited by participants. A set of basic components has been selected and adapted to the CSE Framework implementation by writing wrapper code where necessary, and users can easily extend it to wrap other existing tools or newly developed algorithms. The configuration space, represented by the extended configuration descriptors (defined for the resulting set of configured components, e.g. default-sqlite-test.yaml, default-mysql-test.yaml, bioqa-test.yaml), can be explored automatically with the CSE Framework, yielding an optimal and generalizable configuration that can outperform published results of the given components for the same task.
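To make the idea of exploring a configuration space concrete, here is a minimal Python sketch that enumerates every combination of options and keeps the best-scoring one. It is a toy illustration only: the phase names, option values, and scoring table are all hypothetical, and the exhaustive loop is not the CSE Framework's actual exploration strategy (see the cited paper for that).

```python
from itertools import product

# Hypothetical configuration space: each key is a pipeline phase or parameter,
# each list holds the candidate options. None of these names come from BioQA.
CONFIG_SPACE = {
    "keyterm_extractor": ["tokenizer", "noun_phrase"],
    "retrieval_top_k": [10, 50],
    "passage_ranker": ["bm25", "proximity"],
}

# Made-up points per option, standing in for a real benchmark metric.
TOY_POINTS = {
    "tokenizer": 1, "noun_phrase": 2,
    10: 2, 50: 3,
    "bm25": 0, "proximity": 1,
}


def toy_score(config):
    """Stand-in for running one configured pipeline on a benchmark."""
    return sum(TOY_POINTS[option] for option in config.values())


def explore(space, evaluate):
    """Evaluate every combination of options and return the best one."""
    names = list(space)
    best_config, best_score = None, float("-inf")
    for options in product(*(space[name] for name in names)):
        config = dict(zip(names, options))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    best, score = explore(CONFIG_SPACE, toy_score)
    print(best, score)
```

Even this small space has 2 x 2 x 2 = 8 candidate pipelines, and the count grows multiplicatively with every added phase or option, which is why automating the search is worthwhile.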
GitHub home: https://github.com/oaqa/bioqa
Use it in your project: the artifact is publicly available in the OAQA Repository or the Central Repository.
<dependency>
<groupId>edu.cmu.lti.oaqa.bio.core</groupId>
<artifactId>bioqa</artifactId>
<version>1.0.0</version>
</dependency>
Cite it in your paper
@inproceedings{Yang:2013,
author = {Yang, Zi and Garduno, Elmer and Fang, Yan and Maiberg, Avner and McCormack, Collin and Nyberg, Eric},
title = {Building Optimal Information Systems Automatically: Configuration Space Exploration for Biomedical Information Systems},
booktitle = {Proceedings of the 22nd ACM international conference on Information and knowledge management},
series = {CIKM '13},
year = {2013},
location = {San Francisco, CA, USA},
numpages = {10},
url = {http://dx.doi.org/10.1145/2505515.2505692},
doi = {10.1145/2505515.2505692},
publisher = {ACM},
address = {New York, NY, USA},
}
Annotate each document with legalspan and sentence annotations, using the legalspans.txt file from the organizer and any sentence segmenter respectively, with UIMA. Serialize the annotated CAS corresponding to each document to an XMI file. ([...] legalspan and sentence.)
[...] BIOQA_HOME/data/. If you save it in a different location or change the username/password, you need to update src/main/resources/bioqa/persistence/local-sqlite-persistence-provider.yaml.
Replace INDRI_URL and INDRI_PORT with your actual Indri URL and port in src/main/resources/bioqa/retrieval/default-sqlite.yaml and src/main/resources/bioqa/ie/default-sqlite.yaml.
Specify the main yaml src/main/resources/bioqa/default-sqlite-test.yaml and execute: mvn exec:exec -Dconfig=bioqa.default-sqlite-test
Update src/main/resources/bioqa/persistence/local-mysql-persistence-provider.yaml with your own url, username and password.
Specify the main yaml src/main/resources/bioqa/default-mysql-test.yaml and execute: mvn exec:exec -Dconfig=bioqa.default-mysql-test
Replace XMI_DIR_PATH with the directory or URL prefix that contains the annotated XMI files (or gzipped XMI files). For example, file:/PATH/TO/YOUR/XMIGZ/DIRECTORY or http://URL:PORT/HTTP/SERVICE/URL/TO/PROVIDE/ACCESS/TO/REMOTE/FILES.
[...] src/main/resources/bioqa/bioqa-test.yaml similar to src/main/resources/bioqa/retrieval/default-sqlite.yaml if SQLite or other persistence media is being used.
Specify the main yaml and execute: mvn exec:exec -Dconfig=bioqa.bioqa-test
(See Section 6 of the CSE paper for more detailed component descriptions.)
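Several of the steps above ask you to replace placeholder tokens such as INDRI_URL, INDRI_PORT, or XMI_DIR_PATH in the YAML descriptors. Editing the files by hand works fine; if you prefer to script it, the following Python sketch shows one possible approach. It assumes the placeholders appear as literal whole-word tokens in the files, which you should verify first. The helper names, the sample keys (url, port), and the example values are hypothetical and not part of BioQA.

```python
import re
from pathlib import Path


def fill_placeholders(text, values):
    """Replace whole-word placeholder tokens in text with the given values."""
    if not values:
        return text
    # One combined pattern, applied in a single pass, so that a substituted
    # value is never rescanned. \b stops INDRI_URL from matching inside a
    # longer identifier such as MY_INDRI_URL.
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in values) + r")\b"
    )
    return pattern.sub(lambda match: str(values[match.group(1)]), text)


def fill_file(path, values):
    """Rewrite one descriptor file in place with placeholders filled in."""
    descriptor = Path(path)
    descriptor.write_text(fill_placeholders(descriptor.read_text(), values))


if __name__ == "__main__":
    # Hypothetical descriptor fragment; the real key names may differ.
    sample = "url: INDRI_URL\nport: INDRI_PORT\n"
    print(fill_placeholders(sample, {
        "INDRI_URL": "indri.example.org",  # example value only
        "INDRI_PORT": 8080,                # example value only
    }))
```

Keep a backup or rely on version control before calling fill_file, since it overwrites the descriptor.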
Update URL and PORT in src/main/resources/bioqa/async/cse-broker.yaml, src/main/resources/bioqa/collection/db-collection-reader-consumer.yaml and src/main/resources/bioqa/collection/db-collection-reader-provider.yaml.
[...] inputelements table of the database, which will be retrieved directly from the database while the program is being executed.
Update JDBC_CONNECTION_URL, USERNAME, and PASSWORD in both src/main/resources/bioqa/collection/db-collection-reader-consumer.yaml and src/main/resources/bioqa/collection/db-collection-reader-provider.yaml.
Please refer to the OAQA Tutorial to learn how to create your own framework.
Copyright 2013 Carnegie Mellon University
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
If you have any questions or suggestions, please feel free to create an issue, or contact me.