Protein, DNA, and RNA. No relations.

- genic interaction (to anything)
- Protein to RNA relations (reason: more interesting biologically, and we don't have to deal with the problem of distinguishing Protein vs Genes)
- Protein to DNA elements (very specific cases... TODO)

Take genia.mod.tar.gz from resources/ and copy it to src/main/resources/gimli/resources/tools/gdep
Build with gradle assemble
Navigate to build/libs and run the tool with java -jar mthesis-ashish-*.jar with the following arguments:
-i path-to-input-file
-o path-to-output-file
-w writer-format - anndoc or json
-r reader-format - can be txt or iob2 (optional argument)
-s string - text to tag given from command line (optional argument)

If the reader format (-r) argument is not given, the program guesses the format from the input file's extension.
If the string (-s) argument is given, the program takes the text to tag from the command line.
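
The extension-based guess described above could look roughly like the sketch below. It is not the tool's actual code: the function name `guess_reader_format` and the mapping of `.txt` and `.iob2` to reader formats are assumptions, based only on the sample file names used in this README.

```python
from pathlib import Path
from typing import Optional

# Assumed mapping from file extension to reader format; the real tool may differ.
EXTENSION_TO_FORMAT = {".txt": "txt", ".iob2": "iob2"}


def guess_reader_format(input_path: str, reader_format: Optional[str] = None) -> str:
    """Return the explicit -r value if one was given, else guess from the extension."""
    if reader_format is not None:
        return reader_format
    suffix = Path(input_path).suffix.lower()
    if suffix not in EXTENSION_TO_FORMAT:
        raise ValueError(f"cannot guess reader format for {input_path!r}")
    return EXTENSION_TO_FORMAT[suffix]
```

In this sketch an unknown extension raises an error instead of silently picking a default; how the tool itself handles that case is not documented here.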
Input format: iob2, -r specified, writer format: anndoc
    java -jar build/libs/mthesis-ashish-*.jar -r iob2 -i sample/iob2/corpus1.iob2 -w anndoc -o sample/iob2/output1.ann.json

Input format: iob2, -r not specified, writer format: json
    java -jar build/libs/mthesis-ashish-*.jar -i sample/iob2/corpus1.iob2 -w json -o sample/iob2/output1.json

Input format: txt, -r specified, writer format: anndoc
    java -jar build/libs/mthesis-ashish-*.jar -i sample/txt/corpus2.txt -r txt -w anndoc -o sample/txt/output2.ann.json
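
For scripted runs, the invocations above can be assembled programmatically. The following is a minimal sketch that uses only the `-i`, `-o`, `-w`, and `-r` flags listed in this README; the helper name `build_command` and its parameters are hypothetical and not part of the tool.

```python
from typing import List, Optional


def build_command(jar: str, input_path: str, output_path: str,
                  writer_format: str, reader_format: Optional[str] = None) -> List[str]:
    """Build the argument list for one run of the tagger jar."""
    if writer_format not in ("anndoc", "json"):
        raise ValueError("writer format must be 'anndoc' or 'json'")
    cmd = ["java", "-jar", jar]
    # -r is optional; when omitted the tool guesses from the file extension.
    if reader_format is not None:
        cmd += ["-r", reader_format]
    cmd += ["-i", input_path, "-w", writer_format, "-o", output_path]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Because no shell is involved in that case, `jar` must be a concrete file path, not a `*` pattern.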
Eventually, the jar file will be called relna.
If the file is in txt format, it must follow the pattern below.
<identifier>
<title>
<abstract>
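
To make the txt layout concrete, here is a small sketch of a reader for it. It assumes that each file holds one document, that identifier and title each occupy a single line, and that all remaining non-blank lines belong to the abstract; these are assumptions, since the pattern above does not spell them out. The names `Document` and `parse_txt` are hypothetical.

```python
from typing import NamedTuple


class Document(NamedTuple):
    identifier: str
    title: str
    abstract: str


def parse_txt(text: str) -> Document:
    """Split a txt-format document into identifier, title, and abstract."""
    # Ignore blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 3:
        raise ValueError("expected identifier, title, and abstract lines")
    # Assumption: everything after the title line is part of the abstract.
    return Document(lines[0], lines[1], " ".join(lines[2:]))
```

For example, `parse_txt("12345\nA title\nAn abstract.\n")` yields a `Document` whose `identifier` is `"12345"`.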
If the file is in iob2 format, it must follow the pattern below.
<identifier>
<sentence begin>
.
.
<token>
.
.
<sentence end>
<sentence begin>
.
.
.
<sentence end>