korpling / ANNIS

ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with diverse types of annotation.
http://corpus-tools.org/annis/
Apache License 2.0

corpus export to graphml #834

Closed josefkr closed 1 year ago

josefkr commented 1 year ago

Is your feature request related to a problem? Please describe.

The user guide for the current version of ANNIS says:

""" ANNIS supports two types of formats to import: the legacy relANNIS format based on the previous relational database implementation, and a new GraphML based native format which can be exported from ANNIS and exchanged with other tools which support GraphML (like e.g. Neo4j). """

However, in the standalone version of ANNIS that I'm running I haven't been able to find such an export facility in the UI. I would have expected it to show up somewhere in the administration section as an operation on a corpus. But I only see an option to delete a corpus.

Describe the solution you'd like

I would like a pointer to where this functionality can be accessed, if it is implemented.

Describe alternatives you've considered

As far as I can see, Pepper doesn't offer an export to GraphML. If there were any other code that could convert data from the relANNIS format to GraphML (maybe via Salt), that would help me out.


MartinKl commented 1 year ago

Hi @josefkr,

I am not aware of any options to do that via the UI (which does not mean it's impossible), but you can run the ANNIS backend (graphANNIS) to do what you want.

I assume you are currently using ANNIS Desktop. What you need are the graphANNIS binaries. You then have two options:

1) run the graphANNIS CLI on an empty folder, reimport your relANNIS corpus, load it, and export it; OR

2) run the CLI directly on your current ANNIS database folder (make sure ANNIS Desktop and its background service are not running)

On Ubuntu (or similar), you'd do those two things the following way (from the directory where you run the binary):

1)

mkdir data
./annis data/ -c 'import RELANNIS_DATA.zip' -c 'corpus CORPUS_NAME' -c 'export GRAPHML_FILENAME.zip'

You can drop the path argument for the export command to get an unzipped GraphML file, but note that you might also get a directory for potential configurations, so I recommend going for a zip file. Also, an existing export file will be overwritten in any case.

2)

./annis ~/.annis/v4/ -c 'corpus CORPUS_NAME' -c 'export GRAPHML_FILENAME.zip'

With option 2 the CLI works directly on your database directory, which is writable, so if you want to experiment a little more, better go for option 1.
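
If you want to check what ended up in the exported zip, here is a minimal Python sketch (standard library only). It is not part of ANNIS or graphANNIS: the function name `summarize_graphml_zip` is made up for illustration, and it assumes only that the archive contains one or more files ending in `.graphml` that use the standard GraphML `node` and `edge` elements. Any other files in the zip, such as configuration, are skipped.

```python
import zipfile
import xml.etree.ElementTree as ET


def summarize_graphml_zip(zip_path):
    """Return {member_name: (node_count, edge_count)} for each .graphml file in the zip.

    Hypothetical helper for illustration; it only relies on the generic
    GraphML element names <node> and <edge>.
    """
    summary = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".graphml"):
                continue  # skip configuration files and directories
            nodes = edges = 0
            with archive.open(name) as handle:
                # Parse incrementally and count elements as they are closed.
                for _, elem in ET.iterparse(handle, events=("end",)):
                    # Strip an XML namespace prefix such as "{http://...}node".
                    local_name = elem.tag.rsplit("}", 1)[-1]
                    if local_name == "node":
                        nodes += 1
                    elif local_name == "edge":
                        edges += 1
                    elem.clear()  # drop the element's contents to keep memory low
            summary[name] = (nodes, edges)
    return summary
```

Calling `summarize_graphml_zip("GRAPHML_FILENAME.zip")` returns a dictionary that maps each GraphML file in the archive to its node and edge counts, which is a quick sanity check that the export is not empty.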