UUDigitalHumanitieslab / AnnCor-scripts

A place for all the AnnCor scripts
MIT License

Set up parsing process (ORAC-server) #14

Open oktaal opened 7 years ago

oktaal commented 7 years ago

Alpino parsing is quite a time-consuming operation; it would be nice if multiple sentences could be converted in parallel.
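A minimal sketch of what converting sentences in parallel could look like in Python. The function name `parse_in_parallel` and the `parse_sentence` callable are hypothetical placeholders, not existing AnnCor code; the callable stands in for whatever actually invokes Alpino. Threads are used on the assumption that each call mostly waits on an external Alpino process or server.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def parse_in_parallel(
    sentences: Iterable[str],
    parse_sentence: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Run parse_sentence on every sentence concurrently.

    parse_sentence is a hypothetical callable that sends one sentence
    to Alpino and returns the parse (for example as an XML string).
    Results are returned in the same order as the input sentences.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, even if the individual
        # calls finish out of order.
        return list(pool.map(parse_sentence, sentences))
```

For example, `parse_in_parallel(sentences, my_alpino_call, max_workers=6)` would keep up to six parses in flight at once, where `my_alpino_call` is again a placeholder.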

mhkuu commented 7 years ago

Not sure if this would help here, but I've noticed that running Alpino in server mode speeds up the parsing process a lot. See the GrETEL-upload application for an example of how to use that (in PHP).

Also, it might be worthwhile to check out the Alpino API repository; it also runs Alpino in server mode (six instances, actually).
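A rough Python sketch of talking to an Alpino instance in server mode. It assumes the server accepts one sentence per TCP connection, terminated by a blank line, and closes the connection once the parse has been sent; the function name `parse_with_server` and the default host and port are placeholders, so check them against the actual server configuration before relying on this.

```python
import socket


def parse_with_server(
    sentence: str,
    host: str = "localhost",  # placeholder: wherever Alpino is running
    port: int = 42424,        # placeholder: the configured server port
    timeout: float = 60.0,
) -> str:
    """Send one sentence to an Alpino server and return its reply.

    Assumed protocol: the sentence is sent followed by a blank line,
    and the server closes the connection after writing the parse.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(sentence.encode("utf-8") + b"\n\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                # An empty read means the server closed the connection.
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

Because every call opens its own connection, this can be combined with a thread pool when several Alpino instances are listening.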

oktaal commented 7 years ago

It's definitely a good idea to look into this first, before we reinvent the wheel!

oktaal commented 7 years ago

The idea is now to just parse all the files using the ORAC server and have them ready for review.

oktaal commented 7 years ago

Plan for now:

  1. Upload the CHAT files using GrETEL-upload. This will parse everything.
  2. Once this is done, download the results using https://github.com/UUDigitalHumanitieslab/GrETEL-upload/issues/5