hltfbk / Excitement-Transduction-Layer


Running BIUTEE #178

Closed sharonshabtai closed 10 years ago

sharonshabtai commented 10 years ago

Please guide me to the beauty of BIUTEE.

LiliKotlerman commented 10 years ago

OK, let's do it step by step :) One point to underline before we proceed: the TL is more or less agnostic to which LAP and EDA you use and how they were trained. So, to use the full power of the EOP, you'll need to learn how to train BIUTEE on your data and how to feed the TL with the models you get. Detailed instructions on BIUTEE are here: https://github.com/hltfbk/Excitement-Open-Platform/wiki/BIUTEE

Below are the steps that will help you get it running with a predefined configuration and models trained for that configuration.

  1. First, you need the EOP resources directory. Download it from http://hlt-services4.fbk.eu:8080/artifactory/repo/eu/excitementproject/eop-resources/1.0.2.tar/eop-resources-1.0.2.tar.gz and unpack it.
  2. Now you need to start the EasyFirst parser's server on your machine (you'll need to start it every time you want to use BIUTEE). Go to the EOP resources directory you just unpacked, eop-resources-1.0.2\BIUTEE_Environment\workdir, and run the runeasyfirst script (the bat file on Windows, the shell script otherwise).
  3. Next, you need to edit the BIUTEE configuration file. You can start from the one we have in the TL under .\src\test\resources\NICE_experiments\biutee_wp6.xml. In that file, change every path that refers to eop-resources-1.0.2/ so that it points at the directory you downloaded. You also need to indicate where EasyFirst is running (easyfirst_host and port); if it's on your local machine, no changes are needed. For the test run, use the model files provided with the TL, "search_model" and "predictions_model" (just update the path to each). They fit the transformations and resources listed in this configuration file, so there's no need to change anything else. Once you start training your own models, the model files must match the configuration they were trained for, and you should always use the model_search_1.xml and model_predictions_2.xml from the output of the training process.
  4. Now that the configuration is fixed and EasyFirst is running, we can try using BIUTEE from the TL. To do so, run the UseCaseOneDemo with the following parameters: (String with your BIUTEE configuration file name, String with your input folder (xmi files, as before), int file limit, String with your output folder, BIUFullLAP.class, BiuteeEDA.class).
  5. In order to run the TL with BIUTEE from Eclipse, you'll have to change the working directory of your run to the BIUTEE_Environment directory under the resources folder you downloaded in step 1 (eop-resources-1.0.2\BIUTEE_Environment). You can do it from Run configurations -> Arguments tab -> Working directory (at the bottom of that tab). If you're running from the command line, run from the BIUTEE_Environment folder.
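The path edits from step 3 can be scripted rather than done by hand. Below is a minimal Python sketch, not part of the EOP or the TL: the helper names `repoint_resources` and `repoint_config_file` are hypothetical, and the regular expression assumes each path in biutee_wp6.xml is a single token containing `eop-resources-1.0.2` followed by a slash or backslash. Check the result against your actual file before using it.

```python
import re
from pathlib import Path

# Matches a path prefix that ends in the resources directory name, for example
# "D:/data/eop-resources-1.0.2/" or "..\eop-resources-1.0.2\".
# ASSUMPTION: paths in the configuration file contain no whitespace, quotes,
# angle brackets or '=' before the directory name.
_RESOURCES_PREFIX = re.compile(r'[^\s"<>=]*eop-resources-1\.0\.2[/\\]')


def repoint_resources(config_text: str, resources_dir: str) -> str:
    """Replace every prefix up to 'eop-resources-1.0.2/' with resources_dir."""
    # Normalise to forward slashes and exactly one trailing slash.
    prefix = resources_dir.replace("\\", "/").rstrip("/") + "/"
    # A function replacement keeps the prefix literal (no escape processing).
    return _RESOURCES_PREFIX.sub(lambda _match: prefix, config_text)


def repoint_config_file(src: str, dst: str, resources_dir: str) -> None:
    """Read the configuration at src and write the re-pointed copy to dst."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(repoint_resources(text, resources_dir), encoding="utf-8")
```

Writing to a separate `dst` keeps the original configuration untouched, so you can diff the two files and confirm that only the resource paths changed. The easyfirst_host, port and model paths still need to be reviewed by hand.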

Note that as soon as you change your working directory, you'll need to fix the relative paths to the files passed in step 4.
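One way to avoid fixing relative paths by hand is to turn them into absolute paths before switching the working directory. A small Python sketch, assuming the paths were originally relative to the TL project directory (the function name `absolutize` is hypothetical and not part of the TL):

```python
from pathlib import Path


def absolutize(paths, original_dir):
    """Resolve relative paths against original_dir; keep absolute ones as they are."""
    base = Path(original_dir)
    result = []
    for raw in paths:
        p = Path(raw)
        if not p.is_absolute():
            # Interpret the path relative to the old working directory.
            p = (base / p).resolve()
        result.append(str(p))
    return result
```

The returned strings can then be used as the configuration file, input folder and output folder parameters regardless of where the run starts.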

  6. If you don't succeed and you're running on Windows, make sure that Cygwin is part of your PATH system variable. If this does not help, add \BIUTEE_Environment\workdir\tokenizer.exe to your PATH. On Unix, make sure the file BIUTEE/third-party/nagel_sentence_splitter/linux_64/tokenizer is in your system's path, for example by copying it to /usr/bin.
  7. If you still have problems, tell me and I'll try to help :) Note that we still have some unresolved problems at the fragment graph generation level, so for your test run use UseCaseOneDemo with the following parameters: (String with your BIUTEE configuration file name, /src/test/resources/WP2_public_data_CAS_XMI/nice_email_3, 15, String with your output folder, BIUFullLAP.class, BiuteeEDA.class). This is what I managed to run successfully.
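Before reporting a problem, it can help to rule out the two most common causes from the steps above: the tokenizer not being on the PATH and the EasyFirst server not running. A rough Python sketch of such a check follows. The function names are hypothetical, and the host and port must be the easyfirst_host and port values from your own configuration file (no default port is assumed here).

```python
import shutil
import socket


def tokenizer_on_path(name="tokenizer"):
    """Return True if an executable called `name` is found on the PATH."""
    # On Windows, shutil.which also tries the PATHEXT extensions such as .exe.
    return shutil.which(name) is not None


def server_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

A successful TCP connection only shows that something is listening on that port; it does not confirm that the listener is the EasyFirst parser.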

Good luck!

sharonshabtai commented 10 years ago

Thank you Lili :)