acoli-repo / conll-rdf

Advanced graph rewriting and LLOD publication for CoNLL and other TSV formats

revise `compile.sh` #73

Closed chiarcos closed 2 years ago

chiarcos commented 2 years ago

We support compilation with Maven (default) and direct compilation (compile.sh, for compilation in build scripts). The latter got out of sync and was broken. It now resorts to Maven. However, this is a preliminary hack: it rebuilds everything on every call, the logging is verbose, and it doesn't seem to find the test directory.

TODO: Fix maven call ;)

For a sample call, run

$> git clone https://github.com/acoli-repo/conll-transform.git
$> cd conll-transform
$> ./transform.sh CoNLL-U CoNLL-12

If it doesn't find a conll-rdf directory, it clones conll-rdf and rebuilds it from scratch. Try that to see the logging output.

(See CoNLL-Transform Issue #5)

cfaeth commented 2 years ago

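We should consider removing support for legacy (non-Maven) compilation, specifically to also merge the fintan-support branch into master:

  • we can remove the lib folder (and do not need to provide a precompiled Fintan core lib there, which would need manual updates)
  • we can fully rely on Maven modules for a unified build script.
  • This would greatly simplify and stabilize the project.

Question: do we still need support for legacy, non-Maven javac compilation, support for heterogeneous Cygwin environments, etc.?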

chiarcos commented 2 years ago

  • removing lib/: +1
  • relying on maven: +1
  • removing compile.sh: -1 / instead: can we just make it call maven?

We need compile.sh for old workflows with run.sh (and its derivatives), as some of these include calls to compile.sh. These are mostly in-house pipelines, but there are many of them.

On Fri, 3 Dec 2021 at 16:07, cfaeth @.*** wrote:

We should consider to remove support for legacy (non-maven) compilation, specifically to also merge the fintan-support branch into master:

  • we can remove the lib folder (and do not need to provide a precompiled Fintan core lib there, which would need manual updates)
  • we can fully rely on Maven modules for a unified build script.
  • This would greatly simplify and stabilize the project.

Question: do we still need support for legacy, non-maven javac compilation, support for heterogeneous Cygwin environments etc?


leogott commented 2 years ago

Just for clarification: I will rewrite compile.sh to call maven (with certain flags like --batch-mode and --quiet, intended to make operation from inside a script reliable). Intended behavior is
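The intended behavior is not spelled out above, so the following is only a minimal sketch of a Maven-backed compile.sh using the two flags mentioned. It assumes a pom.xml in the conll-rdf root and build output in target/. The function name compile_conll_rdf and the MVN override variable are hypothetical, and skipping the build when a jar already exists is just one possible way to avoid rebuilding on every call, not something decided in this thread.

```shell
#!/bin/sh
# Sketch only: delegate compilation to Maven in a script-friendly way.
# Assumptions (not confirmed by the thread): pom.xml lives in the directory
# passed as $1, and Maven writes its jar(s) to $1/target/.

compile_conll_rdf() {
  # Skip the build if some jar already exists, so that repeated calls
  # (e.g. from run.sh) do not rebuild everything each time.
  if ls "$1"/target/*.jar > /dev/null 2>&1; then
    return 0
  fi
  # --batch-mode: non-interactive; --quiet: print errors only;
  # --file: point Maven at the project's POM. MVN may name another launcher.
  "${MVN:-mvn}" --batch-mode --quiet --file "$1/pom.xml" package
}

# Typical call from within compile.sh itself (left commented out here):
# compile_conll_rdf "$(dirname "$0")"
```

Keeping the Maven call in a function makes it easy for older pipelines to keep invoking compile.sh unchanged while the actual build logic stays in the POM.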

leogott commented 2 years ago

I tried to check out conll-transform and ran into a few problems. I tested it by checking out #76 and re-adding the lib folder manually (git checkout main -- lib/ might work) and then running java -Dfile.encoding=UTF8 -cp bin/:conll-rdf/lib/* org/acoli/conll/transform/Transformer -version 1 CoNLL-U CoNLL-12 conll-rdf/owl/conll.ttl. It seems to work?

~/conll-transform$ java -Dfile.encoding=UTF8 -cp bin/:conll-rdf/lib/* org/acoli/conll/transform/Transformer -version 1 CoNLL-U CoNLL-12 conll-rdf/owl/conll.ttl
synopsis: Transformer [-silent] [-help] [-version VERSION] SRC TGT OWL [BASEURI]
  -silent   suppress this message
  -help     show this message and quit
  -version  specify CoNLL-RDF version (1 or 2). Defaults to 1
  SRC       source format
  TGT       target format
  OWL       CoNLL-RDF ontology (or a replacement that defines one or more conll:Dialect objects
  BASEURI   base URI for the data being processed, defaults to #
generates CoNLL-RDF calls for reading and writing different dialects
TODO: reads one-token-per-line TSV ("CoNLL") data from stdin, transforms to output format according to the conll:Dialect mapping defined in OWL
NOTE that we generate Bash scripts at the moment, but that escaping (of SPARQL scripts) and paths (classpath, package) need to be adjusted in order to execute it
CoNLL-RDF JSON configs soon to come.
loading CoNLL-RDF ontology conll-rdf/owl/conll.ttl
log4j:WARN No appenders could be found for logger (Jena).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
building bash script
1. configure preprocessing
2. configure extraction
3. configure update
approximative specialization: FORM => WORD
generalization: XPOS => POS
warning: no mapping for target format property DOCUMENT_ID
warning: no mapping for target format property PART_NUMBER
warning: no mapping for target format property PARSE
warning: no mapping for target format property PRED_LEMMA
warning: no mapping for target format property PRED_FRAMESET
warning: no mapping for target format property WORD_SENSE
warning: no mapping for target format property SPEAKER
warning: no mapping for target format property NER
warning: no mapping for target format property arguments
warning: no mapping for target format property COREF
warning: no mapping for source format property LEMMA
warning: no mapping for source format property UPOS
warning: no mapping for source format property FEATS
warning: no mapping for source format property HEAD
warning: no mapping for source format property EDGE
warning: no mapping for source format property DEPS
warning: no mapping for source format property MISC
{6=bracketEncoding}
4. configure formatter
5. configure postprocessing
6. writing script

#!/bin/bash

TRANSFORM=$(dirname $0)
echo "warning: set TRANSFORM the CoNLL-Transform root directory" 1>&2

CONLL_RDF=$TRANSFORM/conll-rdf
echo "warning: set CONLL_RDF to your local CoNLL-RDF installation!" 1>&2
echo "or get it from https://github.com/acoli-repo/conll-rdf" 1>&2

$CONLL_RDF/run.sh CoNLLStreamExtractor "#" ID FORM LEMMA UPOS XPOS FEATS HEAD EDGE DEPS MISC | \
$CONLL_RDF/run.sh CoNLLRDFUpdater -custom -updates "PREFIX conll: <http://ufal.mff.cuni.cz/conll2009-st/task-description.html#> INSERT { ?a conll:WORD ?b } WHERE { ?a conll:FORM ?b}; INSERT { ?a conll:POS ?b } WHERE { ?a conll:XPOS ?b}; " | \
$CONLL_RDF/run.sh CoNLLRDFFormatter -conll DOCUMENT_ID PART_NUMBER ID WORD POS PARSE PRED_LEMMA PRED_FRAMESET WORD_SENSE SPEAKER NER arguments COREF