tilde-nlp / c-eval

Parallel corpora cleaning and evaluation tool
http://www.eamt2015.org/files/downloads/EAMT2015_Proceedings.pdf#page=200

tool not working #1

Open Gldkslfmsd opened 6 years ago

Gldkslfmsd commented 6 years ago

Hello,

Is this tool still working? I would like to review it, but I am running into the following problems:

$ pypy cleaner.py -train -s en -t pl -a fastalign -c reptree -m cleaner.model
usage: cleaner.py [-h] [-train TRAIN] [-a {giza,fastalign}] [-x ALIGNER_ARGS]
                  [-s SOURCE_FILE] [-t TARGET_FILE] [-st SOURCE_TARGET_FILE]
                  [-ts TARGET_SOURCE_FILE] [-m MODEL] [-f FEATURES_FILE]
                  [-o OUTPUT_FILE] [-p PRECISION] [-n LINES] [-d DELIMITER]
                  [-c CLASSIFIER] [-e EVALUATION] [-l LOG_FILE] [-k]
                  [--giza-keep-output [GIZA_KEEP_OUTPUT]]
                  [--giza-keep-cfg [GIZA_KEEP_CFG]]
                  [--fastalign-keep-input [FASTALIGN_KEEP_INPUT]]
                  [--fastalign-keep-table [FASTALIGN_KEEP_TABLE]]
                  [-fas FAST_ALIGN_TABLE_SOURCE]
                  [-fat FAST_ALIGN_TABLE_TARGET]
cleaner.py: error: argument -train: expected one argument
$ pypy cleaner.py -train True -s en -t pl -a fastalign -c reptree -m cleaner.model
Running pypy /lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py -a fastalign -s en -t pl -st cleaner-train.src-trg.good.alignments-en --giza-keep-output --fastalign-keep-table
/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/linux/fastalign/x64/fast_align -d -o -v -c cleaner-train.src-trg.good.alignments-en.fastalign.table -i cleaner-train.src-trg.good.alignments-en.combined

Traceback (most recent call last):
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 205, in <module>
    keep_table=args.fastalign_keep_table)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 45, in fastalign
    run(path('fastalign', 'fast_align'), args, fout, log_file, log_file)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 10, in run
    proc = subprocess.Popen([cmd] + args, stdout=fout, stderr=ferr)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pypy2-v6.0.0-linux64/lib-python/2.7/subprocess.py", line 405, in __init__
    errread, errwrite)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pypy2-v6.0.0-linux64/lib-python/2.7/subprocess.py", line 1053, in _execute_child
    raise child_exception
OSError: [Errno 13] Permission denied
Running pypy /lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py -a fastalign -s en -t pl -ts cleaner-train.trg-src.good.alignments-pl --giza-keep-output --fastalign-keep-table
/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/linux/fastalign/x64/fast_align -d -o -v -c cleaner-train.trg-src.good.alignments-pl.fastalign.table -i cleaner-train.trg-src.good.alignments-pl.combined

Traceback (most recent call last):
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 210, in <module>
    keep_table=args.fastalign_keep_table)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 45, in fastalign
    run(path('fastalign', 'fast_align'), args, fout, log_file, log_file)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 10, in run
    proc = subprocess.Popen([cmd] + args, stdout=fout, stderr=ferr)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pypy2-v6.0.0-linux64/lib-python/2.7/subprocess.py", line 405, in __init__
    errread, errwrite)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pypy2-v6.0.0-linux64/lib-python/2.7/subprocess.py", line 1053, in _execute_child
    raise child_exception
OSError: [Errno 13] Permission denied
Running pypy /lnet/spec/work/people/machacek/neural-interlingua/c-eval/shuffle.py -i pl -o cleaner-train.trg.bad-pl
Running pypy /lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py -a fastalign -s en -t cleaner-train.trg.bad-pl -st cleaner-train.src-trg.bad.alignments-en --giza-keep-output --fastalign-keep-table
/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/linux/fastalign/x64/fast_align -d -o -v -c cleaner-train.src-trg.bad.alignments-en.fastalign.table -i cleaner-train.src-trg.bad.alignments-en.combined

Traceback (most recent call last):
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 205, in <module>
    keep_table=args.fastalign_keep_table)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 45, in fastalign
    run(path('fastalign', 'fast_align'), args, fout, log_file, log_file)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 10, in run
    proc = subprocess.Popen([cmd] + args, stdout=fout, stderr=ferr)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pypy2-v6.0.0-linux64/lib-python/2.7/subprocess.py", line 405, in __init__
    errread, errwrite)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pypy2-v6.0.0-linux64/lib-python/2.7/subprocess.py", line 1053, in _execute_child
    raise child_exception
OSError: [Errno 13] Permission denied
Running pypy /lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py -a fastalign -s en -t cleaner-train.trg.bad-pl -ts cleaner-train.trg-src.bad.alignments-pl --giza-keep-output --fastalign-keep-table
/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/linux/fastalign/x64/fast_align -d -o -v -c cleaner-train.trg-src.bad.alignments-pl.fastalign.table -i cleaner-train.trg-src.bad.alignments-pl.combined

Traceback (most recent call last):
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 210, in <module>
    keep_table=args.fastalign_keep_table)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 45, in fastalign
    run(path('fastalign', 'fast_align'), args, fout, log_file, log_file)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/aligner.py", line 10, in run
    proc = subprocess.Popen([cmd] + args, stdout=fout, stderr=ferr)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pypy2-v6.0.0-linux64/lib-python/2.7/subprocess.py", line 405, in __init__
    errread, errwrite)
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pypy2-v6.0.0-linux64/lib-python/2.7/subprocess.py", line 1053, in _execute_child
    raise child_exception
OSError: [Errno 13] Permission denied
Running pypy /lnet/spec/work/people/machacek/neural-interlingua/c-eval/features.py -a fastalign -s en -t pl -st cleaner-train.src-trg.good.alignments-en -ts cleaner-train.trg-src.good.alignments-pl -fas cleaner-train.src-trg.good.alignments-en.fastalign.table -fat cleaner-train.trg-src.good.alignments-pl.fastalign.table -o cleaner-train.features.good.txt -c good -p 8
Done, 0 lines processed
Running pypy /lnet/spec/work/people/machacek/neural-interlingua/c-eval/features.py -a fastalign -s en -t cleaner-train.trg.bad-pl -st cleaner-train.src-trg.bad.alignments-en -fas cleaner-train.src-trg.bad.alignments-en.fastalign.table -fat cleaner-train.trg-src.bad.alignments-pl.fastalign.table -ts cleaner-train.trg-src.bad.alignments-pl -o cleaner-train.features.bad.txt -c bad -p 8
Running target pre-features
Running pypy /lnet/spec/work/people/machacek/neural-interlingua/c-eval/pre-features.py -s cleaner-train.trg.bad-pl -t en -als cleaner-train.trg-src.bad.alignments-pl -fa cleaner-train.trg-src.bad.alignments-pl.fastalign.table -o cleaner-train.trg.bad-pl
Traceback (most recent call last):
  File "/lnet/spec/work/people/machacek/neural-interlingua/c-eval/pre-features.py", line 31, in <module>
    table = open(args.fast_align_table, 'r')
IOError: [Errno 2] No such file or directory: 'cleaner-train.trg-src.bad.alignments-pl.fastalign.table'
Done, 0 lines processed
Running java -jar /lnet/spec/work/people/machacek/neural-interlingua/c-eval/classifier.jar -train -c reptree -m cleaner.model -f cleaner-train.features-en.pl
Exception in thread "main" weka.core.WekaException: weka.classifiers.trees.REPTree: Not enough training instances with class labels (required: 1, provided: 0)!
    at weka.core.Capabilities.test(Capabilities.java:1163)
    at weka.core.Capabilities.test(Capabilities.java:1045)
    at weka.core.Capabilities.testWithFail(Capabilities.java:1356)
    at weka.classifiers.trees.REPTree.buildClassifier(REPTree.java:1869)
    at com.tilde.corpora.cleaner.Main.main(Main.java:50)

pdonald commented 6 years ago

What's the output of this command?

ls -l /lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/linux/fastalign/x64/fast_align

Gldkslfmsd commented 6 years ago

$ ls -l /lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/linux/fastalign/x64/fast_align
-rw-r--r-- 1 machacek ufal_ext 1618816 Nov  1 16:13 /lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/linux/fastalign/x64/fast_align
pdonald commented 6 years ago

I believe you're getting Permission denied because the fast_align executable doesn't have execute permission (the mode is -rw-r--r--, with no x bit set).

Try

chmod +x /lnet/spec/work/people/machacek/neural-interlingua/c-eval/aligner/linux/fastalign/x64/fast_align
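
For anyone who would rather make this check from Python before launching the aligner, here is a minimal sketch. The helper name `ensure_executable` is hypothetical and not part of c-eval. It only approximates `chmod +x`: it sets the execute bit for user, group, and other, and ignores the umask.

```python
import os
import stat


def ensure_executable(path):
    """Set the execute bits on `path` if any of them are missing.

    Returns True if the mode was changed, False if all execute bits
    were already set. Hypothetical helper, not part of c-eval.
    """
    mode = os.stat(path).st_mode
    # Approximates `chmod +x` (ignoring umask): user, group, and other.
    new_mode = mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if new_mode == mode:
        return False
    os.chmod(path, new_mode)
    return True
```

Called on the `aligner/linux/fastalign/x64/fast_align` binary inside the checkout, it should have the same effect as the `chmod +x` above, provided you own the file.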

Gldkslfmsd commented 6 years ago

(Without the ls -l.)

Thanks, it seems to be working now. :)