Permanent failure with VU recipe

hadiasheri commented 6 years ago

Hi, I'm trying to run the code with the VU DNC dataset. The link you provided didn't work, so I downloaded it from here. Now I run vudnc-preprocess.cwl as follows:

in_dir="/home/dataset/VU/FoLiACMDI"
ocr_dir_name="/home/dataset/VU/Preprocess/ocr"
gs_dir_name="/home/dataset/VU/Preprocess/gs"
aligned_dir_name="/home/dataset/VU/Preprocess/aligned"
tmp_dir="/home/ochre/vu-tmp/"
tmp_dir_out="/home/ochre/vu-tmp-out/"
cachedir="/home/ochre/cachedir/"
align_m="align_m.csv"
align_c="align_c.csv"
ocr_n="ocr_n.csv"
gs_n="gs_n.csv"

cwltool ochre/cwl/vudnc-preprocess.cwl --in_dir $in_dir --ocr_dir_name $ocr_dir_name --gs_dir_name $gs_dir_name --aligned_dir_name $aligned_dir_name --ocr_n $ocr_n --gs_n $gs_n --align_m $align_m --align_c $align_c

However, it fails permanently with the following message:

[step merge-json] Cannot make job: Value for file:///home/ochre/ochre/cwl/align-texts-wf.cwl#merge-json/in_files not specified

[workflow align-texts-wf] completed permanentFail

I'd be grateful if you could help me figure out the problem. Thanks, H

jvdzwaan commented 6 years ago

I think the workflow fails because of changes to nlppln. I'll try to see if I can fix that later.

Also, I really recommend using a different dataset than the vudnc corpus. It is just too noisy. Here is a poster that shows the most common error is a hyphenation error ('- ' that should be replaced with ''; that is just too easy): https://doi.org/10.5281/zenodo.1189245
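
To illustrate how little it takes to correct that dominant error, here is a minimal Python sketch. The name remove_hyphenation and the sample string are made up for illustration and are not part of ochre. The sketch assumes that every '- ' sitting between two word characters is a line-break artifact, so it would also wrongly join legitimate cases such as "Noord- en Zuid-Holland".

```python
import re

# Hypothetical helper, not part of ochre.
# Matches "- " only when it sits directly between two word characters,
# e.g. the break in "regee- ring".
HYPHEN_BREAK = re.compile(r"(?<=\w)- (?=\w)")


def remove_hyphenation(text: str) -> str:
    """Replace '- ' between two word characters with the empty string."""
    return HYPHEN_BREAK.sub("", text)


if __name__ == "__main__":
    # Made-up sample line with two hyphenation errors.
    print(remove_hyphenation("de regee- ring heeft beslo- ten"))
```

A free-standing dash with spaces on both sides (as in "a - b") is left untouched, because the pattern requires a word character immediately before the hyphen.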

jvdzwaan commented 6 years ago

Okay, it should work again. Be careful to read the updated documentation in the README. Also, don't forget to update nlppln.

For future reference, this is the relevant commit: 9ee6d7cca72bb9bcd074e1843b12ceea122662ce

KBNLresearch / ochre

Permanent failure with VU recipe #12