shuoyangd / tape4nmt

a ducttape workflow for neural machine translation

removed merging, dummy steps in favor of file prefixes #10

Open mjpost opened 5 years ago

mjpost commented 5 years ago

This simplifies data processing (#9) by removing dummy steps and moving file merging to the download_or_link stage. You now specify file sources as prefixes; when downloading or linking, the language extensions are appended. As a result, XML is no longer supported, but I think that's a simple thing to handle outside the script. We could also extend download_or_link to support XML.
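To illustrate the prefix convention described above, here is a minimal Python sketch. It is not the code from this PR: the function names `expand_prefix` and `download_or_link`, the `.` separator between prefix and language code, and the URL-versus-local-path check are all assumptions made for the example.

```python
import os
import urllib.request


def expand_prefix(prefix, langs):
    """Map each language code to '<prefix>.<lang>' (separator is an assumption)."""
    return {lang: f"{prefix}.{lang}" for lang in langs}


def download_or_link(prefix, langs, out_prefix):
    """Hypothetical helper: fetch (URL) or symlink (local path) one file per language."""
    outputs = {}
    for lang, source in expand_prefix(prefix, langs).items():
        target = f"{out_prefix}.{lang}"
        if source.startswith(("http://", "https://")):
            # Remote source: download to the target path.
            urllib.request.urlretrieve(source, target)
        else:
            # Local source: link instead of copying.
            os.symlink(os.path.abspath(source), target)
        outputs[lang] = target
    return outputs
```

With a source prefix of `data/train` and the languages `de` and `en`, this sketch would look for `data/train.de` and `data/train.en`, so a single configuration value covers both sides of the parallel corpus.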