SmartDataInnovationLab / git_batch

create a git-remote for running batch-jobs

archive_dir.sh #12

Open bjuergens opened 6 years ago

bjuergens commented 6 years ago
archive_dir.sh <data-dir> <archive-dir> [-t template]

Moves <data-dir> to <archive-dir>/<hash> and outputs the resulting path.

example:

archive_dir.sh data /smartdata/my_project/archive | xargs -I{} ln -s {} data
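
A minimal sketch of what the core of archive_dir.sh might look like, to make the proposed behaviour concrete. It is not the project's implementation. HASH_CMD is a hypothetical stand-in for the dirhash tool and is assumed to take a directory and print only the hash. The -t template option is left out because its semantics are not specified here.

```shell
# Sketch only. HASH_CMD is a hypothetical placeholder for the project's
# dirhash tool; it is assumed to print exactly one line: the hash.
archive_dir() {
    data_dir=$1
    archive_root=$2

    # Compute the directory hash; abort if the command fails or prints nothing.
    dir_hash=$("$HASH_CMD" "$data_dir") || return 1
    [ -n "$dir_hash" ] || return 1

    target="$archive_root/$dir_hash"

    # Refuse to overwrite or nest into an existing archive entry.
    if [ -e "$target" ]; then
        echo "archive_dir: $target already exists" >&2
        return 1
    fi

    mkdir -p "$archive_root" || return 1
    mv "$data_dir" "$target" || return 1

    # Only the final path goes to stdout, so the result can be piped.
    printf '%s\n' "$target"
}
```

Because diagnostics go to stderr and only the path goes to stdout, the output stays safe to pipe into another command, as in the example above.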

Problems:

pyspark emits output other than just the hash. Either clean up dirhash so that its output is always clean (it is not known whether the extra output comes from pyspark or from somewhere in dirhash), or parse the output in archive_dir.sh. The former is strongly preferred.
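
For reference, the less preferred fallback (parsing in archive_dir.sh) could be sketched roughly as below. The helper name extract_hash is hypothetical, and the filter rests on an assumption that is not confirmed here: that the hash appears on a line of its own as a lowercase hex string of at least 32 characters.

```shell
# Sketch only. Assumes the hash is a lowercase hex string (>= 32 chars)
# on its own line; every other line is treated as noise and dropped.
extract_hash() {
    # Reads noisy output on stdin and prints the last hash-like line, if any.
    grep -E '^[0-9a-f]{32,}$' | tail -n 1
}
```

A filter like this is fragile, since any log line that happens to match the pattern would be mistaken for the hash, which is one reason fixing the output at the source is the better option.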