Closed ianmilligan1 closed 8 years ago
I will do the first batch, save the Labour collection that is currently ingesting (will do so when it's settled).
ALBERTA_coll = University of Alberta
TORONTO_coll = University of Toronto
WAHR_coll = Our research group, internally generated data
e.g. TORONTO_Canadian_Labour Unions
OK! That was some fun times. I also cleaned up all the derivative folders (e.g. /data/derivatives/links/) so that all outputs follow this naming convention too. All PART files were discarded; only the combined data files were kept.
A random piece of code that I've used, just so I don't forget it elsewhere:
for f in *; do mv "$f" "ALBERTA_$f"; done
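A slightly more careful sketch of the same idea, in case the one-liner gets re-run in a folder that has already been renamed. The function name add_prefix and its two arguments are made up for illustration; nothing here is what was actually run on the server.

```shell
# Hypothetical helper: add an institution prefix to every entry in a directory,
# skipping anything that already carries it so a second run changes nothing.
add_prefix() {
  prefix="$1"   # e.g. ALBERTA
  dir="$2"      # directory whose entries get renamed
  for path in "$dir"/*; do
    [ -e "$path" ] || continue          # empty directory: the glob did not match
    name=${path##*/}                    # strip the leading directory part
    case "$name" in
      "${prefix}_"*) continue ;;        # already prefixed, leave it alone
    esac
    mv -- "$path" "$dir/${prefix}_$name"
  done
}
```

Calling add_prefix ALBERTA . from inside a collection folder should behave like the one-liner above, except that entries already starting with ALBERTA_ are left untouched.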
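Another script for later purposes. It removes [text] from the names of the files in the folder (similar prefixes cause problems when I truncate the filenames).
extra='[text]'  # placeholder: the literal text to strip from each name
for filename in *.fasta; do
  [ -f "$filename" ] || continue  # skip if the glob matched nothing
  mv -- "$filename" "${filename//"$extra"/}"  # quoted so it is matched literally
done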
Now that we've got multiple institutions that we're ingesting into WALK, we should rename the directories in data to reflect the original institution.