Open ctb opened 1 month ago
on farm at /home/ctbrown/scratch/2022-branchwater-benchmarking/wort-list-d.d/Snakefile:
    # convert a bunch of .sig files into .sig.zip files and also produce .mf.csv files.

    FILELIST='../data/wort-list-d.txt'

    siglist = [ x.strip() for x in open(FILELIST) ]
    print(f"loaded '{len(siglist)}' files")

    #print('selecting 10...')
    #siglist = siglist[:10]

    ACCS = [ os.path.basename(x).split('.')[0] for x in siglist ]

    rule all:
        input:
            expand('{acc}.sig.zip', acc=ACCS),
            expand('{acc}.mf.csv', acc=ACCS)

    rule make_sig_zip:
        output:
            "{acc}.sig.zip"
        shell: """
            sourmash sig cat /group/ctbrowngrp/irber/data/wort-data/wort-sra/sigs/{wildcards.acc}.sig -o {output}
        """

    rule make_mf_csv:
        input:
            "{acc}.sig.zip",
        output:
            "{acc}.mf.csv",
        shell: """
            sourmash sig collect {input} -o {output} -F csv --abspath
        """
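For reference, here is a minimal sketch of how the Snakefile turns the file list into the `{acc}` wildcard values. The helper names `accession_from_path` and `load_accessions` and the example paths are hypothetical, made up for illustration. Skipping blank lines is an addition that the Snakefile itself does not do; without it, an empty line in the list would yield an empty accession.

```python
import os


def accession_from_path(path):
    """Return the file name up to its first '.', as the ACCS line does."""
    return os.path.basename(path).split('.')[0]


def load_accessions(lines):
    """Strip each line, skip blank ones (an assumption, not in the Snakefile),
    and derive one accession per remaining path."""
    paths = [line.strip() for line in lines]
    return [accession_from_path(p) for p in paths if p]


# Hypothetical file-list contents, for illustration only.
example_lines = [
    "/some/dir/SRR0000001.sig\n",
    "\n",
    "/some/dir/ERR0000002.sig\n",
]
print(load_accessions(example_lines))
```

Because the split happens at the first dot, a file name with extra dots (say `SRR0000001.v2.sig`) would be reduced to `SRR0000001`, which would no longer match the `{wildcards.acc}.sig` path used in `make_sig_zip`.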
to produce a standalone manifest from the mf.csv files (and probably from the zip files too), run:
sourmash sig collect -F csv *.mf.csv -o combined.mf.csv --abspath
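One thing to watch with the shell glob: once `combined.mf.csv` exists in the same directory, `*.mf.csv` matches it as well, so a second run would pass the combined manifest back in as an input. A rough sketch of one way around that is below. It builds the same command with the same flags, but leaves the output file out of the inputs. The function names `build_collect_command` and `combine_manifests` are hypothetical, and this has not been run on farm.

```python
import glob
import subprocess


def build_collect_command(manifests, output):
    """Build the `sourmash sig collect` argument list, excluding the output
    file from the inputs so a re-run does not feed it back in."""
    inputs = sorted(m for m in manifests if m != output)
    return ["sourmash", "sig", "collect", "-F", "csv", *inputs,
            "-o", output, "--abspath"]


def combine_manifests(pattern="*.mf.csv", output="combined.mf.csv"):
    """Glob the per-accession manifests and run the collect command.
    Assumes sourmash is on PATH; raises CalledProcessError on failure."""
    cmd = build_collect_command(glob.glob(pattern), output)
    subprocess.run(cmd, check=True)
    return cmd
```

Calling `combine_manifests()` from the `wort-list-d.d` directory should be equivalent to the command above, minus the self-inclusion on re-runs.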