beckyjackson / synbio-alignments

make prepare isn't rsync-ing ANI data #3

Open turbomam opened 3 years ago

turbomam commented 3 years ago

Via make prepare:

python3 src/pick_genomes.py data/ANI build/reference_genomes.csv

Traceback (most recent call last):
  File "/Users/MAM/Documents/gitrepos/synbio-alignments/src/pick_genomes.py", line 40, in <module>
    main()
  File "/Users/MAM/Documents/gitrepos/synbio-alignments/src/pick_genomes.py", line 14, in main
    for f in os.listdir(args.ani_directory):
FileNotFoundError: [Errno 2] No such file or directory: 'data/ANI'
make: *** [build/reference_genomes.csv] Error 1
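The traceback shows os.listdir being called on a directory that does not exist yet. A minimal sketch of a guard that would fail with a more actionable message is below. list_ani_files is a hypothetical helper and the wording of the hint is an assumption; neither is taken from the actual src/pick_genomes.py.

```python
import os
import sys


def list_ani_files(ani_directory):
    """Return the sorted entries of ani_directory, or exit with a hint if it is missing."""
    if not os.path.isdir(ani_directory):
        # Hypothetical message; the real script raises FileNotFoundError here.
        sys.exit(
            f"ANI directory '{ani_directory}' not found; "
            "run the mirror_ANIs step first"
        )
    return sorted(os.listdir(ani_directory))
```

This only changes how the failure is reported. The directory still has to be populated by the mirror step before pick_genomes.py can do anything useful.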

turbomam commented 3 years ago

There is at least one tabular ANI output file in the right location:

mam@merlot:~$ head /data/iarpa/qc/pyani/452/pyani_out/ANIm_percentage_identity.tab

unicycler_assembly wtdbg2_assembly AE004091.2 CP003734 GCA_001618925.1_ASM161892v1 GCA_001729505.1_ASM172950v1 GCF_000006765.1_ASM676v1 GCF_000611975.2_PA103_1.0_genomic
unicycler_assembly 1.0 0.9948701750309294 0.9895377849814238 0.8399877701318301 0.9891940735562609 0.989523642203766 0.9895377849814238 0.9999180343382472
wtdbg2_assembly 0.9948701750309294 1.0 0.9843084716543441 0.839901254364958 0.9840791743968509 0.9842860766117674 0.9843084716543441 0.9950365363773142
AE004091.2 0.9895377849814238 0.9843084716543441 1.0 0.8396774810242793 0.9942519176515338 0.999967122285087 1.0 0.9895551308882619
CP003734 0.8399877701318301 0.839901254364958 0.8396774810242793 1.0 0.8396518746987358 0.8396125126050243 0.8396055398448263 0.8397367694356909
GCA_001618925.1_ASM161892v1 0.9891940735562609 0.9840791743968509 0.9942519176515338 0.8396518746987358 1.0 0.9942379818010888 0.9942510243765236 0.9891738382016315
GCA_001729505.1_ASM172950v1 0.989523642203766 0.9842860766117674 0.999967122285087 0.8396125126050243 0.9942379818010888 1.0 0.9999653261063861 0.989546784845441
GCF_000006765.1_ASM676v1 0.9895377849814238 0.9843084716543441 1.0 0.8396055398448263 0.9942510243765236 0.9999653261063861 1.0 0.9895551308882619
GCF_000611975.2_PA103_1.0_genomic 0.9999180343382472 0.9950365363773142 0.9895551308882619 0.8397367694356909 0.9891738382016315 0.989546784845441 0.9895551308882619 1.0
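For anyone wanting to sanity-check one of these files by hand, here is a rough sketch that loads the matrix and reports the column most similar to a given row. read_ani_matrix and closest_reference are hypothetical names. The sketch assumes the file is tab-separated and that the header row may start with an empty cell above the row labels. Whether pick_genomes.py selects references this way is not something this thread confirms.

```python
import csv


def read_ani_matrix(path):
    """Parse a tab-separated square identity matrix into {row: {column: value}}."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    header = rows[0]
    # Assumption: a leading empty cell sits above the row labels; drop it if present.
    columns = header[1:] if header and header[0] == "" else header
    return {
        row[0]: {col: float(val) for col, val in zip(columns, row[1:])}
        for row in rows[1:]
        if row  # skip blank lines
    }


def closest_reference(matrix, query, exclude=()):
    """Return (name, identity) of the column most similar to the query row."""
    candidates = {
        name: identity
        for name, identity in matrix[query].items()
        if name != query and name not in exclude
    }
    best = max(candidates, key=candidates.get)
    return best, candidates[best]
```

With the values shown above, the highest off-diagonal entry in the unicycler_assembly row is GCF_000611975.2_PA103_1.0_genomic at 0.9999180343382472.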

turbomam commented 3 years ago

Tab completion on make doesn't offer mirror_ANIs:

(_venv) MAM@MAM-M74 synbio-alignments % make mirror_

mirror_genomes mirror_seqs mirror_unicycler

turbomam commented 3 years ago

make prepare doesn't seem to be invoking its mirror_ANIs dependency

(_venv) MAM@MAM-M74 synbio-alignments % make prepare

sshpass -p rsync -avRL mam@merlot.lbl.gov:/data/iarpa/TE/*/assembly/unicycler_assembly.fasta .
receiving file list ... done

sent 16 bytes received 14519 bytes 5814.00 bytes/sec
total size is 1372296086 speedup is 94413.22
sshpass -p rsync -avR mam@merlot.lbl.gov:/data/iarpa/TE//target_sequence/IF.fasta .
receiving file list ... done

sent 16 bytes received 19060 bytes 5450.29 bytes/sec
total size is 303289 speedup is 15.90
python3 src/pick_genomes.py data/ANI build/reference_genomes.csv
Traceback (most recent call last):
  File "/Users/MAM/Documents/gitrepos/synbio-alignments/src/pick_genomes.py", line 40, in <module>
    main()
  File "/Users/MAM/Documents/gitrepos/synbio-alignments/src/pick_genomes.py", line 14, in main
    for f in os.listdir(args.ani_directory):
FileNotFoundError: [Errno 2] No such file or directory: 'data/ANI'
make: *** [build/reference_genomes.csv] Error 1
(_venv) MAM@MAM-M74 synbio-alignments %

turbomam commented 3 years ago

The mirror_ANIs step in the Makefile is missing its target line, so the rsync recipe isn't attached to mirror_ANIs. It should be changed from

# 1. Download ANIm percentage identity from LBL (/data/iarpa/qc/pyani/[SAMPLE]/pyani_out/ANIm_percentage_identity.tab)
.PHONY: mirror_ANIs
    sshpass -p ${LBL_PASSWORD} rsync -avR ${LBL_USER}@merlot.lbl.gov:/data/iarpa/qc/pyani/*/pyani_out/ANIm_percentage_identity.tab .

to

# 1. Download ANIm percentage identity from LBL (/data/iarpa/qc/pyani/[SAMPLE]/pyani_out/ANIm_percentage_identity.tab)
.PHONY: mirror_ANIs
mirror_ANIs:
    sshpass -p ${LBL_PASSWORD} rsync -avR ${LBL_USER}@merlot.lbl.gov:/data/iarpa/qc/pyani/*/pyani_out/ANIm_percentage_identity.tab .

I made the change locally and will eventually make a fork and PR it.
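To catch this kind of slip elsewhere in the Makefile, a small check can list names declared in .PHONY that never appear as the target of a rule. This is only a sketch: phony_without_rule is a hypothetical helper, and it uses a deliberately simplified reading of Makefile syntax (no includes, pattern rules, or line continuations).

```python
import re


def phony_without_rule(makefile_text):
    """Return .PHONY names that never appear as the target of a rule."""
    phony, targets = set(), set()
    for line in makefile_text.splitlines():
        # Recipe lines (tab-indented) and comments define no targets.
        if line.startswith("\t") or line.lstrip().startswith("#"):
            continue
        # "name(s):" at the start of a line, but not a ":=" assignment.
        match = re.match(r"^([^:=\s][^:=]*):(?!=)", line)
        if not match:
            continue
        names = match.group(1).split()
        if names == [".PHONY"]:
            # Everything after the colon is a declared phony name.
            phony.update(line.split(":", 1)[1].split())
        else:
            targets.update(names)
    return sorted(phony - targets)
```

Run against the original snippet above, this would report mirror_ANIs; with the mirror_ANIs: line added, it would report nothing.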

turbomam commented 3 years ago

Also, should the prepare rule create data/ANI?

prepare: data/ANI mirror_ANIs mirror_unicycler mirror_seqs mirror_genomes