Hey @garrison-chen, sorry for the late reply! It's hard for me to find time to look into this POCP pipeline, but let me see what I can do regarding a 1-vs-all comparison instead of all-vs-all.
Hey,
I added the one-vs-all mode now. It is automatically activated when you specify a specific genome (--genome) or protein FASTA (--protein) in addition to the default input --genomes or --proteins. If you do that, the comparisons will only be made between the additional genome/protein FASTA vs all others.
For example:
nextflow pull hoelzer/pocp
# Currently the changes are in the branch "one-vs-all" for testing
nextflow run hoelzer/pocp -r one-vs-all --genomes 'example/*.fasta' --genome example/Cav_10DC88.fasta -profile local,docker
will give you
❯ cat results/pocp-matrix.tsv
ID Cav_10DC88 Cav_11DC096 Cga_08-1274-3 Cga_12-4358 Ctr_A-HAR-13
Cav_10DC88 100.0 98.9172 96.5928 96.4865 83.171
Cav_11DC096 98.9172 100.0 0.0 0.0 0.0
Cga_08-1274-3 96.5928 0.0 100.0 0.0 0.0
Cga_12-4358 96.4865 0.0 0.0 100.0 0.0
Ctr_A-HAR-13 83.171 0.0 0.0 0.0 100.0
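To pull just the one-vs-all values out of a matrix like the one above, a small script can read the target's row and drop the self-comparison. This is a minimal sketch, not part of the pipeline: the helper names `read_matrix` and `target_row` are hypothetical, and it assumes that the 0.0 entries outside the target's row and column are placeholders for pairs that were not compared. It splits on any whitespace, so it should handle tab-separated files as well as the space-separated listing shown here.

```python
import io


def read_matrix(handle):
    """Parse a square matrix whose header row starts with an 'ID' cell."""
    rows = [line.split() for line in handle if line.strip()]
    columns = rows[0][1:]  # drop the leading "ID" cell
    return {row[0]: dict(zip(columns, map(float, row[1:]))) for row in rows[1:]}


def target_row(matrix, target):
    """Return the target's values against every other entry (self excluded)."""
    return {other: value for other, value in matrix[target].items() if other != target}


# The matrix from the first example, embedded so the sketch is self-contained.
example = """\
ID Cav_10DC88 Cav_11DC096 Cga_08-1274-3 Cga_12-4358 Ctr_A-HAR-13
Cav_10DC88 100.0 98.9172 96.5928 96.4865 83.171
Cav_11DC096 98.9172 100.0 0.0 0.0 0.0
Cga_08-1274-3 96.5928 0.0 100.0 0.0 0.0
Cga_12-4358 96.4865 0.0 0.0 100.0 0.0
Ctr_A-HAR-13 83.171 0.0 0.0 0.0 100.0
"""

matrix = read_matrix(io.StringIO(example))
hits = target_row(matrix, "Cav_10DC88")

# Print the comparisons sorted from highest to lowest value.
for name, pocp in sorted(hits.items(), key=lambda item: -item[1]):
    print(f"{name}\t{pocp}")
```

For a real run, the `io.StringIO(example)` argument could be swapped for an open file handle on the matrix file in the results directory.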
Or if I switch the "target genome"
nextflow run hoelzer/pocp -r one-vs-all --genomes 'example/*.fasta' --genome example/Cga_08-1274-3.fasta -profile local,docker -resume
I will get:
❯ cat results/pocp-matrix.tsv
ID Cav_10DC88 Cga_08-1274-3 Cav_11DC096 Cga_12-4358 Ctr_A-HAR-13
Cav_10DC88 100.0 96.5928 0.0 0.0 0.0
Cga_08-1274-3 96.5928 100.0 97.1207 99.8894 83.9513
Cav_11DC096 0.0 97.1207 100.0 0.0 0.0
Cga_12-4358 0.0 99.8894 0.0 100.0 0.0
Ctr_A-HAR-13 0.0 83.9513 0.0 0.0 100.0
It also works if you directly give protein FASTAs as input, skipping the annotation step:
nextflow run hoelzer/pocp -r one-vs-all --proteins 'example/*.faa' --protein example/Cga_08-1274-3.faa -profile local,docker -resume
❯ cat results/pocp-matrix.tsv
ID Cav_10DC88 Cga_08-1274-3 Cmu_Nigg Cps_6BC Ctr_D-UW-3-CX
Cav_10DC88 100.0 96.7532 0.0 0.0 0.0
Cga_08-1274-3 96.7532 100.0 83.5196 90.0476 84.1402
Cmu_Nigg 0.0 83.5196 100.0 0.0 0.0
Cps_6BC 0.0 90.0476 0.0 100.0 0.0
Ctr_D-UW-3-CX 0.0 84.1402 0.0 0.0 100.0
Finally, here is a mixed command using a set of genomes as input and comparing them one-vs-all against a given protein multi-FASTA:
nextflow run hoelzer/pocp -r one-vs-all --genomes 'example/*.fasta' --protein example/Cga_08-1274-3.faa -profile local,docker -resume
❯ cat results/pocp-matrix.tsv
ID Cav_10DC88 Cga_08-1274-3 Cav_11DC096 Cga_12-4358 Ctr_A-HAR-13
Cav_10DC88 100.0 96.5405 0.0 0.0 0.0
Cga_08-1274-3 96.5405 100.0 97.067 99.8895 83.9049
Cav_11DC096 0.0 97.067 100.0 0.0 0.0
Cga_12-4358 0.0 99.8895 0.0 100.0 0.0
Ctr_A-HAR-13 0.0 83.9049 0.0 0.0 100.0
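In all of the matrices above, only the row and column of the target are filled in. The difference in the number of comparisons can be illustrated with a short sketch. It only enumerates which pairs would be compared in each mode and is not how the pipeline is implemented; the function names `all_vs_all` and `one_vs_all` are hypothetical.

```python
from itertools import combinations


def all_vs_all(ids):
    """Every unordered pair of inputs: n * (n - 1) / 2 comparisons."""
    return list(combinations(ids, 2))


def one_vs_all(ids, target):
    """Only pairs that involve the target: n - 1 comparisons."""
    return [(target, other) for other in ids if other != target]


# IDs taken from the example matrix above.
ids = ["Cav_10DC88", "Cga_08-1274-3", "Cav_11DC096", "Cga_12-4358", "Ctr_A-HAR-13"]

full = all_vs_all(ids)
reduced = one_vs_all(ids, "Cga_08-1274-3")

print(len(full), "pairs in all-vs-all mode")
print(len(reduced), "pairs in one-vs-all mode")
```

With five inputs the saving is small (10 versus 4 pairs), but the all-vs-all count grows quadratically with the number of genomes while the one-vs-all count grows linearly.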
Can you test it please, @garrison-chen ?
If everything works, I will merge that into the main branch and do another release.
Cheers, Martin
Hi Martin,
Thanks a lot for the follow-up! I have tested it and so far everything works from my side. It's an amazing tool!
Best, Chen
Great, happy to hear that! Thanks!
Thanks for the great tool! I've been using it for a while. I want to ask if the Nextflow implementation also supports the 1-vs-all calculation, as in release 1.1.1 (the previous Ruby implementation, before Nextflow)? If yes, how should we call the program? Many thanks!
Best, Chen