Open apetkau opened 1 year ago
Example implementation is at https://github.com/apetkau/nf-core-queryprofiles
If you have Nextflow and Docker installed, this can be run directly from GitHub with:
nextflow run https://github.com/apetkau/nf-core-queryprofiles -profile docker,test -r dev --outdir results
1. Purpose
The purpose of this pipeline is to query a collection of genomes for those within a certain distance threshold of one or more query genomes.
2. Input
2.1. Query profiles
The input will consist of cg/wgMLST profiles for queries and a reference selection/scope of this query. This will be passed via the --input parameter and will look like the following:
querysheet.csv:
2.1.1. Allele profiles (CSV)
Allele profiles in CSV format will follow the example format below (both uncompressed and gzipped files will be supported).
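Since both uncompressed and gzipped CSV files are to be supported, reading either kind could be sketched roughly as follows. This is only an illustration, not the pipeline's code: the function name read_profiles is hypothetical, and the sketch assumes nothing about the columns beyond a header row.

```python
import csv
import gzip
from pathlib import Path


def read_profiles(path):
    """Read a CSV profile file, gzipped or not, into a list of dict rows.

    Hypothetical helper for illustration; column names come from the header row.
    """
    path = Path(path)
    # Detect gzip by its two magic bytes rather than trusting the file extension.
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    # Text mode with newline="" is what the csv module expects.
    with opener(path, "rt", newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))
```

Checking the magic bytes means a gzipped file is still read correctly when it lacks a .gz suffix.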
3. Steps
3.1. Perform query
For each listed query, this will search for genomes within a particular threshold. This will use https://github.com/phac-nml/profile_dists.
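To make the query step concrete, here is a minimal sketch of a threshold search over allele profiles. It is not the profile_dists implementation or its API. It assumes a Hamming distance (the count of differing loci), assumes a missing allele is coded as "0" and is skipped, and uses hypothetical function names and sample data.

```python
def hamming_distance(a, b, missing="0"):
    """Count shared loci where two profiles differ, skipping loci missing in either."""
    return sum(
        1
        for locus in a.keys() & b.keys()
        if a[locus] != missing and b[locus] != missing and a[locus] != b[locus]
    )


def query_within_threshold(queries, references, threshold):
    """Map each query ID to (reference ID, distance) pairs with distance <= threshold."""
    hits = {}
    for qid, qprofile in queries.items():
        distances = (
            (rid, hamming_distance(qprofile, rprofile))
            for rid, rprofile in references.items()
        )
        # Keep only hits inside the threshold, closest first, ties broken by ID.
        hits[qid] = sorted(
            (hit for hit in distances if hit[1] <= threshold),
            key=lambda hit: (hit[1], hit[0]),
        )
    return hits


# Hypothetical profiles: locus name -> allele identifier.
references = {
    "A": {"l1": "1", "l2": "2", "l3": "3"},
    "B": {"l1": "1", "l2": "5", "l3": "6"},
    "C": {"l1": "1", "l2": "2", "l3": "0"},
}
queries = {"Q": {"l1": "1", "l2": "2", "l3": "4"}}

print(query_within_threshold(queries, references, threshold=1))
# {'Q': [('C', 0), ('A', 1)]}
```

With a threshold of 1, genome B (distance 2) is excluded, while C matches at distance 0 because its missing locus is ignored under the assumption above.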
4. Output
The following output will be provided. This will be communicated with an output.json file with the following larger structure: