I am sorry to ask this, because I know somebody raised this issue before, and the answer was to check the directory.
However, I cannot run "query_pan_genome" properly.
I checked that all the .gff files are in the same directory, and I used this command: "query_pan_genome -a difference --input_set_one 1.gff,2.gff --input_set_two 3.gff,4.gff".
But I keep getting this error message.
Error: Cant access the groups file: clustered_proteins
Usage: query_pan_genome [options] *.gff
Perform set operations on the pan genome to see the gene differences between groups of isolates.
Options:
-g STR groups filename [clustered_proteins]
-a STR action (union/intersection/complement/gene_multifasta/difference) [union]
-c FLOAT percentage of isolates a gene must be in to be core [99]
-o STR output filename [pan_genome_results]
-n STR comma separated list of gene names for use with gene_multifasta action
-i STR comma separated list of filenames, comparison set one
-t STR comma separated list of filenames, comparison set two
-v verbose output to STDOUT
-h this help message
Examples:
Union of genes found in isolates
query_pan_genome -a union *.gff
Intersection of genes found in isolates (core genes)
query_pan_genome -a intersection *.gff
Complement of genes found in isolates (accessory genes)
query_pan_genome -a complement *.gff
Extract the sequence of each gene listed and create multi-FASTA files
query_pan_genome -a gene_multifasta -n gryA,mecA,abc *.gff
Gene differences between sets of isolates
query_pan_genome -a difference --input_set_one 1.gff,2.gff --input_set_two 3.gff,4.gff,5.gff
I had to use the -g option to designate the current directory. I used this command: query_pan_genome -g ./ -a difference --input_set_one 1.gff,2.gff --input_set_two 3.gff,4.gff
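Since the error says the tool cannot access the groups file, and the help text describes -g as "groups filename [clustered_proteins]", it may help to confirm that the file exists before running the query. Here is a minimal Python sketch of that check. It assumes the clustered_proteins file produced by roary is in the directory you run from (or that you pass its path). The helper name build_difference_command is hypothetical and not part of Roary; the flags are only the ones shown in the help output and examples above.

```python
import os


def build_difference_command(set_one, set_two, groups_file="clustered_proteins"):
    """Build the argument list for a 'difference' query.

    Hypothetical helper (not part of Roary). It fails early with a clear
    message if the groups file is missing, instead of letting
    query_pan_genome print its usage text.
    """
    if not os.path.isfile(groups_file):
        raise FileNotFoundError(f"groups file not found: {groups_file}")
    return [
        "query_pan_genome",
        "-g", groups_file,                       # groups filename, as listed in the help text
        "-a", "difference",
        "--input_set_one", ",".join(set_one),    # comma separated list, set one
        "--input_set_two", ",".join(set_two),    # comma separated list, set two
    ]


# Example (not executed here):
#   import subprocess
#   cmd = build_difference_command(["1.gff", "2.gff"], ["3.gff", "4.gff"])
#   subprocess.run(cmd, check=True)
```

If the check raises, the likely cause is that the command is being run somewhere other than where roary wrote its output, which would match the "Cant access the groups file" message.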
For further info see: http://sanger-pathogens.github.io/Roary/
One of my questions is that I cannot see "--input_set_one" and "--input_set_two" among the Options in the error message.
Is it still OK to use the "--input_set_one" option?
Also, does anyone know what the problem is?
My .gff files are OK, I guess, because I can run roary with them.