sanger-pathogens / Roary

Rapid large-scale prokaryote pan genome analysis
http://sanger-pathogens.github.io/Roary

I couldn't find specific predicted coding regions in gene_presence_absence.csv. #574

Open huminfo8 opened 2 years ago

huminfo8 commented 2 years ago

Hello from Japan. I am currently analyzing 38 GFF files from one particular bacterial species. Here is the command I use to run Roary:

${SINGULARITY} exec --cleanenv ${ROARY_SIF} roary -p ${THREADS} -e --mafft -i 95 -f ${out_dir} ${gff_dir}/*.gff

This set of GFF files contains 86330 predicted coding regions in total, but I could find only 84740 of them in the gene_presence_absence.csv file.

I then checked the 1590 missing regions in the ffn files: each one is longer than 120 bp and contains no Ns. What could be the reason these 1590 coding regions are missing?
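
For anyone who wants to reproduce this kind of comparison, here is a rough Python sketch of how the missing IDs could be listed. It is not part of Roary, and it rests on several assumptions that should be checked against your own files: that the GFF files are Prokka-style GFF3 with an `ID=` attribute on each CDS line, that the first 14 columns of gene_presence_absence.csv are summary columns followed by one column per isolate, and that several IDs in one cell are separated by tabs. The function names and the `N_META_COLS` constant are made up for illustration.

```python
import csv

# Assumption: the first 14 columns of gene_presence_absence.csv are summary
# columns (Gene, Annotation, ...); per-isolate columns follow. Adjust if needed.
N_META_COLS = 14


def cds_ids_from_gff(lines):
    """Collect the ID attribute of every CDS feature in GFF3 lines."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # the sequence section follows; no more features
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        for attr in fields[8].split(";"):
            if attr.startswith("ID="):
                ids.add(attr[3:])
                break
    return ids


def ids_from_presence_absence(handle, n_meta_cols=N_META_COLS):
    """Collect every ID listed in the per-isolate columns of the CSV."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    ids = set()
    for row in reader:
        for cell in row[n_meta_cols:]:
            # Assumption: multiple IDs in one cell are tab-separated.
            ids.update(tok for tok in cell.split("\t") if tok)
    return ids


def missing_cds(gff_paths, csv_path):
    """Return CDS IDs present in the GFF files but absent from the CSV."""
    gff_ids = set()
    for path in gff_paths:
        with open(path) as fh:
            gff_ids |= cds_ids_from_gff(fh)
    with open(csv_path, newline="") as fh:
        csv_ids = ids_from_presence_absence(fh)
    return gff_ids - csv_ids
```

Calling `missing_cds` with the list of GFF paths and the path to gene_presence_absence.csv should give the set of IDs to look up in the ffn files. If the IDs in the CSV do not match the `ID=` attributes in the GFF files, the result will be misleading, so it is worth spot-checking a few IDs by hand first.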