Hello from Japan. I am analyzing 38 GFF files from a single bacterial species. Here is the command I use to run Roary:
${SINGULARITY} exec --cleanenv ${ROARY_SIF} roary -p ${THREADS} -e --mafft -i 95 -f ${out_dir} ${gff_dir}/*.gff
Together, these GFF files contain 86330 predicted coding regions, but I could find only 84740 of them in the gene_presence_absence.csv file.
I then checked the ffn sequences of the 1590 missing regions: each was longer than 120 bp and contained no Ns. What could explain why these 1590 coding regions are missing?
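
To make the comparison reproducible, the missing IDs can be listed by diffing the CDS IDs in the GFF files against the locus tags in gene_presence_absence.csv. The following is only a minimal sketch under stated assumptions: the GFF files are Prokka-style (CDS features carry an `ID=` attribute and the annotation ends at `##FASTA`), the CSV has 14 metadata columns before the per-isolate columns, and paralogs within a cell are tab-separated. The function names and the `n_meta` parameter are hypothetical helpers, not part of Roary.

```python
import csv
import io


def cds_ids_from_gff(handle):
    """Collect the ID attribute of every CDS feature in one GFF3 file."""
    ids = set()
    for line in handle:
        if line.startswith("##FASTA"):
            break  # assumption: sequence section follows, no more features
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "CDS":
            continue
        for field in cols[8].split(";"):
            if field.startswith("ID="):
                ids.add(field[3:])
                break
    return ids


def ids_from_presence_absence(handle, n_meta=14):
    """Collect every locus tag listed in the per-isolate columns.

    n_meta=14 is an assumption about the number of leading metadata
    columns; check it against the header of your own file.
    """
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    ids = set()
    for row in reader:
        for cell in row[n_meta:]:
            # assumption: several genes in one cell are tab-separated
            ids.update(tag for tag in cell.split("\t") if tag)
    return ids


# Tiny in-memory demonstration with made-up data.
demo_gff = io.StringIO(
    "##gff-version 3\n"
    "c1\tProdigal\tCDS\t1\t300\t.\t+\t0\tID=A_1;product=x\n"
    "c1\tProdigal\tCDS\t400\t900\t.\t-\t0\tID=A_2;product=y\n"
    "c1\tAragorn\ttRNA\t950\t1020\t.\t+\t.\tID=A_3\n"
    "##FASTA\n>c1\nACGT\n"
)
header = ["Gene"] + ["meta"] * 13 + ["isolateA"]
row = ["group_1"] + [""] * 13 + ["A_1"]
demo_csv = io.StringIO(",".join(header) + "\n" + ",".join(row) + "\n")

missing = cds_ids_from_gff(demo_gff) - ids_from_presence_absence(demo_csv)
print(sorted(missing))
```

For real data, the same functions would be applied to each file opened with `open(path, newline="")`, taking the union of the GFF ID sets before subtracting. The resulting list of missing IDs can then be inspected for shared properties.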