AdmiralenOla / Scoary

Pan-genome wide association studies
GNU General Public License v3.0

GenomeIDs not matching custom tree and gene pres/abs #68

Closed: moorembioinfo closed this issue 5 years ago

moorembioinfo commented 5 years ago

Hi!

This is probably my issue but I've noticed a few things that are a bit confusing.

I'm running Scoary on 535 genomes and have made sure that all genomeIDs in my custom newick file match the genomeIDs in the gene presence/absence file, but Scoary reports that they don't match:

Reading custom tree file
CRITICAL: Traceback (most recent call last):
  File "/Users/matt/bin/miniconda3/lib/python3.6/site-packages/scoary/methods.py", line 246, in main
    sys.exit("CRITICAL: Please make sure that isolates in "
SystemExit: CRITICAL: Please make sure that isolates in your custom tree match those in your gene presence absence file.
CRITICAL: Please make sure that isolates in your custom tree match those in your gene presence absence file.

I checked that the IDs match in a number of ways, one of which was to run without a custom tree and have Scoary output its own newick file. I noticed that Scoary outputs the 'inference' column header as a leaf on the tree (so +1 leaves). After I deleted this column, Scoary ran with my custom tree with no issues, but for n = 534 genomes?

Thanks in advance for any help with this!

AdmiralenOla commented 5 years ago

Hello, and thanks for reporting!

It sounds like Scoary is having trouble determining the structure of your gene presence/absence file, specifically where in the file the actual strain-wise presence/absence data starts (as opposed to the summary data in the first columns). By default, Scoary assumes this data starts at column 15 (counting from 1), as per the traditional Roary standard. You can control this behavior using -s. For example, if your data starts at column 16, run with -s 16.
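To check which start column fits a given file, a small script along these lines may help. It is only a sketch and not part of Scoary: the helper names (isolate_columns, compare_ids) are hypothetical, the header is made-up sample data with placeholder summary column names, and the tree leaf names are assumed to have been extracted already. It compares the header fields from a chosen 1-based start column onward against the tree leaves, which would show an extra column such as 'Inference' being counted as an isolate.

```python
import csv
import io


def isolate_columns(header, start_col=15):
    """Return the header fields from the 1-based column start_col onward."""
    return header[start_col - 1:]


def compare_ids(header, tree_leaves, start_col=15):
    """Return (IDs only in the table, IDs only in the tree), each sorted."""
    cols = set(isolate_columns(header, start_col))
    leaves = set(tree_leaves)
    return sorted(cols - leaves), sorted(leaves - cols)


# Hypothetical header: 14 placeholder summary columns, one extra
# "Inference" column, then two isolate columns.
header_line = ",".join(
    [f"summary_{i}" for i in range(1, 15)] + ["Inference", "genomeA", "genomeB"]
)
header = next(csv.reader(io.StringIO(header_line)))

# Leaf names assumed to come from the custom newick tree.
tree_leaves = ["genomeA", "genomeB"]

# With the default start column, "Inference" is treated as an isolate.
only_in_table_15, only_in_tree_15 = compare_ids(header, tree_leaves, start_col=15)
print("start 15:", only_in_table_15, only_in_tree_15)

# Shifting the start by one column makes the two ID sets agree.
only_in_table_16, only_in_tree_16 = compare_ids(header, tree_leaves, start_col=16)
print("start 16:", only_in_table_16, only_in_tree_16)
```

For a real file, the header would be read from the first line of the gene presence/absence CSV, and the start column that leaves both lists empty is the value to pass to -s.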

I hope this solves your issue. If not, please get back to me.

AdmiralenOla commented 5 years ago

Closing since this does not appear to be a bug.