gtonkinhill / panaroo

An updated pipeline for pangenome investigation
MIT License

problem with expand_no #83

Open samlipworth opened 4 years ago

samlipworth commented 4 years ago

My understanding of expand_no in panaroo-gene-neighbourhood is that it should give N genes up and downstream of the target gene X. In my version of panaroo (from conda, version 1.2.3) this works for --expand_no 1 but not for --expand_no 2, where some results have many more than 2 genes up or downstream.
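The expected behaviour described above can be sketched roughly as follows. This is only an illustration on a plain ordered list of gene names, not Panaroo's actual implementation (which operates on the pangenome graph). The function name `expected_neighbourhood` and the gene names are hypothetical.

```python
def expected_neighbourhood(genes, target, expand_no):
    """Return the target gene plus up to `expand_no` genes on each side.

    Assumes `genes` is a simple ordered list for one contig; this is a
    simplification for illustration, not how Panaroo stores gene order.
    """
    i = genes.index(target)           # position of the target gene
    start = max(0, i - expand_no)     # clamp at the start of the contig
    return genes[start : i + expand_no + 1]


# Hypothetical gene order, for illustration only
contig = ["geneA", "geneB", "geneC", "geneX", "geneD", "geneE", "geneF"]
print(expected_neighbourhood(contig, "geneX", 2))
```

Under this reading, `--expand_no 2` should never return more than two genes on either side of the target.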

Do you have any idea why this might be?

gtonkinhill commented 4 years ago

Hi,

Yes, you're correct. This sounds like it might be a bug. I will try and take a look this week.

samlipworth commented 4 years ago

Thanks very much - I can provide a sample dataset for diagnostics if helpful.

samlipworth commented 4 years ago

The problem appears to happen on one side of the gene - so the script behaves as expected to the left of the gene but not the right:

e.g. group_4691,orf05,group_3988,yjcD,group_4950,aac6IIc~aac61bcr~~aacA4,blaOXA1,cat_1,group_5079,group_3814,group_1997,group_907,group_4665,group_2500,group_6706,group_7962,merB,group_8452,group_9561,group_6290,group_2197,group_7483,group_1416,group_323,group_9396,group_8617,group_9956,group_4839,sul2,strA~aph3Ib,group_4838,group_5062,group_8245,group_4951,group_2686,group_7436,yghA_1,group_10754,group_262,group_1432,group_2308,yghA~yghA_2~yghA_3,tnpA_2

where expand_no = 5 and gene = aac6IIc~aac61bcr~~~~aacA4
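One quick way to check a line like this is to count the genes on each side of the target. The sketch below assumes the output is a single comma-separated line of gene names and that the target name matches one entry exactly. The helper `flank_sizes` and the toy input are hypothetical, not part of Panaroo.

```python
def flank_sizes(line, target):
    """Return (genes_before, genes_after) for `target` in a comma-separated line.

    Raises ValueError if `target` is not present exactly as written.
    """
    genes = [g.strip() for g in line.split(",")]
    i = genes.index(target)
    return i, len(genes) - i - 1


# Toy line for illustration: 2 genes to the left, 4 to the right
toy_line = "geneA,geneB,geneX,geneC,geneD,geneE,geneF"
left, right = flank_sizes(toy_line, "geneX")
print(left, right)
```

If either count exceeds the value passed to `--expand_no`, that side of the neighbourhood is larger than expected.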

gtonkinhill commented 4 years ago

Hi,

Apologies, this took me a bit longer to get to than expected. I am having trouble reproducing the issue.

It would be great if you could provide the sample dataset that leads to the problem.