harry-thorpe / piggy

Pipeline for analysing intergenic regions in bacteria
GNU General Public License v3.0

exclusion of IGRs < 30 bp #18

maesaar closed 6 years ago

maesaar commented 7 years ago

Could you please explain the reason for excluding IGRs < 30 bp?

Please label this as a question!

harry-thorpe commented 7 years ago

I mainly did this because otherwise really short IGRs would cluster together by chance. Also, these short IGRs are likely to be spacers between very close genes, and are unlikely to carry regulatory elements.

Perhaps I should add an option to set min and max sizes?
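To illustrate the chance-clustering point, here is a rough sketch of how the probability of two unrelated sequences looking similar drops as length grows. It assumes uniformly random, independent bases and an ungapped, position-by-position comparison. The 90% identity cutoff and the function name `chance_match_probability` are illustrative assumptions, not Piggy's actual clustering parameters.

```python
from math import comb


def chance_match_probability(length: int, identity_percent: int = 90) -> float:
    """Probability that two random DNA sequences of the given length
    match at >= identity_percent of positions (ungapped comparison).

    Assumes each position matches independently with probability 1/4.
    The 90% default is a hypothetical cutoff chosen for illustration.
    """
    p_match = 0.25
    # Smallest number of matching positions that reaches the cutoff (integer ceiling).
    min_matches = -(-length * identity_percent // 100)
    # Upper tail of a Binomial(length, p_match) distribution.
    return sum(
        comb(length, k) * p_match**k * (1 - p_match) ** (length - k)
        for k in range(min_matches, length + 1)
    )


if __name__ == "__main__":
    for length in (10, 20, 30, 50):
        print(length, chance_match_probability(length))
```

Under these simplified assumptions the probability falls steeply with length, which is consistent with the reasoning above: very short IGRs are far more likely to look alike by chance than longer ones.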

maesaar commented 7 years ago

Thanks for the explanation. Min and max sizes could be useful.

maesaar commented 7 years ago

One more thing: did you test the minimum sample size that gives reasonable results with Piggy? I tested sample sets with different numbers of isolates (n=30, n=50, n=200, n=400, n=600). The sets with many isolates gave results similar to those in the publication, as expected, but with a small number of samples (n=30) the results differed a lot. Can Piggy's results be influenced by a small sample size?

harry-thorpe commented 7 years ago

Yes, the results will vary with sample size. How did they differ with the small sample?

harry-thorpe commented 6 years ago

I have now added an option --size|-s which enables you to choose the min and max IGR sizes.
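For readers who want to apply the same kind of length filter to their own IGR sequences outside Piggy, a minimal sketch follows. It is not Piggy's implementation, and the thread does not show the argument format that `--size` expects. The function name `filter_igrs_by_size`, the dictionary input, and the inclusive bounds are all assumptions. The default minimum of 30 bp mirrors the cutoff discussed above.

```python
from typing import Dict, Optional


def filter_igrs_by_size(
    igrs: Dict[str, str],
    min_len: int = 30,
    max_len: Optional[int] = None,
) -> Dict[str, str]:
    """Keep IGRs whose length lies within [min_len, max_len].

    igrs maps a hypothetical IGR identifier to its nucleotide sequence.
    Both bounds are treated as inclusive (an assumption); max_len=None
    means no upper limit.
    """
    kept = {}
    for igr_id, seq in igrs.items():
        too_short = len(seq) < min_len
        too_long = max_len is not None and len(seq) > max_len
        if not (too_short or too_long):
            kept[igr_id] = seq
    return kept


if __name__ == "__main__":
    example = {"igr_a": "A" * 10, "igr_b": "A" * 30, "igr_c": "A" * 500}
    print(list(filter_igrs_by_size(example, min_len=30, max_len=100)))
```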