Closed ramiroricardo closed 3 years ago
Hi ramiroricardo, Ah very good question. I can't remember if this is described in the publication but I don't think so. I'll add this to the readme later today!
Source : first gene's name
Target: second gene's name
p: p-value (corrected if correction applied) to the interaction
Avg synthetic distance: a blank column of 0s; this is an artifact of when I was trying to use information in the roary output to calculate the average distance between the 2 genes. I will remove this in my next commit.
successes: # of times the 2 genes are found together (or apart if using -d)
observations: # of genomes in which one or both genes are found
rate: observed rate of association/dissociation of the genes
expected: expected rate of association/dissociation of the genes
total source: total # of genomes the first gene is observed in
total target: ditto for the second gene
fraction source: total source / total # of genomes in the dataset
fraction target: ditto for total target
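To make the count columns concrete, here is a minimal Python sketch that recomputes them for one gene pair from toy presence data. It is not the tool's own code: the function name `pair_columns`, the genome labels, and the reading of "apart" as "exactly one of the two genes is present" are assumptions made for illustration.

```python
def pair_columns(source_genomes, target_genomes, n_genomes, dissociate=False):
    """Toy recomputation of the count columns for one gene pair.

    source_genomes / target_genomes: genomes in which each gene was observed.
    n_genomes: total number of genomes in the dataset.
    dissociate: mimics the -d mode (assumption: "apart" means exactly one
    of the two genes is present in a genome).
    """
    source = set(source_genomes)
    target = set(target_genomes)
    together = source & target   # genomes carrying both genes
    apart = source ^ target      # genomes carrying exactly one of them
    return {
        "successes": len(apart if dissociate else together),
        "observations": len(source | target),  # one or both genes present
        "total_source": len(source),
        "total_target": len(target),
        "fraction_source": len(source) / n_genomes,
        "fraction_target": len(target) / n_genomes,
    }


# Hypothetical dataset of 8 genomes.
cols = pair_columns({"g1", "g2", "g3", "g4"}, {"g3", "g4", "g5"}, n_genomes=8)
print(cols)
```

With this toy input the two genes co-occur in 2 genomes out of the 5 in which at least one of them appears.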
Let me know if any of the above isn't clear. I'll leave this issue open until I update the readme with this information and make the appropriate commits.
Thanks for your question! --Fiona
Hi Fiona,
Thanks a lot for the quick reply.
All clear to me, except rate. I understand it is the observed rate, but from looking at the paper I would expect this to be a count, though all the values I have range from 0 to 1. This is the case for both dissociation and association in a test dataset with ~600 genomes. Is rate standardized in some way?
Thanks
Hi ramiroricardo,
You're correct, I apologize. What is currently output as rate is (using the notation from the manuscript) EA(ij) = Pi * Pj. There was no real reason for this; I'll commit code now to fix this to match the manuscript so as not to cause confusion.
Apologies, I misspoke in my initial response:
rate: Pi*Pj
expected: Pi*Pj*N = EA(ij)
successes: Nij = OA(ij)
Outputting rate isn't really necessary or helpful to the user and is confusing, so I'll remove it now.
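For anyone checking existing output by hand, the corrected definitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Pi and Pj are taken to be the fraction source and fraction target columns, N is taken to be the total number of genomes, and the function name and input numbers are made up.

```python
def expected_columns(total_source, total_target, n_genomes):
    """Return (rate, expected) for one gene pair.

    Assumes Pi = total_source / N and Pj = total_target / N,
    where N is the total number of genomes in the dataset.
    """
    p_i = total_source / n_genomes
    p_j = total_target / n_genomes
    rate = p_i * p_j             # Pi*Pj, always between 0 and 1
    expected = rate * n_genomes  # Pi*Pj*N = EA(ij), an expected count
    return rate, expected


# Hypothetical pair in a 600-genome dataset.
rate, expected = expected_columns(total_source=300, total_target=150, n_genomes=600)
print(rate, expected)
```

This shows why rate stays between 0 and 1 while expected is on the same scale as successes (Nij), which is the count to compare it against.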
Hi Fiona,
Thanks a lot for the help. This makes things clearer.
Hi Fiona,
Thanks for developing such a cool tool.
A simple question: where can I find the exact meaning of each of the columns in the _pairs.tsv files? Thanks