@Zhuxitong
You can still perform the pairwise comparison between B and C and get all the syntenic genes in the .anchors file, but it is a little difficult to add them to the .blocks file, since it is tricky to order the pairs without a common column. You can manually build your .blocks file by combining A-B-C and B-C in the same file.
a1 b1 c1
a2 b2 c2
. b3 c3
If all you want is to plot the synteny based on the .blocks file, you don't have to worry about the sorting of the rows. You can just combine the pairwise .anchors files that way, without missing any of the signal that you are afraid of losing.
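The manual combination described above can be sketched in a few lines of Python. This is only an illustration, not part of jcvi: the function name `build_rows` is hypothetical, the anchors are given as in-memory lists of gene pairs rather than read from .anchors files, and each gene is assumed to have at most one partner per species.

```python
def build_rows(ab, ac, bc):
    """Combine pairwise anchors into three-column rows (A, B, C).

    ab, ac, bc are lists of (gene, gene) pairs; "." marks a missing gene.
    Assumes one-to-one anchors (an assumption of this sketch).
    """
    a_to_b = dict(ab)
    a_to_c = dict(ac)
    rows = []
    seen_b = set()
    # Rows anchored on reference A, in order of first appearance.
    a_genes = dict.fromkeys([a for a, _ in ab] + [a for a, _ in ac])
    for a in a_genes:
        b = a_to_b.get(a, ".")
        c = a_to_c.get(a, ".")
        rows.append((a, b, c))
        seen_b.add(b)
    # B-C pairs whose B gene has no A partner get "." in the A column.
    for b, c in bc:
        if b not in seen_b:
            rows.append((".", b, c))
    return rows


ab = [("a1", "b1"), ("a2", "b2")]
ac = [("a1", "c1"), ("a2", "c2")]
bc = [("b1", "c1"), ("b2", "c2"), ("b3", "c3")]
rows = build_rows(ab, ac, bc)
for row in rows:
    print("\t".join(row))
```

With the toy input, the rows come out as in the three-line example above, with the extra B-C pair appended at the end.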
The cited text describes the same underlying method used in the jcvi package: single-linkage clustering, with a maximum of N genes apart and a minimum chain length. jcvi, MCScanX, and DAGchainer belong to the same family of methods and differ only in their specific criteria for calling a block. However, all blocks are intrinsically "pairwise", and it takes extra effort to combine them into multi-species blocks.
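To make the idea of single-linkage clustering with a gap limit and a minimum chain length concrete, here is a simplified sketch. It is not the implementation used by jcvi, MCScanX, or DAGchainer: the function `cluster_anchors` and its parameters `max_gap` and `min_size` are hypothetical names, genes are represented by integer ranks along one chromosome pair, and orientation and scoring are ignored.

```python
def cluster_anchors(pairs, max_gap=20, min_size=4):
    """Single-linkage clustering of anchor pairs.

    pairs: list of (rank_in_genome_1, rank_in_genome_2) for one chromosome pair.
    Two anchors are linked if they are within max_gap genes in both genomes.
    Clusters with fewer than min_size anchors are discarded.
    """
    parent = list(range(len(pairs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xa, xb) in enumerate(pairs):
        for j in range(i + 1, len(pairs)):
            ya, yb = pairs[j]
            if abs(xa - ya) <= max_gap and abs(xb - yb) <= max_gap:
                parent[find(i)] = find(j)

    clusters = {}
    for i, pair in enumerate(pairs):
        clusters.setdefault(find(i), []).append(pair)
    return [sorted(c) for c in clusters.values() if len(c) >= min_size]


pairs = [(1, 1), (2, 2), (3, 3), (4, 5), (100, 200), (101, 201)]
blocks = cluster_anchors(pairs, max_gap=2, min_size=3)
print(blocks)
```

The all-against-all loop is quadratic and only suitable for small inputs; it is written this way to keep the linking rule easy to read.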
Hi haibao,
Thank you very much for your timely reply. I may not implement the single-linkage clustering method, as it is still difficult for me. I think the approach you mentioned, adding B-C to the .blocks file manually, will work. But for my 8 species this may take more time. So I am wondering whether I could select a different reference each time (A in A-B and A-C, then B in B-A and B-C) and join the anchors for each reference. Finally, I would merge all the anchors, sort them, and remove redundancy, like so:
For reference A:
a1 b1 c1
a2 b2 c2
For reference B:
b1 a1 c1
b2 a2 c2
b3 c3 .
For reference C:
c1 a1 b1
c2 a2 b2
c3 b3 .
c4 . .
Then merge them:
a1 b1 c1
a2 b2 c2
b1 a1 c1
b2 a2 c2
b3 c3 .
c1 a1 b1
c2 a2 b2
c3 b3 .
c4 . .
Then remove the redundant rows:
a1 b1 c1
a2 b2 c2
b3 c3 .
c4 . .
I guess this way I could get all collinear orthologous genes, while losing the block information. It also seems that this won't affect the synteny plot.
If there are any errors, I would greatly appreciate it if you could point them out.
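The merge-and-deduplicate step proposed above could look roughly like the following sketch. It is an illustration under stated assumptions, not a jcvi feature: `merge_tables` is a hypothetical helper, each per-reference table carries explicit column labels so that its rows can be rearranged into one fixed species order, and "." marks a missing gene.

```python
def merge_tables(tables, order=("A", "B", "C")):
    """Merge per-reference tables and drop duplicate rows.

    tables: list of (column_labels, rows), e.g. (("B", "A", "C"), [...]).
    Every row is rearranged into `order` before comparison, so the same
    gene set found from different references collapses into one row.
    """
    merged = {}
    for labels, rows in tables:
        for row in rows:
            by_species = dict(zip(labels, row))
            key = tuple(by_species.get(s, ".") for s in order)
            merged.setdefault(key, None)  # dict preserves first-seen order
    return list(merged)


tables = [
    (("A", "B", "C"), [("a1", "b1", "c1"), ("a2", "b2", "c2")]),
    (("B", "A", "C"), [("b1", "a1", "c1"), ("b2", "a2", "c2"), ("b3", ".", "c3")]),
    (("C", "A", "B"), [("c1", "a1", "b1"), ("c2", "a2", "b2"),
                       ("c3", ".", "b3"), ("c4", ".", ".")]),
]
merged = merge_tables(tables)
for row in merged:
    print("\t".join(row))
```

Rearranging the columns before deduplicating is a design choice of this sketch: it keeps every gene in its own species column, which the plain row-wise merge does not guarantee. Rows that agree only partially (for example when two references disagree on a partner) are kept as separate rows here.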
Hello, haibao
I am now working on 8 species that are very closely related to each other. I know that I could first do pairwise comparisons and then select one reference (such as species A in A-B and A-C) to combine all collinear genes with the help of the jcvi.formats.base join command. However, the result depends heavily on the reference selected, and collinear genes that are absent from A but present in B and C will be missing from the final result.
So I am wondering whether there is a way to combine all collinear genes based on the pairwise comparisons. A paper titled "Genomes of 13 domesticated and wild rice relatives highlight genetic conservation, turnover and innovation across the genus Oryza." describes one approach in its methods:
Pairwise comparison can be done with MCScanX in place of DAGchainer, but I am not sure what the last sentence means. Since your documentation also mentions single-linkage clustering, I am wondering if you have any ideas?
Any suggestions would be much appreciated, as this has been troubling me a lot.