tanghaibao / jcvi

Python library to facilitate genome assembly, annotation, and comparative genomics
BSD 2-Clause "Simplified" License
745 stars 187 forks source link

Karyotype plot with my own orthologues? #224

Closed pgonzale60 closed 4 years ago

pgonzale60 commented 4 years ago

Dear Haibao,

Thanks for this amazing software!

I wonder if I could use the result of orthofinder 2 with the annotation bedfiles together with MCscan. I'm studying nematode genomes and they diverge very rapidly. I would like to use the result of orthofinder because it identified 5455 single copy orthologs between the species I'm interested in assessing. In contrast, LAST yielded only 353 filtered anchors when using the equivalent input. Last gave me 1145 filtered anchors once I limited my gene space to that of the orthologs identified by orthofinder.

More generally the question is whether I can provide MCscan with a file indicating which gene in species A corresponds to which genes in species B and make a karyotype plot with this and the respective coordinates.

Best regards, Pablo

tanghaibao commented 4 years ago

@pgonzale60

If I understand correctly, you want to use your own BLAST input. Can you try to make a fake BLAST file with 12 columns? I think the important fields are in the first two columns, namely, query and subject. Then you can directly run the clustering algorithm to get the .anchors file.

$ python -m jcvi.compara.synteny scan A.B.blast A.B.anchors

Once you have the .anchors file, proceed with the karyotype plotting as detailed in the tutorial.

I have not tried this strategy myself, so there is a chance that it may not work.

The other thing that you can try is to run BLASTP or DIAMOND using protein sequences instead. OrthoFinder is more sensitive here since it uses protein sequences (with DIAMOND, I think).

Haibao

pgonzale60 commented 4 years ago

Thank you very much for the guidance. It increased the number of anchor to 1909. However, I'm putting in 5381 ortholog pairs. Is there a way I can increase the numbers of anchors up to 5381? Why is it that not all "BLAST" results were converted into anchors?

Thanks in advance, Pablo

tanghaibao commented 4 years ago

@pgonzale60

The intended use case for karyotype plotter is to plot syntenic blocks, not single BLAST pairs. However, if you are interested in viewing more data, decrease the minimum block size to 1 so that would be close to viewing everything.

$ python -m jcvi.compara.synteny scan A.B.blast A.B.anchors --min_size=1

This means blocks with one anchor would still be generated in the .anchors file.

Haibao