SunPengChuan / wgdi

WGDI: A user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes
https://wgdi.readthedocs.io/en/latest/
BSD 2-Clause "Simplified" License

Error in creating dotplot, ValueError: cannot reindex on an axis with duplicate labels #41

Closed salmanento closed 8 months ago

salmanento commented 9 months ago

I am trying to create a dotplot from two genomes to detect whole-genome duplication, but I get an error (ValueError: cannot reindex on an axis with duplicate labels); a screenshot of the error is attached. I have also attached screenshots of the GFF1, GFF2, lens1, lens2, and BLAST files I used. Please take a look and advise on what is causing the issue so that I can proceed with the analysis.

Below is the total.conf script:

[dotplot]
blast = /home/lilin/wgd-exp/dotplot/updated/blastp_output.blast
gff1 = /home/lilin/wgd-exp/dotplot/updated/X.ripnewide_gene.gff
gff2 = /home/lilin/wgd-exp/dotplot/updated/S. gregaria_updatedgene.gff
lens1 = /home/lilin/wgd-exp/dotplot/updated/lens1_X.riparia.lens
lens2 = /home/lilin/wgd-exp/dotplot/updated/lens2_S.gregaria.lens
genome1_name = Xya_riparia
genome2_name = Schistocerca_gregaria
multiple = 1
score = 100
evalue = 1e-5
repeat_number = 10
position = order
blast_reverse = false
ancestor_left = none
ancestor_top = none
markersize = 0.5
figsize = 10,10
savefig = /home/lilin/wgd-exp/dotplot/updated/dotplot.png
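Editor's note: before running the dotplot step, it can help to confirm that every input path in the [dotplot] section points to a real file. The sketch below is only an illustration using Python's standard configparser module; it is not WGDI's own config loader, and the helper name missing_inputs and the PATH_KEYS tuple are hypothetical.

```python
import configparser
from pathlib import Path

# Keys in the [dotplot] section that are expected to name input files.
# (Hypothetical helper constant, taken from the config shown above.)
PATH_KEYS = ("blast", "gff1", "gff2", "lens1", "lens2")


def missing_inputs(conf_text, section="dotplot"):
    """Return {key: value} for input paths that are empty or not existing files."""
    # interpolation=None so that a literal '%' in a path is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    problems = {}
    for key in PATH_KEYS:
        value = parser.get(section, key, fallback="").strip()
        if not value or not Path(value).is_file():
            problems[key] = value
    return problems
```

Calling missing_inputs(Path("total.conf").read_text()) would return an empty dict when all five input files are found, and otherwise the keys whose paths need attention.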

Attachments: Error GFF1 GFF2 lens1 lens2 BLAST

SunPengChuan commented 9 months ago

There seems to be a problem in how the gff files are handled: the gene IDs in the first two columns of the blast file need to match those in the second column of the gff files.
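Editor's note: the mismatch described above can be checked before running WGDI. The sketch below is not part of WGDI; it assumes tab-separated files with the gene ID in the second column of each gff, BLAST query IDs (column 1) belonging to gff1, and subject IDs (column 2) belonging to gff2, as would be the case with blast_reverse = false. The function names are hypothetical. It reports gene IDs that occur more than once in a gff (a plausible source of the "duplicate labels" error, since that message comes from pandas when an index is not unique) and BLAST IDs that are absent from the corresponding gff.

```python
from collections import Counter


def column(lines, index):
    """Collect one zero-based column from tab-separated lines, skipping blanks."""
    values = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > index and fields[index]:
            values.append(fields[index])
    return values


def find_id_problems(blast_lines, gff1_lines, gff2_lines):
    """Return (duplicates, missing) for the gene IDs shared by BLAST and gff files."""
    blast_rows = list(blast_lines)  # materialize: the rows are read twice
    gff1_ids = column(gff1_lines, 1)
    gff2_ids = column(gff2_lines, 1)

    # Gene IDs that appear on more than one gff line.
    duplicates = {
        "gff1": sorted(g for g, n in Counter(gff1_ids).items() if n > 1),
        "gff2": sorted(g for g, n in Counter(gff2_ids).items() if n > 1),
    }
    # BLAST IDs with no matching entry in the corresponding gff.
    missing = {
        "query": sorted(set(column(blast_rows, 0)) - set(gff1_ids)),
        "subject": sorted(set(column(blast_rows, 1)) - set(gff2_ids)),
    }
    return duplicates, missing
```

Open file handles can be passed directly as the three arguments. If either result contains entries, the gff or blast file likely needs to be regenerated so that the IDs are unique and consistent.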

salmanento commented 9 months ago

Thank you, that resolved it, and the dotplot was created. Could you check whether it looks OK? Could you also help me interpret the dotplot results?

Attachment: savefile

SunPengChuan commented 9 months ago

Your results show no significant collinearity between the two species. Could their divergence be too great? Consider using closer species to check if it’s a data processing issue.

salmanento commented 9 months ago

What about these results now?

Attachments: snt_sgr dotplot ksblock kspeak peakfit align ksfigure

SunPengChuan commented 9 months ago

The second figure doesn’t show clear synteny blocks, and that is leading to incorrect results in your subsequent analyses.

salmanento commented 9 months ago

I have now used two closely related species from the same genus. Please take a look at the results.

Attachments: dotplot blockKs kspeaks peaksfit Ksfigure align