Plant-Food-Research-Open / assemblyqc

A Nextflow pipeline for evaluating assembly quality
https://plant-food-research-open.github.io/assemblyqc/
MIT License

Create a line plot for synteny #74

Closed GallVp closed 3 months ago

GallVp commented 1 year ago

From Ignacio Carvajal

It is apparently possible to run D-GENIES locally on Linux. I have not played around with it much, because the nice thing about it is the interactivity of the results. At some point I'll experiment with running it on powerplant and opening the results on powerplant from a browser. It would be nice to skip the step of downloading to the local machine, which can be a pain, especially for larger genomes. There are other dotplot visualization packages out there if you run into trouble with D-GENIES.

I think this is what you mean by filtering of the matches based on identity or length. See here for an example: https://dgenies.toulouse.inra.fr/result/chimp_vs_human
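For a static plot, the same kind of filtering could be applied to the PAF file before plotting. The following is only a rough sketch, not how D-GENIES does it: filter_paf is a hypothetical helper name, and it assumes the standard PAF columns, where column 10 is the number of matching bases and column 11 is the alignment block length, so identity is approximated as column 10 divided by column 11.

```shell
# Hypothetical helper: keep PAF records whose alignment block is at least
# min_len bases long and whose approximate identity (col 10 / col 11)
# is at least min_id. Reads PAF on stdin, writes the kept records to stdout.
filter_paf() {
    min_id="$1"    # e.g. 0.9 for 90% identity
    min_len="$2"   # e.g. 10000 for 10 kb
    awk -F'\t' -v min_id="$min_id" -v min_len="$min_len" \
        '$11 > 0 && $11 >= min_len && ($10 / $11) >= min_id'
}

# Example usage (file names are placeholders):
#   filter_paf 0.9 10000 < AN_WA23_asm10.paf > AN_WA23_asm10.filtered.paf
```

The thresholds are arbitrary starting points and would need tuning per genome pair.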

Generating the PAF file that is required as an input is as easy as: 'minimap2 -t 10 -cx asm10 ${ANITRA_ASSEMBLY} ${WA_ASSEMBLYA} > AN_WA23_asm10.paf'

This compares two raspberry assemblies using 10 threads and takes 150 seconds to run. The other inputs for D-GENIES are the two assemblies themselves (${ANITRA_ASSEMBLY} and ${WA_ASSEMBLYA} in this case).

One important thing that I have not tested is the minimap2 parameters. I've been using asm10, which they say is for "intra-species asm-to-asm alignments where the expected divergence is 1%". You can also specify asm5 and asm20, which the minimap2 help text lists for roughly 0.1% and 5% sequence divergence, respectively. 1% seems to work well in my cases, but it may not be one size fits all.
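One way to compare the presets would be to run each one and summarise the resulting PAF files. This is an untested sketch: paf_summary is a hypothetical helper, and it again assumes the standard PAF columns (column 10 is matching bases, column 11 is alignment block length). It prints the number of records, the total aligned block length, and the length-weighted identity.

```shell
# Hypothetical helper: summarise a PAF file (or stdin) as three
# tab-separated values: record count, total alignment block length,
# and weighted identity (sum of col 10 / sum of col 11).
paf_summary() {
    awk -F'\t' '
        { n++; matches += $10; block += $11 }
        END {
            if (block > 0)
                printf "%d\t%d\t%.4f\n", n, block, matches / block
            else
                printf "%d\t0\tNA\n", n
        }' "$@"
}

# Example usage, reusing the variables from the command above:
#   for preset in asm5 asm10 asm20; do
#       minimap2 -t 10 -cx "$preset" "$ANITRA_ASSEMBLY" "$WA_ASSEMBLYA" \
#           > "AN_WA23_${preset}.paf"
#       printf '%s\t' "$preset"; paf_summary "AN_WA23_${preset}.paf"
#   done
```

A preset that is too strict for the pair should show up as a clearly lower total aligned length, though these numbers alone do not say which preset is the right one.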

Please keep me in the loop for your ideas and if you need help/testing. Keen to contribute.