jtlovell / GENESPACE

Removing unwanted chromosomes from riparian plots #89

Closed edexter closed 1 year ago

edexter commented 1 year ago

Is it possible to remove specific contigs from the riparian plots? I know that I can specify which contigs are plotted with respect to the main reference, but I would like to remove a few specific contigs that contain assembly errors from some of the other genomes. Much appreciated.

jtlovell commented 1 year ago

Hi there, there are two situations in which a user may want to remove chromosomes: (1) un-anchored contigs that are too small and muddy the plot, and (2) misassembled chromosomes (e.g. un-purged duplicates, joined bottom-drawer un-anchored contigs).

In the first case, you can drop them using plot_riparian(..., minChrLen2plot = somethingBiggerThanYourLargestTooSmallScaffold, ...).

In the second case, there is no built-in way to do this. This is not because it can't be done, but because dropping chromosomes can make the results misleading. We want to make sure that the riparian plot represents the genomic relationships accurately.

Furthermore, in general, if your problematic chromosomes represent sequence that you don't want in your genome, then you should start over and only give OrthoFinder the sequences that you want to include. For example, let's say you have some contigs in your primary assembly that are actually meiotic homologs of other primary sequences (and thus should be in the alternative haplotype). While most of your genome is single-copy, these regions will be multi-copy and could cause OrthoFinder to infer HOGs (hierarchical orthogroups) that are not accurate relative to the biology of the system.

Hope this helps, John

edexter commented 1 year ago

Hi John,

Thanks for the reply. The situation in my case is a few haplotigs that weren't purged from one of the genome assemblies that I'm looking at. I see your point about correcting the issue further upstream in the pipeline rather than at the plotting stage, so that's the approach I'll take. Best wishes, -Eric
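
For anyone landing on this thread later, the upstream fix discussed above (removing the unwanted contigs from the inputs before rerunning, rather than hiding them at the plotting stage) might look roughly like the sketch below. It is not part of GENESPACE or OrthoFinder. It assumes each genome's annotation is a four-column, tab-separated BED file (chromosome, start, end, gene ID) with a peptide FASTA whose headers begin with the same gene IDs. The function names, the contig name ctg0042, and the gene IDs are all hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def filter_bed(bed_lines: Iterable[str], drop_chroms: Set[str]) -> Tuple[List[str], Set[str]]:
    """Drop BED records on unwanted contigs.

    Assumes tab-separated columns: chromosome, start, end, gene ID.
    Returns the kept lines and the set of gene IDs they contain.
    """
    kept_lines: List[str] = []
    kept_ids: Set[str] = set()
    for line in bed_lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        chrom, gene_id = fields[0], fields[3]
        if chrom in drop_chroms:
            continue  # gene sits on a contig we want to exclude
        kept_lines.append(line)
        kept_ids.add(gene_id)
    return kept_lines, kept_ids


def filter_fasta(fasta_lines: Iterable[str], keep_ids: Set[str]) -> List[str]:
    """Keep only FASTA records whose ID (first word of the header) is in keep_ids."""
    out: List[str] = []
    keep = False
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            header = line[1:].split()
            record_id = header[0] if header else ""
            keep = record_id in keep_ids
        if keep and line:
            out.append(line)
    return out


# Toy data: ctg0042 stands in for an un-purged haplotig (hypothetical name).
bed = [
    "Chr01\t100\t900\tgeneA",
    "Chr01\t1500\t2400\tgeneB",
    "ctg0042\t50\t700\tgeneC",
]
fasta = [">geneA", "MKT", ">geneB", "MSS", "LLV", ">geneC", "MAA"]

kept_bed, kept_ids = filter_bed(bed, {"ctg0042"})
kept_fasta = filter_fasta(fasta, kept_ids)
```

Filtering the BED first and then subsetting the peptides by the surviving gene IDs keeps the two files consistent with each other. The filtered files would then be used for a fresh run, so that the orthology inference never sees the excluded sequences.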