tpoorten / dotPlotly

Generate an interactive dot plot from mummer or minimap alignments
MIT License

Input File Requirements #1

Closed tramaraj closed 6 years ago

tramaraj commented 6 years ago

Dear Author/Users:

I am trying to use mummerCoordsDotPlotly.R on a NUCMER .coords output and am running into some issues.

Here are a few lines of the .coords file I have:

`line1.genome.assembly.V1.fasta line2.genome.assembly.V1.fasta NUCMER

[S1]     [E1]  |     [S2]     [E2]  |  [LEN 1]  [LEN 2]  |  [% IDY]  | [TAGS]

=====================================================================================
1 1736 | 4279333 4277615 | 1736 1719 | 94.85 | PGA_scaffold0825_contigslength_139475317 PGA_scaffold19_contigslength_60359544
427 718 | 17455626 17455914 | 292 289 | 86.99 | PGA_scaffold0825_contigslength_139475317 PGA_scaffold020_contigslength_64265851
2908 3584 | 4276560 4275896 | 677 665 | 86.58 | PGA_scaffold0825_contigslength_139475317 PGA_scaffold19_contigslength_60359544
4385 4779 | 4275187 4274803 | 395 385 | 89.20 | PGA_scaffold0825_contigslength_139475317 PGA_scaffold19_contigslength_60359544
12093 12585 | 4270189 4269690 | 493 500 | 85.69 | PGA_scaffold0825_contigslength_139475317 PGA_scaffold19_contigslength_60359544
13788 15045 | 4268199 4266924 | 1258 1276 | 86.48 | PGA_scaffold0825_contigslength_139475317 PGA_scaffold19_contigslength_60359544`

Here are a few lines from the example on GitHub:

`/home/tpoorten/Learning/mummer/Brassica_rapa.faa /home/tpoorten/Learning/mummer/Brassica_napus_rape.faa NUCMER

[S1]     [E1]  |     [S2]     [E2]  |  [LEN 1]  [LEN 2]  |  [% IDY]  |  [COV R]  [COV Q]  | [TAGS]

==========================================================================================================
54478 55382 | 2422751 2423655 | 905 905 | 98.01 | 0.00 0.00 | lcl|A01 lcl|A01
55516 55699 | 2423789 2423972 | 184 184 | 100.00 | 0.00 0.00 | lcl|A01 lcl|A01
59409 59859 | 2427581 2428031 | 451 451 | 99.78 | 0.00 0.00 | lcl|A01 lcl|A01
60376 60495 | 2428548 2428667 | 120 120 | 100.00 | 0.00 0.00 | lcl|A01 lcl|A01
61121 63544 | 2429293 2431716 | 2424 2424 | 92.95 | 0.01 0.01 | lcl|A01 lcl|A01
63632 63905 | 2431804 2432077 | 274 274 | 90.88 | 0.00 0.00 | lcl|A01 lcl|A01
65026 65184 | 2433198 2433356 | 159 159 | 100.00 | 0.00 0.00 | lcl|A01 lcl|A01`

A couple of things to note:

  1. My .coords file is missing the following columns:

[COV R] [COV Q]

Will that be an issue?

  2. In the example, the sequence names seem to be the same for the ref and the query. In my case, the two genomes I am comparing have different names. Is it a requirement that the names be similar?

Any help/comment/suggestions would be greatly appreciated.

Thanks!

tpoorten commented 6 years ago

I believe this is a parsing issue stemming from my mummer pipeline, where I added the -c parameter to the show-coords command. If you re-generate the coords file with show-coords -c, this should fix the parsing issue in my R script.
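Before re-running the R script, it can help to confirm that a given .coords file actually has the two coverage columns that show-coords -c adds. The snippet below is only a minimal sketch in Python: the function names are hypothetical, and it does a plain string check on the header line rather than reproducing how mummerCoordsDotPlotly.R parses the file.

```python
def find_header(lines):
    """Return the first line that looks like a show-coords column header, or None."""
    for line in lines:
        if line.lstrip().startswith("[S1]"):
            return line
    return None


def has_coverage_columns(header_line):
    """True if the header lists both [COV R] and [COV Q]."""
    return "[COV R]" in header_line and "[COV Q]" in header_line


def coords_has_coverage(path):
    """Check a .coords file on disk (path is whatever file you plan to plot)."""
    with open(path) as handle:
        header = find_header(handle)
    if header is None:
        raise ValueError("no [S1] header line found in " + path)
    return has_coverage_columns(header)
```

If `coords_has_coverage` returns False, the file was probably written without -c and needs to be regenerated from the .delta file as described above.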

tramaraj commented 6 years ago

YES! When I run NUCMER and ask for a .coords file, for some reason the [COV R] and [COV Q] columns are missing. But I took the .delta file and used the show-coords program to generate a .coords file, and it has the additional columns, so dotPlotly is happy with it. I am able to get results from the dotPlotly R scripts. But every time I run it I get this error message:

Warning: Ignoring unknown aesthetics: text
Error in htmlwidgets::saveWidget(as.widget(gply), file = paste0(opt$output_filename, :
  Saving a widget with selfcontained = TRUE requires pandoc. For details see:
https://github.com/rstudio/rmarkdown/blob/master/PANDOC.md
No traceback available

Looks like it is not affecting the results... but thought I would ask FWIW!

Thanks!

tpoorten commented 6 years ago

Looks like an install issue with htmlwidgets and/or pandoc. Alternatively, you can turn off the interactive plot generation by adding -x when running my script.
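One quick way to tell whether pandoc is the missing piece is to check if a pandoc executable is on the PATH. This is a rough sketch, and the helper name is made up for illustration. It only looks at the PATH of the shell it runs in, so it can miss a pandoc that R finds some other way (for example, one bundled with RStudio).

```python
import shutil


def pandoc_available(which=shutil.which):
    """True if a 'pandoc' executable can be found on the current PATH.

    The 'which' argument exists only so the lookup can be swapped out;
    by default it is the standard-library shutil.which.
    """
    return which("pandoc") is not None


if __name__ == "__main__":
    if pandoc_available():
        print("pandoc found on PATH")
    else:
        print("pandoc not found on PATH; install it or run the script with -x")
```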