tpoorten / dotPlotly

Generate an interactive dot plot from mummer or minimap alignments
MIT License

Fix for issue #18 (add .svg and .pdf support) #23

Open lonelyjoeparker opened 7 months ago

lonelyjoeparker commented 7 months ago

…dotPlotly/issues/18)

The issue, raised by @sven-winter and @jwindler, suggested altering the ggsave() argument to .pdf.
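For context, ggplot2's ggsave() infers the output device from the filename extension, so switching the extension is enough to get a different format. The snippet below is a minimal sketch of that idea, not the code in this PR: the toy data frame, the object name gp and the temporary output path are all assumptions made for illustration. Saving as .svg through ggsave() additionally needs the svglite package, so only the .pdf case is shown.

```r
library(ggplot2)

# Toy plot standing in for the dot plot (illustrative data, not from the PR).
gp <- ggplot(data.frame(x = 1:3, y = 1:3), aes(x = x, y = y)) +
  geom_point()

# The ".pdf" extension makes ggsave() pick the PDF device.
# A temporary directory is used so the working directory stays clean.
pdf_path <- file.path(tempdir(), "out.pdf")
ggsave(filename = pdf_path, plot = gp, width = 15, height = 15, units = "in")
```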

I implemented this and added two CLI options to the parser: --pdf-plot-on and --svg-plot-on. Both default to FALSE, to be consistent with existing behaviour. I chose long-form arguments because the obvious single-letter forms (-f, -v, -g, -p) were either already taken or ambiguous in meaning.
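As a rough illustration of how two long-form boolean flags like these can be declared, here is a sketch using the optparse package. It assumes the script's parser is optparse-based (the help output further down is in that style); the dest names pdf_plot_on and svg_plot_on are hypothetical and may not match the names used in the actual diff.

```r
library(optparse)

# Hypothetical reconstruction of the two new options; the dest names are
# illustrative and may differ from those used in the PR.
option_list <- list(
  make_option("--pdf-plot-on", action = "store_true", default = FALSE,
              dest = "pdf_plot_on",
              help = "turn on production of .PDF format plot [default %default]"),
  make_option("--svg-plot-on", action = "store_true", default = FALSE,
              dest = "svg_plot_on",
              help = "turn on production of .SVG format plot [default %default]")
)

parser <- OptionParser(option_list = option_list)

# Passing args explicitly lets the parser be exercised without a shell.
opts_default <- parse_args(parser, args = character(0))
opts_pdf     <- parse_args(parser, args = c("--pdf-plot-on"))
```

With store_true and a FALSE default, omitting a flag leaves existing behaviour unchanged, and each flag can be switched on independently of the other.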

This is consistent with the existing README.md, but I also added a simpler example (MPA01-vs-PA01.minimap2.paf) at https://github.com/tpoorten/dotPlotly/compare/master...lonelyjoeparker:dotPlotly-issue-18:master#diff-cbccb4caeac8ae21fecaaabaab0b3548186ceeb961b37d23a167beeaf7d2dfb4 that executes more quickly, to facilitate testing, and a .gitignore file to exclude the various outputs from the repo (again, so users/coders can verify that the example below works).

Fix and testing:

This now has the following behaviour, which I believe validates this PR as working:

DEFAULT CASE: No .pdf or .svg enabled:

joe$ ./pafCoordsDotPlotly.R -i example/MPA01-vs-PA01.minimap2.paf -o out -s -t -m 500 -q 500000 -k 7 -l
PARAMETERS:
input (-i): example/MPA01-vs-PA01.minimap2.paf
output (-o): out
minimum query aggregate alignment length (-q): 5e+05
minimum alignment length (-m): 500
plot size (-p): 15
show horizontal lines (-l): TRUE
number of reference chromosomes to keep (-k): 7
show % identity (-s): TRUE
show % identity for on-target alignments only (-t): TRUE
produce interactive plot (-x): TRUE
produce .pdf plot (--pdf-plot-on): FALSE
produce .svg plot (--svg-plot-on): FALSE
reference IDs to keep (-r): 

Number of alignments: 5
Number of query sequences: 1

After filtering... Number of alignments: 4
After filtering... Number of query sequences: 1

joe$ ls -laht
total 4.0M
drwxrwxr-x  8 joe joe 4.0K Jan 17 13:27 .git
drwxrwxr-x  5 joe joe 4.0K Jan 17 13:27 .
-rw-rw-r--  1 joe joe 3.7M Jan 17 13:27 out.html
-rw-rw-r--  1 joe joe 166K Jan 17 13:27 out.png
drwxrwxr-x  2 joe joe 4.0K Jan 17 13:27 example
-rwxrwxr-x  1 joe joe  17K Jan 17 13:00 pafCoordsDotPlotly.R
-rwxrwxr-x  1 joe joe  15K Jan 17 13:00 mummerCoordsDotPlotly.R
drwxrwxr-x  2 joe joe 4.0K Jan 17 13:00 dotPlotly_shiny
-rw-rw-r--  1 joe joe 1.1K Jan 17 13:00 LICENSE.md
-rw-rw-r--  1 joe joe 2.3K Jan 17 13:00 README.md
drwxr-xr-x 34 joe joe 4.0K Jan 17 12:59 ..

POSITIVE case: .pdf and .svg both enabled:

joe$ ./pafCoordsDotPlotly.R -i example/MPA01-vs-PA01.minimap2.paf -o out -s -t -m 500 -q 500000 -k 7 -l --pdf-plot-on --svg-plot-on
PARAMETERS:
input (-i): example/MPA01-vs-PA01.minimap2.paf
output (-o): out
minimum query aggregate alignment length (-q): 5e+05
minimum alignment length (-m): 500
plot size (-p): 15
show horizontal lines (-l): TRUE
number of reference chromosomes to keep (-k): 7
show % identity (-s): TRUE
show % identity for on-target alignments only (-t): TRUE
produce interactive plot (-x): TRUE
produce .pdf plot (--pdf-plot-on): TRUE
produce .svg plot (--svg-plot-on): TRUE
reference IDs to keep (-r): 

Number of alignments: 5
Number of query sequences: 1

After filtering... Number of alignments: 4
After filtering... Number of query sequences: 1

joe$ ls -laht
total 4.0M
drwxrwxr-x  8 joe joe 4.0K Jan 17 13:34 .git
drwxrwxr-x  5 joe joe 4.0K Jan 17 13:34 .
-rw-rw-r--  1 joe joe 3.7M Jan 17 13:34 out.html
-rw-rw-r--  1 joe joe 5.3K Jan 17 13:34 out.svg
-rw-rw-r--  1 joe joe 5.1K Jan 17 13:34 out.pdf
-rw-rw-r--  1 joe joe 166K Jan 17 13:34 out.png
-rw-rw-r--  1 joe joe    5 Jan 17 13:28 .gitignore
drwxrwxr-x  2 joe joe 4.0K Jan 17 13:27 example
-rwxrwxr-x  1 joe joe  17K Jan 17 13:00 pafCoordsDotPlotly.R
-rwxrwxr-x  1 joe joe  15K Jan 17 13:00 mummerCoordsDotPlotly.R
drwxrwxr-x  2 joe joe 4.0K Jan 17 13:00 dotPlotly_shiny
-rw-rw-r--  1 joe joe 1.1K Jan 17 13:00 LICENSE.md
-rw-rw-r--  1 joe joe 2.3K Jan 17 13:00 README.md
drwxr-xr-x 34 joe joe 4.0K Jan 17 12:59 ..

(NB: tested but not shown. Toggling .pdf or .svg individually also works as expected.)
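The flag combinations above can be summarised as a small lookup of expected output files. This is only a sketch for checking a run by hand: expected_outputs() is a hypothetical helper, not part of the script, and it assumes (based on the listings above) that the .png is always written and the .html is written unless the interactive plot is turned off with -x.

```r
# Hypothetical helper: list the files a run is expected to write.
expected_outputs <- function(prefix, interactive = TRUE, pdf = FALSE, svg = FALSE) {
  files <- paste0(prefix, ".png")   # static PNG appears in every listing above
  if (interactive) files <- c(files, paste0(prefix, ".html"))
  if (pdf)         files <- c(files, paste0(prefix, ".pdf"))
  if (svg)         files <- c(files, paste0(prefix, ".svg"))
  files
}

default_files  <- expected_outputs("out")
positive_files <- expected_outputs("out", pdf = TRUE, svg = TRUE)
pdf_only_files <- expected_outputs("out", pdf = TRUE)
```

After a run, all(file.exists(positive_files)) gives a quick pass/fail check in place of reading the ls output by eye.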

HELP -h behaviour:

joe$ ./pafCoordsDotPlotly.R -h
Usage: ./pafCoordsDotPlotly.R -i alignments.coords -o out [options]

Options:
    -i INPUT, --input=INPUT
        coords file from mummer program 'show.coords' [default NULL]

    -o OUTPUT, --output=OUTPUT
        output filename prefix [default out]

    -v, --verbose
        Print out all parameter settings [default]

    -q MIN-QUERY-LENGTH, --min-query-length=MIN-QUERY-LENGTH
        filter queries with total alignments less than cutoff X bp [default 4e+05]

    -m MIN-ALIGNMENT-LENGTH, --min-alignment-length=MIN-ALIGNMENT-LENGTH
        filter alignments less than cutoff X bp [default 10000]

    -p PLOT-SIZE, --plot-size=PLOT-SIZE
        plot size X by X inches [default 15]

    -l, --show-horizontal-lines
        turn on horizontal lines on plot for separating scaffolds  [default FALSE]

    -k NUMBER-REF-CHROMOSOMES, --number-ref-chromosomes=NUMBER-REF-CHROMOSOMES
        number of sorted reference chromosomes to keep [default all chromosmes]

    -s, --identity
        turn on color alignments by % identity [default FALSE]

    -t, --identity-on-target
        turn on calculation of % identity for on-target alignments only [default FALSE]

    -x, --interactive-plot-off
        turn off production of interactive plotly [default TRUE]

    -r REFERENCE-IDS, --reference-ids=REFERENCE-IDS
        comma-separated list of reference IDs to keep [default NULL]

    --pdf-plot-on
        turn on production of .PDF format plotly [default FALSE]

    --svg-plot-on
        turn on production of .SVG format plotly [default FALSE]

    -h, --help
        Show this help message and exit