UC-Davis-molecular-computing / scadnano

Web application for designing DNA structures such as DNA origami.
https://scadnano.org
MIT License

DNA sequences should be justified properly in exported SVG from Export-->SVG of selected strands #941

Closed dave-doty closed 7 months ago

dave-doty commented 1 year ago

Make this design:

image

and select the strand.

Select Export-->SVG of selected strands. Drag the saved SVG file into PowerPoint. PowerPoint does not render the DNA sequences with the correct spacing:

image

dave-doty commented 1 year ago

The DNA sequences on domains are one SVG text element per domain. This is faster to render than having one SVG text element per base with the position of each letter hard-coded, but it means we need to be careful with the CSS styles and spacing so that each base appears next to the correct helix offset.

My guess is that the error here is due to CSS styles not being properly "inlined" when exporting SVG. What I mean by this is that in the web interface, there's a separate CSS style file (web/scadnano-styles.css) that is applied to the page, but when exporting SVG, these need to be put directly in the SVG file. There's code to do this for the SVG shapes (e.g., what's inside view/design_main_strand.dart). See the file middleware/export_svg.dart and the function get_cloned_svg_element_with_style. But perhaps it is not being done properly for the DNA sequences.
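To make the idea of "inlining" concrete, here is a minimal sketch in Python (not the project's Dart code, and not the actual logic of get_cloned_svg_element_with_style). The class name `dna-seq` and its style values are hypothetical stand-ins for whatever web/scadnano-styles.css really contains.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for rules in web/scadnano-styles.css; the real
# selectors and values are not shown in this thread.
CLASS_STYLES = {
    "dna-seq": {"font-family": "Consolas, monospace", "font-size": "12px"},
}


def inline_styles(root, class_styles):
    """Copy class-based CSS declarations into each element's style attribute."""
    for elem in root.iter():
        declarations = {}
        for cls in elem.get("class", "").split():
            declarations.update(class_styles.get(cls, {}))
        if not declarations:
            continue
        existing = elem.get("style", "").strip().rstrip(";")
        inlined = "; ".join(f"{name}: {value}" for name, value in declarations.items())
        # Existing inline declarations go last so they keep priority.
        elem.set("style", f"{inlined}; {existing}" if existing else inlined)
    return root


svg = ET.fromstring('<svg><text class="dna-seq" x="0" y="10">ACGT</text></svg>')
inline_styles(svg, CLASS_STYLES)
print(ET.tostring(svg, encoding="unicode"))
```

After this step the exported file no longer depends on an external stylesheet, which is the property the SVG export needs.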

dave-doty commented 1 year ago

When I try this on the dev branch, the bug does not appear. I think I was talking to Aaron about some issue when I noticed this and may have been on another branch? I'm not sure. I'll leave this issue here for now in case it pops up after we merge other PRs (in case one of them is in a branch that had this issue and it re-appears), but we can close this issue if we don't see it after a while.

UPDATE: Sorry, I misspoke. The issue is on the dev branch, but it only shows up with the exact instructions I gave (export to SVG, then drag the saved file onto PowerPoint). In particular, the DNA sequences are styled correctly when you select Edit-->Copy/Paste/Select-->Copy Image and paste into PowerPoint.

The bug also shows up when exporting the whole main view to an SVG file, if you drag the SVG file onto Powerpoint.

It may be that this is a problem with Powerpoint's import of SVG files? Not sure.

When I open the exported SVG files in Inkscape, the DNA sequences are styled correctly:

image

rayzhuca commented 1 year ago

I think I tracked down the issue. The problem is with the textLength attribute in <text>.

For demonstration I edited the example svg and removed most irrelevant attributes. Then I modified the textLength attribute so the top text is longer than the bottom text.

svg viewer

Link to the SVG viewer

picture of svg in powerpoint

The imported svg on PowerPoint is displayed on the right.

As you can see, PowerPoint completely disregards the textLength attribute.

I believe that PowerPoint does not import an SVG natively and instead converts it into an image:

inspect element shows svg is converted into an image

The source of the issue could be attributed to this conversion process.

This seems to be a common issue among Microsoft tools. Related reports: "Office cannot handle basic SVG text", "PowerPoint app does not display svg correctly", "MS PowerPoint distorted an imported svg image".

dave-doty commented 1 year ago

Although MS imports it as an image, it does retain the vector graphics somehow, because you can right-click and select "Convert to shape":

image

and then it's editable as Powerpoint shapes.

Perhaps try to look for a minimal reproducible example, an SVG file not exported from scadnano. It could be that it's being caused by something else we are putting in the SVG file, not that Powerpoint always ignores the textLength attribute. (Though maybe it does, but let's confirm with smaller examples first.)
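One way to build such a minimal example is to generate a tiny SVG by hand that contains nothing but two text elements with different textLength values. The sketch below is in Python; the sequence, coordinates, and lengths are arbitrary illustration values, not taken from a scadnano export.

```python
def minimal_textlength_svg(sequence, short_len, long_len):
    """Return an SVG that draws the same text at two different textLength values."""
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="400" height="80">\n'
        f'  <text x="10" y="30" font-family="monospace" textLength="{long_len}">{sequence}</text>\n'
        f'  <text x="10" y="60" font-family="monospace" textLength="{short_len}">{sequence}</text>\n'
        '</svg>\n'
    )


svg_text = minimal_textlength_svg("ACGTACGT", short_len=100, long_len=300)
print(svg_text)
```

Saving the printed text as an .svg file and dragging it onto PowerPoint should show whether the two lines come out at different widths (textLength honored) or the same width (textLength ignored).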

That said, it won't be surprising if this happens in Powerpoint always. When converting to a shape, it converts the SVG text to a text box in Powerpoint:

image

and I can't even figure out a way to manually space the letters out, so it could be that Powerpoint's text is simply less expressive than SVG in this regard.

Another option is to export individual letters instead of grouping DNA sequences by domains. This should not be done in the SVG rendered in the browser, because the text is one of the slowest things to render. If you have a large design, it slows down rendering the text so much that we actually cache it as a PNG when you zoom out.

But we could imagine re-rendering DNA in this way just for exporting SVG (but not for displaying it normally). We could have a function that goes through all of some SVG tree and replaces each text object with a series of text objects, one per base, positioned correctly.
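A rough sketch of such a tree-rewriting function, written in Python rather than the project's Dart, is below. It assumes each text element's x attribute marks where the first base starts and that bases are evenly spaced by a known `base_width`; both are assumptions for illustration, and the real export may position text differently.

```python
import xml.etree.ElementTree as ET


def split_text_per_base(root, base_width):
    """Replace each plain <text> element with one <text> element per character."""
    # Collect targets first so the tree is not modified while iterating over it.
    targets = [
        (parent, child)
        for parent in root.iter()
        for child in list(parent)
        if child.tag.split("}")[-1] == "text" and len(child) == 0 and child.text
    ]
    for parent, text_elem in targets:
        position = list(parent).index(text_elem)
        x0 = float(text_elem.get("x", "0"))
        parent.remove(text_elem)
        for i, base in enumerate(text_elem.text):
            single = ET.Element(text_elem.tag, dict(text_elem.attrib))
            # textLength applied to the whole run, so drop it from each letter.
            single.attrib.pop("textLength", None)
            single.set("x", f"{x0 + i * base_width:g}")
            single.text = base
            parent.insert(position + i, single)
    return root


svg = ET.fromstring('<svg><g><text x="10" y="5" textLength="30">ACG</text></g></svg>')
split_text_per_base(svg, base_width=10)
print(ET.tostring(svg, encoding="unicode"))
```

Text elements with child elements (such as textPath) are skipped here, so insertions and loopouts drawn along a path would need separate handling.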

I would want this to be a configurable option under the Export menu (a checkbox) so the user can choose whether or not the DNA SVG gets converted in this way. If they are opening the SVG in Inkscape, for example, it's cleaner for them if they can have all the DNA text in a single text object rather than spread across multiple ones.

rayzhuca commented 12 months ago

I did some more testing. It seems like textLength really is not being properly handled by Microsoft.

Screenshot 2023-11-15 at 9 07 53 PM

The SVG above is rendered as

Screenshot 2023-11-15 at 9 07 40 PM

On PowerPoint,

Screenshot 2023-11-15 at 10 34 00 PM

However, I found that letter-spacing is an alternative that does work on PowerPoint. An equation that relates textLength with letterSpacing is

textLength = charWidth * characterCount + letterSpacing * (characterCount - 1)

I found the exact charWidth by using this snippet https://stackoverflow.com/a/74375386. Finally, I set

letterSpacing = '${(text_length - charWidth * seq_to_draw.length) / (seq_to_draw.length - 1)}'

to the DNA sequence text attribute.
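As a worked check of the formula, here is the same computation in Python (a sketch, not the Dart code on the branch). The numbers in the example are made up for illustration; a real charWidth would have to be measured for the font in use.

```python
def letter_spacing(text_length, char_width, char_count):
    """Solve textLength = charWidth * n + letterSpacing * (n - 1) for letterSpacing."""
    if char_count < 2:
        # A single character has no gaps to distribute spacing over.
        return 0.0
    return (text_length - char_width * char_count) / (char_count - 1)


# Illustration values only: 10 bases, each 6 units wide, spread over 100 units.
spacing = letter_spacing(text_length=100, char_width=6, char_count=10)
print(spacing)
# Substituting back into the formula recovers the requested textLength.
print(6 * 10 + spacing * (10 - 1))
```

Because the result depends directly on charWidth, any difference between the measured font and the font actually used to render the text shifts every gap, and the error accumulates along the sequence.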

Screenshot 2023-11-15 at 10 27 55 PM

The letterSpacing approach on a web browser. I didn't notice any alignment drifting for a strand with 300 bases.

However, when I imported the SVG into PowerPoint,

Screenshot 2023-11-15 at 10 29 58 PM

There is a noticeable error because I don't have the Consolas font installed (and letterSpacing depends on charWidth). (I also don't have the PowerPoint app; I'm using the web app.)

Should we keep the letterSpacing approach or keep going with the individual letters (or both)? I pushed the branch 941-dna-sequences-justified for reference.

dave-doty commented 12 months ago

I'm not entirely sure what to do actually. Maybe let's have a meeting to discuss this and brainstorm possible approaches.

dave-doty commented 11 months ago

I figured out how to manually configure the letter spacing in Powerpoint. Select the text, right-click and select "Font":

image

This doesn't really help us solve the problem, but I wanted to check for myself how Powerpoint's import could adjust the letter spacing.

dave-doty commented 11 months ago

Just to record what we discussed in the meeting for how to solve this problem:

  1. Introduce an option in the menu somewhere (I'd recommend the Export menu) for making each individual DNA symbol a separate SVG text element, instead of the current approach of making a single text element for all the DNA symbols of a given Domain. The default should remain the current behavior, however (one text element per Domain). This gives "power users" a way to export and hopefully get better letter rendering, but we hope that the default works for most people. (described below)
  2. Change the font in the export to a fixed-width font installed by default on all major OSes; I'd assume Courier New should work for this. But keep using Consolas for the web interface. This can probably be done by adjusting the code that "inlines" the CSS styles when exporting SVG.

Hopefully the default then works for most everyone, but power users can try the other setting for option 1 above in case it doesn't.

Also, if possible, try to do this without changing how the PNG image export works. In other words, if they select some strands and press Ctrl+I (or Edit-->Copy/Paste/Select-->Copy image) it should work identically to how it does now. So if you are reusing code between the cases, this might involve adding some Boolean arguments to existing functions to tell them whether to change the font on export or not. (If they do "Copy image" then they should still see Consolas font.)
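The Boolean-argument idea could look roughly like the Python sketch below (the real code is Dart, and the function and parameter names here are hypothetical). The flag decides whether the inlined style keeps its font or has font-family swapped for the export font.

```python
def export_style(style, for_svg_export, export_font="Courier New"):
    """Return the inline style, swapping font-family only for SVG file export."""
    if not for_svg_export:
        # "Copy image" path: leave the style (and Consolas) untouched.
        return style
    declarations = []
    for declaration in style.split(";"):
        if not declaration.strip():
            continue
        name, _, value = declaration.partition(":")
        if name.strip() == "font-family":
            value = f"'{export_font}', monospace"
        declarations.append(f"{name.strip()}: {value.strip()}")
    return "; ".join(declarations)


web_style = "font-family: Consolas; fill: black"
print(export_style(web_style, for_svg_export=True))
print(export_style(web_style, for_svg_export=False))
```

Keeping the swap behind a flag means shared code can serve both the SVG file export and the PNG "Copy image" path without changing what the latter produces.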

rayzhuca commented 11 months ago

I've added a checkbox that renders each text separately. I've faced some new issues. The first is that each text looks slightly misaligned (it's a little lower) because PowerPoint doesn't support dominant-baseline for text. PowerPoint:

Screenshot 2023-12-18 at 12 07 54 PM

What it's supposed to look like:

Screenshot 2023-12-18 at 12 08 07 PM

Here is the SVG that does render each text separately: 2_staple_2_helix_origami_deletions_insertions_mods_selected (4)

Here it is in PowerPoint,

Screenshot 2023-12-18 at 12 06 33 PM

Should I just rename the text box to be "Power user" and then add the vertical offset manually?

I also noticed that PowerPoint doesn't render text paths so insertions and loopouts don't even appear.

dave-doty commented 10 months ago

Hmm, it's not ideal, but the exported SVG of the DNA sequences on domains looks fine, so I'm happy to just keep it like that even though the offsets are a bit different than in the web interface.

The insertion/loopout issue is another story. One fix for insertions would be to do those manually as well.

It could be a nightmare to place the insertions manually if we try to handle arbitrary values for the insertion length. But in practice, people use small values. So if it's easier, we could do the following:

  1. For insertion length values between 1 and 4 (meaning the number of displayed bases is between 2 and 5 since insertion length is the number of extra bases to display), we could hard-code the positions to make them look similar to the current appearance.
  2. For larger values, we could just display the DNA sequence in a straight line (i.e., with a regular text element) above the insertion (or below for reverse domains). It would look too crowded in a textpath along the insertion path anyway.
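The two cases above, including the off-by-one between insertion length and displayed bases, can be sketched as a small decision function. This is Python for illustration only; the function name and the returned labels are hypothetical, and the actual hard-coded coordinates are deliberately left out since they have not been decided.

```python
MAX_HARD_CODED_INSERTION_LENGTH = 4


def insertion_display_plan(insertion_length, forward):
    """Choose how to draw the DNA sequence at an insertion in exported SVG."""
    if insertion_length < 1:
        raise ValueError("insertion length must be at least 1")
    # Insertion length counts the extra bases, so one more base is displayed.
    displayed_bases = insertion_length + 1
    if insertion_length <= MAX_HARD_CODED_INSERTION_LENGTH:
        return {"mode": "hard_coded_positions", "displayed_bases": displayed_bases}
    return {
        "mode": "straight_line",
        "displayed_bases": displayed_bases,
        # Above the insertion for forward domains, below for reverse domains.
        "placement": "above" if forward else "below",
    }


print(insertion_display_plan(2, forward=True))
print(insertion_display_plan(7, forward=False))
```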

For loopouts, I don't know. They can be long, so we don't want to do that manually for each possible number of bases. Here we probably need a power user option: by default, export as normal for users who are really just going to use the SVG file directly, but offer an option to display the loopout DNA sequence as a straight line adjacent to the loopout, with a tooltip stating that this option is intended for PowerPoint users, since PowerPoint won't display the loopout sequences correctly otherwise.

So maybe we want to start popping up a dialog to ask for export options.

Let's discuss this one in a Zoom meeting at some point and go over the options.