UC-Davis-molecular-computing / scadnano-python-package

Python scripting library for generating designs readable by scadnano.
https://scadnano.org
MIT License

cadnano import should use helices_view_order to preserve the main view ordering of helices as they appear in cadnano #202

Open dave-doty opened 2 years ago

dave-doty commented 2 years ago

See https://github.com/UC-Davis-molecular-computing/scadnano/issues/673.

Import this cadnano design. It appears the helices have incorrect parity:

image

leading the design to have large spacing between every other pair of helices:

image

Presumably the original cadnano design had all these helices spaced equally and on top of each other in the square lattice, in order 0, 1, 2, 3, ...

tcosmo commented 2 years ago

I opened the file with cadnano v2 and this is the ordering in cadnano v2 as well.

This is because of cadnano's UI behavior when you create a design: the first helix you create is 0 and the second is 1, regardless of their relative positions on the grid. All possibilities are allowed: north of 0 is 1, south of 0 is 1, east of 0 is 1, west of 0 is 1.

Screenshot from 2021-11-14 13-59-20 Screenshot from 2021-11-14 13-58-38 Screenshot from 2021-11-14 14-00-34

dave-doty commented 2 years ago

Ah, I see. This is a difference between the layout algorithms used in scadnano versus cadnano.

scadnano displays in order of helix index by default, but it attempts to be a bit more clever with the spacing, in order to help visualize helices in 3D structures. One implication is that the vertical distance between helices in the main view of scadnano is proportional to their Euclidean distance in the side view; for example, see Figure 6 here.

So in the side view, if you have

image

then 0-1 and 2-3 are adjacent in the side view (3 nm between their centers by default), but 1 is not adjacent to 2, so they will be 3·3 = 9 nm apart in the main view (i.e., with enough room to fit two other helices between them, which is how many are between them in the side view):

image

See here for a description of the layout algorithm: https://github.com/UC-Davis-molecular-computing/scadnano#relation-of-grid_position-and-position-to-side-and-main-view-display
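To make the arithmetic above concrete, here is a minimal sketch of the spacing rule as described in this comment: helices are drawn in view order, and each vertical gap is the Euclidean distance between consecutive helices in the side view, scaled by the helix distance. The function name `main_view_offsets`, the 3 nm constant, and the grid coordinates are assumptions made for illustration; this is not the scadnano implementation.

```python
import math

# Distance between adjacent helix centers, as quoted in this thread (assumed here).
HELIX_DISTANCE_NM = 3.0


def main_view_offsets(grid_positions, view_order, helix_distance_nm=HELIX_DISTANCE_NM):
    """Return {helix index: vertical offset in nm} for helices drawn in view_order.

    Hypothetical helper: each gap is the side-view Euclidean distance between
    consecutive helices, scaled by helix_distance_nm.
    """
    offsets = {}
    y = 0.0
    prev = None
    for idx in view_order:
        if prev is not None:
            y += helix_distance_nm * math.dist(grid_positions[prev], grid_positions[idx])
        offsets[idx] = y
        prev = idx
    return offsets


# One side-view column, top to bottom: helices 1, 0, 3, 2 (illustrative coordinates).
grid_positions = {0: (0, 1), 1: (0, 0), 2: (0, 3), 3: (0, 2)}

by_index = main_view_offsets(grid_positions, [0, 1, 2, 3])
by_side_view = main_view_offsets(grid_positions, [1, 0, 3, 2])
```

With the default order 0, 1, 2, 3, helices 1 and 2 end up 9 nm apart; with the order 1, 0, 3, 2 every gap is 3 nm.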

Closing as not a bug.

Fix

Edit the helices_view_order property (this is associated with something called a "helix group", but by default there's just one group). Click on Group→adjust current group:

image

Edit the "helices view order" to be the order they appear in the side view:

image

If you want to copy-paste, that's

1 0 3 2 5 4 7 6 9 8 11 10 13 12 15 14 17 16 19 18 21 20 23 25 22 27 24 29 26 31 28 33 30 35 37

Be careful, because the helix ordering isn't just swapping every pair of indices. It does that until helix 23, and then it seems fairly scattershot below there:

image
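A quick way to see where the pairwise swapping stops is to compare the list against the "swap every pair" pattern, where position p would hold index p XOR 1. This is only a sketch; the helper name `first_break_in_pair_swap` is made up for illustration, and the list is the one pasted above.

```python
# The helices view order pasted above.
view_order = [
    1, 0, 3, 2, 5, 4, 7, 6, 9, 8, 11, 10, 13, 12, 15, 14, 17, 16,
    19, 18, 21, 20, 23, 25, 22, 27, 24, 29, 26, 31, 28, 33, 30, 35, 37,
]


def first_break_in_pair_swap(order):
    """Return the first position whose entry is not (position XOR 1), or None."""
    for pos, idx in enumerate(order):
        if idx != pos ^ 1:
            return pos
    return None


break_pos = first_break_in_pair_swap(view_order)

# A view order must not list any helix twice.
has_duplicates = len(set(view_order)) != len(view_order)
```

Here `break_pos` is 23: position 22 still holds helix 23 as the swap pattern predicts, but position 23 holds helix 25 instead of 22.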

Note that as of right now, there's this other annoying bug that means you will have to refresh the page after changing the helices view order (it won't update the view after you click the OK button, but it will update the design stored in your browser's local storage): https://github.com/UC-Davis-molecular-computing/scadnano/issues/677

After refreshing the page, the helices look as they do in the cadnano design:

image

other than helix 37, which is way down on the bottom. I'd suggest moving it to be closer to the bottommost helix.

image

dave-doty commented 2 years ago

Okay, after playing with the "fix" for a bit, I decided this isn't very straightforward, so I'm re-opening the issue and deciding we should adjust the cadnano import to handle this.

What the user will most expect is to see the helices in the main view in the order they are used to seeing. Hopefully it is as straightforward as setting the default helix group's helices_view_order field as I suggested in the fix. But it might be tricky to figure out how cadnano handles helices that are not vertically stacked in the side view (e.g., helix 37 in this design).

While we're at it, I'm not sure why there's a gap between helices 23 and 25 in the scadnano design. I can't see the cadnano design, so I'm not sure what's happening around those helices.

tcosmo commented 2 years ago

I'm not sure what's happening either! The hole is in the cadnano design, not sure why

Screenshot from 2021-11-14 20-49-57

Screenshot from 2021-11-14 20-50-44

I think it is rather a good and important thing that the cadnano -> scadnano feature reproduces exactly what is being seen in cadnano. That way there is no hidden algorithm for the user to think about, which makes it easier to debug designs. It keeps things simple.

dave-doty commented 2 years ago

> I'm not sure what's happening either! The hole is in the cadnano design, not sure why

Ah, I didn't realize that. I thought scadnano import was introducing the hole.

> I think it is rather a good and important thing that the cadnano -> scadnano feature reproduces exactly what is being seen in cadnano. That way there is no hidden algorithm for the user to think about, which makes it easier to debug designs. It keeps things simple.

Yes, I agree. I think the helices_view_order will be a good way for the scadnano import to reproduce the order in which helices appear in the cadnano design.

tcosmo commented 2 years ago

Dave, I have thoroughly checked the code of the Design constructor, and the order of the helices IS lost even if the helices were given as a list.

Indeed, the constructor calls _normalize_helices_as_dict very early, and at that point the order is lost. Later in the code, if no view order was given, the function _check_helices_view_order_and_return chooses the identity.

Conclusion: do not assume that your constructor code deals with the ordering, even if input helices is a list.
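The pitfall can be reproduced with a small stand-alone sketch. The `Helix` class and the two functions below are hypothetical stand-ins that only mimic the behavior described above (keying helices by index, then defaulting to the identity order); they are not the library's code.

```python
from dataclasses import dataclass


@dataclass
class Helix:
    """Hypothetical stand-in for a helix; only the index matters here."""
    idx: int


def normalize_helices_as_dict(helices):
    """Key helices by index, as the thread describes the constructor doing."""
    return {h.idx: h for h in helices}


def view_order_or_identity(helices_by_idx, view_order=None):
    """Return the given view order, or the identity (sorted indices) if None."""
    if view_order is None:
        return sorted(helices_by_idx)
    if sorted(view_order) != sorted(helices_by_idx):
        raise ValueError("view order must be a permutation of the helix indices")
    return list(view_order)


# Helices passed in the order they should be displayed: 1, 0, 3, 2.
helices = [Helix(1), Helix(0), Helix(3), Helix(2)]
by_idx = normalize_helices_as_dict(helices)

default_order = view_order_or_identity(by_idx)                 # list order is ignored
explicit_order = view_order_or_identity(by_idx, [1, 0, 3, 2])  # order is kept
```

Under these assumptions the list order only survives when it is passed explicitly as the view order.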

Hence, to solve this issue I uncommented my piece of code which sets the helices view order inside the import function.

I also added a test test_helices_order2.

All of this is in 6170f983864b7ce20f00271d3019e1a2169a5f58.
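For readers who want the general idea without reading the commit, here is a rough sketch of one way an importer could record a view order. It assumes that the order of the "vstrands" entries in a cadnano v2 JSON file matches the order in which cadnano displays the helices; that assumption, the function name `view_order_from_cadnano_v2`, and the tiny example file are illustrative only and are not taken from the commit.

```python
import json


def view_order_from_cadnano_v2(cadnano_json_text):
    """Return helix numbers in the order their vstrands appear in the file.

    Hypothetical helper: treats file order as the display order (an assumption).
    """
    design = json.loads(cadnano_json_text)
    return [vstrand["num"] for vstrand in design["vstrands"]]


# Minimal made-up file with only the fields this sketch needs.
example = json.dumps({
    "name": "example.json",
    "vstrands": [
        {"num": 1, "row": 0, "col": 0},
        {"num": 0, "row": 1, "col": 0},
        {"num": 3, "row": 2, "col": 0},
        {"num": 2, "row": 3, "col": 0},
    ],
})

view_order = view_order_from_cadnano_v2(example)
```

The resulting list could then be passed as the helices view order rather than relying on the order of the helices list.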