UC-Davis-molecular-computing / scadnano-python-package

Python scripting library for generating designs readable by scadnano.
https://scadnano.org
MIT License
13 stars 7 forks source link

shared binding domains #248

Open dave-doty opened 1 year ago

dave-doty commented 1 year ago

Throughout most of DNA nanotechnology, the word "domain" is used to mean "shared binding domain", i.e., if two different strands share a domain as a subsequence, then they have the same DNA sequence on that domain (though they may be different strands). This is how nuad interprets a Domain (https://github.com/UC-Davis-molecular-computing/nuad#data-model). Implementing a similar concept in scadnano would allow tighter integration with nuad. It would also enable easier design, even by hand, by allowing one to assign a DNA sequence to a shared domain rather than to individual strands.

This means the word "Domain" was a poor choice for what scadnano refers to as a Domain, which is just a continuous region of a strand on a single Helix.

Come up with a new name, either for the old Domain (e.g., BoundSubstrand) or the new concept (e.g., SharedDomain). In this issue we'll use the original scadnano terminology and use SharedDomain for the new concept.

Allow SharedDomains to be assigned to Strands and/or Substrands (Domains, Extensions, and Loopouts). If such an assignment is made, then DNA sequences should be assigned to the SharedDomains as opposed to the Strand or its Substrands. Furthermore, if any other instances of the same SharedDomain (i.e., it has the same name) occur elsewhere in the Design, they have the same sequence as well.

It's not exactly clear how best to do this assignment. Most designs would simply assign 1-1 between a SharedDomain and a Substrand. However, this may not always be the case. The nuad concept of domains (https://github.com/UC-Davis-molecular-computing/nuad#data-model) allows hierarchical subdomains. This would be useful if we want to allow more than one SharedDomain in a single Substrand. However, one may also which a single domain to span consecutive Substrands.

The simplest way to have maximum flexibility is to simply assign each Strand a list of SharedDomains, each with some defined length, such that the sum of their lengths equals the Strand's length.