epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0

Import of modified IDT monomers #1899

Closed olganaz closed 4 months ago

olganaz commented 6 months ago

Background

In IDT notation, modified sequences are represented as plain strings that combine standard and modified monomers. A standard monomer, [s]<Base>[*], is a nucleotide with the same configuration as supported in Ketcher. A modified monomer, /<pos><Identifier>/[*], could be recognized as one of the following:

This task covers only import of modified IDT monomers.
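To make the two token shapes from the Background concrete, here is a minimal tokenizer sketch. It is not the Indigo implementation. The prefix set (`r`, `m`, `+`) and the base alphabet (`ACGTU`) for standard monomers are assumptions made for illustration, as are the names `Token` and `tokenize`; only the `/<pos><Identifier>/[*]` and `[s]<Base>[*]` shapes come from the issue text.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Modified monomer: /<pos><Identifier>/[*], where pos is 5, i or 3.
MODIFIED = re.compile(r"/(?P<pos>[5i3])(?P<ident>[^/]+)/(?P<star>\*?)")
# Standard monomer: [s]<Base>[*]. Prefix set and base alphabet are assumptions.
STANDARD = re.compile(r"(?P<prefix>[rm+]?)(?P<base>[ACGTU])(?P<star>\*?)")


@dataclass
class Token:
    kind: str            # "modified" or "standard"
    alias: str           # token text without the trailing '*'
    pos: Optional[str]   # "5", "i", "3" for modified tokens, None otherwise
    ident: str           # identifier (modified) or base letter (standard)
    has_star: bool       # True when the token is followed by '*'


def tokenize(seq: str) -> List[Token]:
    """Split an IDT string into standard and modified monomer tokens."""
    tokens: List[Token] = []
    i = 0
    while i < len(seq):
        if seq[i].isspace():
            i += 1
            continue
        m = MODIFIED.match(seq, i)
        if m:
            alias = "/" + m.group("pos") + m.group("ident") + "/"
            tokens.append(Token("modified", alias, m.group("pos"),
                                m.group("ident"), m.group("star") == "*"))
        else:
            m = STANDARD.match(seq, i)
            if not m:
                raise ValueError(f"Unexpected character {seq[i]!r} at offset {i}")
            alias = m.group("prefix") + m.group("base")
            tokens.append(Token("standard", alias, None,
                                m.group("base"), m.group("star") == "*"))
        i = m.end()
    return tokens
```

For example, `tokenize("/5Phos/rA*mG/3Phos/")` yields a modified token, two standard tokens (the first flagged with `*`), and a final modified token.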

Requirements

  1. The system should interpret /<pos><Identifier>/[*] as the IDT alias of a monomer from the Ketcher library and import the corresponding monomer.

    1. If the library has no monomer with the corresponding alias, the system should import the IDT monomer as a monomer with the IDT alias only (no structure).
  2. The system should check the position of the monomer in a chain according to pos in IDT alias:

    • 5- at the 5' end (the first monomer in a chain)
    • i- inside the chain
    • 3- at the 3' end (the last monomer in a chain)

      If the position indicator in the IDT code contradicts the actual position of the monomer in the chain, this should be treated as a format error and the import should fail with an appropriate error message: IDT alias <IDT id> cannot be used at five prime (the original wording was: Position of monomer <IDT id> in sequence contradicts its code; the change was approved by @olganaz)

  3. When * is applied to a modified IDT monomer, the system should also check whether an RNA preset with the IDT alias /<pos><Identifier>/ exists in the library

    • if there is an RNA preset with the IDT alias /<pos><Identifier>/, then /<pos><Identifier>/* should be imported as that RNA preset with the phosphate (P) changed to phosphorothioate (sP). See: Import IDT_modified phosphate
  4. The bonds between monomers should be established from the R2 attachment point of the first monomer to the R1 attachment point of the second monomer.
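The four requirements above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Indigo code: the library is modelled as two plain dictionaries, and all names (`import_modified`, `IdtFormatError`, `monomers`, `presets`) are hypothetical. The issue only gives the "five prime" error wording, so the "internal" and "three prime" variants are assumed, as is the rendering of `<IDT id>`. The issue also does not say what happens when `*` is present but no preset exists, so the sketch falls back to the plain monomer lookup.

```python
from typing import Dict, List, Optional, Set, Tuple


class IdtFormatError(ValueError):
    """Raised when an IDT string violates the format rules (hypothetical name)."""


def allowed_positions(index: int, length: int) -> Set[str]:
    """Position codes valid at this index: 5 = first, 3 = last, i = inside."""
    allowed: Set[str] = set()
    if index == 0:
        allowed.add("5")
    if index == length - 1:
        allowed.add("3")
    # Assumption: a single-monomer chain accepts both "5" and "3".
    return allowed or {"i"}


def position_name(index: int, length: int) -> str:
    # Only "five prime" appears in the issue; the other two names are assumed.
    if index == 0:
        return "five prime"
    return "three prime" if index == length - 1 else "internal"


def import_modified(
    tokens: List[Tuple[str, str, bool]],   # (pos, identifier, has_star)
    monomers: Dict[str, str],              # IDT alias -> structure
    presets: Dict[str, Dict[str, str]],    # IDT alias -> RNA preset parts
):
    imported: List[Dict[str, Optional[str]]] = []
    n = len(tokens)
    for index, (pos, ident, has_star) in enumerate(tokens):
        # Requirement 2: the position code must match the real position.
        if pos not in allowed_positions(index, n):
            raise IdtFormatError(
                f"IDT alias {pos}{ident} cannot be used at {position_name(index, n)}"
            )
        alias = f"/{pos}{ident}/"
        if has_star and alias in presets:
            # Requirement 3: import the preset with P replaced by sP.
            preset = dict(presets[alias])  # copy, so the library is untouched
            preset["phosphate"] = "sP"
            imported.append({"alias": alias, "kind": "preset", **preset})
        else:
            # Requirement 1: library lookup; None means alias only, no structure.
            imported.append({"alias": alias, "kind": "monomer",
                             "structure": monomers.get(alias)})
    # Requirement 4: R2 of each monomer bonds to R1 of the next one.
    bonds = [(i, "R2", i + 1, "R1") for i in range(n - 1)]
    return imported, bonds
```

With a library containing only `/5Phos/` as a monomer and `/i2MOErA/` as a preset (placeholder data), importing `[("5", "Phos", False), ("i", "2MOErA", True), ("3", "Unknown", False)]` gives a structured monomer, a preset with phosphate `sP`, and an alias-only monomer, connected by two R2 to R1 bonds.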

Examples

AlexeyGirin commented 4 months ago

Verified.