epam / Indigo

Universal cheminformatics toolkit, utilities and database search tools
http://lifescience.opensource.epam.com
Apache License 2.0

Macro(Import FASTA): The ' * ' symbol occurring between two letters is not recognized as a break in peptide chain #1849

Closed Zhirnoff closed 6 months ago

Zhirnoff commented 6 months ago

Steps to Reproduce

  1. Switch to Macro mode
  2. Paste each of the following FASTA sequences, one at a time (not both at once): the first has '*' between two letters, the second has '*' at the end
    >M18404.1 Human IgG2 lambda antibody (1B8.env reactive) gp41 coding region DNA
    GCAGTGGGAAGAGGAGCTTTGTTCCTTGGGTTCT*TGGGAGCAGCAGGAAGCACTATGGGCGCAGCCTCAA
    >M18404.1 Human IgG2 lambda antibody (1B8.env reactive) gp41 coding region DNA
    GCAGTGGGAAGAGGAGCTTTGTTCCTTGGGTTCTTGGGAGCAGCAGGAAGCACTATGGGCGCAGCCTCAA*

Actual behavior: The '*' symbol occurring between two letters is not recognized as a break in the peptide chain.

Expected behavior: The '*' symbol occurring between two letters is recognized as a break in the peptide chain (no bond should be created between monomers separated by "*"). "*" means the end of the peptide sequence; see https://github.com/epam/Indigo/issues/1755
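
To make the expected behavior concrete, here is a minimal sketch in plain Python. It is not Indigo's implementation and does not use the Indigo API; the function names `split_chains` and `backbone_bonds` are hypothetical and exist only for illustration. The sketch assumes every '*' ends the current chain, so monomers on either side of it are never bonded, and a trailing '*' simply closes the last chain.

```python
def split_chains(sequence):
    """Split a FASTA sequence line into chains, treating '*' as a chain terminator.

    Hypothetical helper for illustration only; not part of Indigo.
    Empty pieces (e.g. the one after a trailing '*') are dropped.
    """
    return [chain for chain in sequence.split("*") if chain]


def backbone_bonds(sequence):
    """Return (i, j) index pairs of monomers that should be bonded.

    Monomers are numbered in reading order, skipping '*'.
    No bond is created across a '*'.
    """
    bonds = []
    previous = None  # index of the previous monomer in the current chain
    index = 0        # index of the current monomer
    for symbol in sequence:
        if symbol == "*":
            previous = None  # chain break: the next monomer starts a new chain
            continue
        if previous is not None:
            bonds.append((previous, index))
        previous = index
        index += 1
    return bonds


if __name__ == "__main__":
    # '*' between two letters: two chains, no bond between C and D.
    print(split_chains("AC*DE"), backbone_bonds("AC*DE"))
    # '*' at the end: a single chain, fully bonded.
    print(split_chains("ACDE*"), backbone_bonds("ACDE*"))
```

Applied to the first sequence from the steps above, `split_chains` would yield two chains with no bond between the `T` before the '*' and the `T` after it; applied to the second, it would yield a single chain.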

Desktop:

Indigo version: 1.19.0-rc.1

Attachments 2024-03-19_16h24_47 2024-03-19_16h27_10 2024-03-19_16h27_17

AlexeyGirin commented 6 months ago

Fixed.