glygener / glyTableMaker-backend

Backend code for the GlyGen Table Maker application
GNU General Public License v3.0

Share composition wurcs #69

Closed ReneRanzinger closed 4 months ago

ReneRanzinger commented 5 months ago

Please share a WURCS string for a composition you create, for example Hex3Fuc1HexNAc2.

senaarpinar commented 5 months ago

For the composition: Hex3HexNAc2Fuc1, I get the following:

WURCS=2.0/3,6,5/[axxxxh-1x_1-5_2*NCC/3=O][axxxxh-1x_1-5][a1221m-1x_1-5]/1-1-2-2-2-3/a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?

ReneRanzinger commented 5 months ago

Should be: WURCS=2.0/3,6,5/[uxxxxm][uxxxxh_2*NCC/3=O][uxxxxh]/1-2-2-3-3-3/a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?

What is the code for creating these monosaccharides? They should have an unknown anomer and an unknown ring size.

senaarpinar commented 5 months ago

I found this project, which converts compositions to WURCS, and I am using the example code from there.

https://gitlab.com/glycoinfo/glycompconverter/-/blob/master/README.md?ref_type=heads

ReneRanzinger commented 5 months ago

Hi Rene,

Thank you for sharing the details.

The GlycanCompositionConverter actually uses GlycoCTToWURCS internally, and due to its specifications, it is unable to output the aldehyde/hemiacetal version. To address this, further processing is required to convert the WURCS output from the current GlycanCompositionConverter.

The following code implements this process. It's a bit complex but please refer to it for guidance:


import org.glycoinfo.GlycanCompositionConverter.conversion.CompositionConverter;
import org.glycoinfo.GlycanCompositionConverter.structure.Composition;
import org.glycoinfo.GlycanCompositionConverter.utils.CompositionUtils;
import org.glycoinfo.WURCSFramework.util.WURCSException;
import org.glycoinfo.WURCSFramework.util.WURCSFactory;
import org.glycoinfo.WURCSFramework.util.subsumption.WURCSSubsumptionConverter;
import org.glycoinfo.WURCSFramework.wurcs.array.LIN;
import org.glycoinfo.WURCSFramework.wurcs.array.MS;
import org.glycoinfo.WURCSFramework.wurcs.array.RES;
import org.glycoinfo.WURCSFramework.wurcs.array.UniqueRES;
import org.glycoinfo.WURCSFramework.wurcs.array.WURCSArray;

public class TestCompositionParser {

  public static void main(String[] args) throws Exception {
    String strComposition = "Hex:3|HexNAc:2";
    Composition compo = CompositionUtils.parse(strComposition);
    String strWURCS = CompositionConverter.toWURCS(compo);
    System.out.println(strWURCS);
    // WURCS=2.0/2,5,4/[axxxxh-1x_1-5_2*NCC/3=O][axxxxh-1x_1-5]/1-1-2-2-2/a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?

    // convert to unknown form (mixture of ring and open chain forms)
    String strWURCSUnknownForm = toUnknownForm(strWURCS);
    System.out.println(strWURCSUnknownForm);
    // WURCS=2.0/2,5,4/[uxxxxh_2*NCC/3=O][uxxxxh]/1-1-2-2-2/a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?
  }

  private static String toUnknownForm(String strWURCS) throws WURCSException {
    WURCSFactory factory = new WURCSFactory(strWURCS);
    WURCSArray array = factory.getArray();

    // copy array without UniqueRES
    WURCSArray arrayNew = new WURCSArray(
        array.getVersion(),
        array.getUniqueRESCount(),
        array.getRESCount(),
        array.getLINCount(),
        array.isComposition()
    );
    for (RES res : array.getRESs())
      arrayNew.addRES(res);
    for (LIN lin : array.getLINs())
      arrayNew.addLIN(lin);

    // add UniqueRES converted to unknown form
    WURCSSubsumptionConverter converter = new WURCSSubsumptionConverter();
    for (UniqueRES ures : array.getUniqueRESs()) {
      MS ms = converter.convertAnomericCarbonToUncertain(ures);
      UniqueRES uresNew = new UniqueRES(ures.getUniqueRESID(), ms);
      arrayNew.addUniqueRES(uresNew);
    }

    // back to WURCS
    WURCSFactory factoryNew = new WURCSFactory(arrayNew);
    return factoryNew.getWURCS();
  }
}
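To make the effect of toUnknownForm on the residue codes easier to see, here is a text-level sketch that reproduces the change shown in the two example outputs above. It is only an illustration, not how WURCSSubsumptionConverter works internally. The class and method names (UnknownFormTextSketch, toUnknownFormText) are hypothetical, and the regular expression assumes residues with the anomeric carbon at position 1, an unknown anomer ("-1x") and a 1-5 ring ("_1-5"), as in the output above. Anything else (for example ketoses) is left unchanged.

```java
import java.util.regex.Pattern;

// Text-level illustration only (hypothetical helper, not part of WURCSFramework).
class UnknownFormTextSketch {

  // Matches a residue code that starts with an anomeric carbon ("a"), followed by
  // an unknown anomer at position 1 ("-1x") and a 1-5 ring ("_1-5").
  private static final Pattern RING_FORM =
      Pattern.compile("\\[a([0-9A-Za-z]+)-1x_1-5");

  // Replaces the leading "a" with "u" and drops the anomer and ring descriptors.
  static String toUnknownFormText(String wurcs) {
    return RING_FORM.matcher(wurcs).replaceAll("[u$1");
  }

  public static void main(String[] args) {
    // Residue list taken from the converter output quoted above (linkage part omitted).
    String in = "WURCS=2.0/2,5,4/[axxxxh-1x_1-5_2*NCC/3=O][axxxxh-1x_1-5]/1-1-2-2-2/";
    System.out.println(toUnknownFormText(in));
    // prints WURCS=2.0/2,5,4/[uxxxxh_2*NCC/3=O][uxxxxh]/1-1-2-2-2/
  }
}
```

For real data the WURCSFramework-based conversion is the safer route, since it parses the string instead of pattern-matching on it.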

Note that [uxxxxm] is dHex, not Fuc or Hex. So if you want to get [uxxxxm], please use dHex instead of Fuc.
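As a rough sketch of how that substitution could be applied before calling the converter, the helper below rewrites a condensed composition name such as Hex3HexNAc2Fuc1 into the "Name:count|Name:count" form used in the example above, replacing Fuc with dHex. The class and method names are hypothetical, and it is an assumption that CompositionUtils.parse accepts a dHex entry in this form, so please check the GlycanCompositionConverter README. Residue names that contain digits (for example Neu5Ac) are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of GlycanCompositionConverter.
class CompositionInputSketch {

  // One term = a run of letters (residue name) followed by a run of digits (count).
  private static final Pattern TERM = Pattern.compile("([A-Za-z]+)(\\d+)");

  // Converts e.g. "Hex3HexNAc2Fuc1" to "Hex:3|HexNAc:2|dHex:1".
  static String toConverterInput(String condensed) {
    List<String> terms = new ArrayList<>();
    Matcher m = TERM.matcher(condensed);
    while (m.find()) {
      // Per the note above: use dHex instead of Fuc to obtain [uxxxxm].
      String name = m.group(1).equals("Fuc") ? "dHex" : m.group(1);
      terms.add(name + ":" + m.group(2));
    }
    return String.join("|", terms);
  }

  public static void main(String[] args) {
    System.out.println(toConverterInput("Hex3HexNAc2Fuc1"));
    // prints Hex:3|HexNAc:2|dHex:1
  }
}
```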

Best, Masaaki

senaarpinar commented 4 months ago

Added three composition type options and used the code above for one of them.