Closed ReneRanzinger closed 4 months ago
For the composition: Hex3HexNAc2Fuc1, I get the following:
WURCS=2.0/3,6,5/[axxxxh-1x_1-5_2*NCC/3=O][axxxxh-1x_1-5][a1221m-1x_1-5]/1-1-2-2-2-3/a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?
Should be: WURCS=2.0/3,6,5/[uxxxxm][uxxxxh_2*NCC/3=O][uxxxxh]/1-2-2-3-3-3/a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?_a?|b?|c?|d?|e?|f?}-{a?|b?|c?|d?|e?|f?
What is the code for creating these monosaccharides? They should have an unknown anomer and an unknown ring size.
I found this project for converting compositions to WURCS and am using the example code from there.
https://gitlab.com/glycoinfo/glycompconverter/-/blob/master/README.md?ref_type=heads
Hi Rene,
Thank you for sharing the details.
The GlycanCompositionConverter actually uses GlycoCTToWURCS internally, and due to its specifications, it is unable to output the aldehyde/hemiacetal version. To address this, further processing is required to convert the WURCS output from the current GlycanCompositionConverter.
The following code implements this process. It's a bit complex but please refer to it for guidance:
import org.glycoinfo.GlycanCompositionConverter.conversion.CompositionConverter;
import org.glycoinfo.GlycanCompositionConverter.structure.Composition;
import org.glycoinfo.GlycanCompositionConverter.utils.CompositionUtils;
import org.glycoinfo.WURCSFramework.util.WURCSException;
import org.glycoinfo.WURCSFramework.util.WURCSFactory;
import org.glycoinfo.WURCSFramework.util.subsumption.WURCSSubsumptionConverter;
import org.glycoinfo.WURCSFramework.wurcs.array.LIN;
import org.glycoinfo.WURCSFramework.wurcs.array.MS;
import org.glycoinfo.WURCSFramework.wurcs.array.RES;
import org.glycoinfo.WURCSFramework.wurcs.array.UniqueRES;
import org.glycoinfo.WURCSFramework.wurcs.array.WURCSArray;

public class TestCompositionParser {

    public static void main(String[] args) throws Exception {
        String strComposition = "Hex:3|HexNAc:2";
        Composition compo = CompositionUtils.parse(strComposition);
        String strWURCS = CompositionConverter.toWURCS(compo);
        System.out.println(strWURCS);
        // WURCS=2.0/2,5,4/[axxxxh-1x_1-5_2*NCC/3=O][axxxxh-1x_1-5]/1-1-2-2-2/a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?

        // convert to unknown form (mixture of ring and open chain forms)
        String strWURCSUnknownForm = toUnknownForm(strWURCS);
        System.out.println(strWURCSUnknownForm);
        // WURCS=2.0/2,5,4/[uxxxxh_2*NCC/3=O][uxxxxh]/1-1-2-2-2/a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?
    }

    private static String toUnknownForm(String strWURCS) throws WURCSException {
        WURCSFactory factory = new WURCSFactory(strWURCS);
        WURCSArray array = factory.getArray();

        // copy array without UniqueRES
        WURCSArray arrayNew = new WURCSArray(
                array.getVersion(),
                array.getUniqueRESCount(),
                array.getRESCount(),
                array.getLINCount(),
                array.isComposition()
        );
        for (RES res : array.getRESs())
            arrayNew.addRES(res);
        for (LIN lin : array.getLINs())
            arrayNew.addLIN(lin);

        // add UniqueRES converted to unknown form
        WURCSSubsumptionConverter converter = new WURCSSubsumptionConverter();
        for (UniqueRES ures : array.getUniqueRESs()) {
            MS ms = converter.convertAnomericCarbonToUncertain(ures);
            UniqueRES uresNew = new UniqueRES(ures.getUniqueRESID(), ms);
            arrayNew.addUniqueRES(uresNew);
        }

        // back to WURCS
        WURCSFactory factoryNew = new WURCSFactory(arrayNew);
        return factoryNew.getWURCS();
    }
}
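To compare the two outputs without reading the long strings by eye, a small standard-library check along these lines may help. It is only a sketch: the class and method names (UniqueResCheck, uniqueResidues, allUnknownForm) are made up for illustration and are not part of WURCSFramework, and the "unknown form" test is a heuristic based solely on the strings in this thread (residue starts with u and carries no -1x anomer suffix). Note that a UniqueRES can itself contain a slash (2*NCC/3=O), so the sketch matches the brackets instead of splitting the string on "/".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UniqueResCheck {

    // Matches one bracketed UniqueRES entry, e.g. [uxxxxh_2*NCC/3=O].
    private static final Pattern UNIQUE_RES = Pattern.compile("\\[([^\\]]*)\\]");

    // Returns the contents of every [...] entry, in order of appearance.
    static List<String> uniqueResidues(String wurcs) {
        List<String> residues = new ArrayList<>();
        Matcher m = UNIQUE_RES.matcher(wurcs);
        while (m.find()) {
            residues.add(m.group(1));
        }
        return residues;
    }

    // Heuristic (assumption drawn from the strings in this thread only):
    // a residue counts as "unknown form" if it starts with 'u' and has
    // no "-1x" anomer suffix. Returns false if no residue is found.
    static boolean allUnknownForm(String wurcs) {
        List<String> residues = uniqueResidues(wurcs);
        if (residues.isEmpty()) {
            return false;
        }
        for (String residue : residues) {
            if (!residue.startsWith("u") || residue.contains("-1x")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String ringForm = "WURCS=2.0/2,5,4/[axxxxh-1x_1-5_2*NCC/3=O][axxxxh-1x_1-5]/1-1-2-2-2/"
                + "a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?";
        String unknownForm = "WURCS=2.0/2,5,4/[uxxxxh_2*NCC/3=O][uxxxxh]/1-1-2-2-2/"
                + "a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?_a?|b?|c?|d?|e?}-{a?|b?|c?|d?|e?";

        System.out.println(uniqueResidues(ringForm));
        System.out.println(allUnknownForm(ringForm));    // false
        System.out.println(allUnknownForm(unknownForm)); // true
    }
}
```

This only inspects the string; the actual conversion should still go through WURCSSubsumptionConverter as in the code above.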
Note that [uxxxxm] is dHex, not Fuc or Hex. So if you want to get [uxxxxm], please use dHex instead of Fuc.
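As an illustration of that note, the composition string for Hex3HexNAc2Fuc1 could be built with Fuc renamed to dHex before it is passed to CompositionUtils.parse. The following is a rough sketch using only the standard library: the class CompositionString and the method toCompositionString are hypothetical helpers, the "Name:count|Name:count" layout is taken from the "Hex:3|HexNAc:2" example above, and the exact key spelling "dHex" accepted by the parser is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class CompositionString {

    // Builds a "Name:count|Name:count" string, renaming Fuc to dHex.
    // Counts are merged in case both Fuc and dHex appear in the input.
    static String toCompositionString(Map<String, Integer> counts) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            String name = entry.getKey().equals("Fuc") ? "dHex" : entry.getKey();
            merged.merge(name, entry.getValue(), Integer::sum);
        }
        StringJoiner joiner = new StringJoiner("|");
        for (Map.Entry<String, Integer> entry : merged.entrySet()) {
            joiner.add(entry.getKey() + ":" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the insertion order of the residue names.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Hex", 3);
        counts.put("HexNAc", 2);
        counts.put("Fuc", 1);
        System.out.println(toCompositionString(counts)); // Hex:3|HexNAc:2|dHex:1
    }
}
```

The resulting string would then take the place of strComposition in the example above.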
Best, Masaaki
Added three composition type options and used the above code for one of them.
Please share a WURCS string for a composition you create, for example Hex3Fuc1HexNAc2.