MonomerLibrary / monomers

dictionary of monomers and links
GNU Lesser General Public License v3.0

remove obsoleted/nonexistent monomers in the CCD #30

Closed keitaroyam closed 1 year ago

keitaroyam commented 1 year ago

These monomers are flagged as OBS in the CCD

7DI 9I7 BAM CH2 DUM H P1P S TDP TNY X4Z

and these do not exist in the CCD

0D5 0E2 0E3 0E4 0PN 2M6 2SD 2XV 2Z9 3PV 8VQ 8VZ 95A A67 A9A AC AD0 AEB AIP AMK ARY AS5 AT ATN AUA BAD BE BGA BGB BI BMY CBS CCG CGC CHY CST DBL DLQ DPI DVV DXI E19 EOV EPX ER F3G FGK FR FRN FXP GE GEA GFG GXX HE HF HS2 IHN KD1 KHO KTS L42 M13 M5G M5S MDM MFS MGU MOQ N31 NB NDL NE NSH P3M PAA PM PO PVD RA RAG RIG RN SI SN SUM SVC SYB TA TIP TM TXX UNF UNG URG VAG VAM VAN VAX VRS WSC WZ3 WZ5 X9A XD3 XLM

We should keep DUM and the single-atom entries, so here we delete:

7DI 9I7 BAM CH2 P1P TDP TNY X4Z 0D5 0E2 0E3 0E4 0PN 2M6 2SD 2XV 2Z9 3PV 8VQ 8VZ 95A A67 A9A AD0 AEB AIP AMK ARY AS5 ATN AUA BAD BGA BGB BMY CBS CCG CGC CHY CST DBL DLQ DPI DVV DXI E19 EOV EPX F3G FGK FRN FXP GEA GFG GXX HS2 IHN KD1 KHO KTS L42 M13 M5G M5S MDM MFS MGU MOQ N31 NDL NSH P3M PAA PVD RAG RIG SUM SVC SYB TIP TXX UNF UNG URG VAG VAM VAN VAX VRS WZ3 WZ5 X9A XD3 XLM

Most of them are polypeptides or polysaccharides, so I suppose they have been split into monomers.
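The selection rule described above (delete what is obsolete or missing in the CCD, but keep DUM and single-atom entries) could be sketched roughly as follows. This is only an illustration, not the script used for this issue: the function name `select_for_deletion` is hypothetical, and treating "single-atom entry" as "code equal to an element symbol" is an assumption. The input lists are a small subset of the codes above.

```python
# Element symbols, upper-cased to match monomer-code style (e.g. "HE", "SI").
# Assumption: a "single-atom entry" is one whose code is an element symbol.
ELEMENTS = set("""
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni
Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I
Xe Cs Ba La Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt
Au Hg Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr
Rf Db Sg Bh Hs Mt Ds Rg Cn Nh Fl Mc Lv Ts Og
""".upper().split())

KEEP_ALWAYS = {"DUM"}  # dummy entry that is kept on purpose


def select_for_deletion(obsolete, missing):
    """Return the sorted codes to delete (hypothetical helper).

    Candidates are codes that are obsolete or missing in the CCD;
    DUM and element-named (single-atom) entries are excluded.
    """
    candidates = set(obsolete) | set(missing)
    return sorted(
        code for code in candidates
        if code not in KEEP_ALWAYS and code not in ELEMENTS
    )


# Small subset of the lists in this comment, for illustration only.
obsolete = ["7DI", "DUM", "H", "S", "TDP"]
missing = ["0D5", "AC", "BE", "XLM"]
to_delete = select_for_deletion(obsolete, missing)
print(to_delete)
```

Applied to the full lists, a rule like this should be compared against the hand-made list above before anything is deleted, since the element-symbol test is only an approximation of "single-atom entry".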

GaribMurshudov commented 1 year ago

Yes, we should remove all obsolete monomers and anything that is not in the CCD (apart from links and modifications). Links and modifications might also need to be updated.

keitaroyam commented 1 year ago

The automatic test also checks modifications/links referring to undefined monomers. So they should be fine.
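The repository's actual test is not reproduced here. As a rough sketch of the kind of check meant, assuming links and modifications can be reduced to a mapping from record name to the monomer codes it refers to (the function name, the record names, and the data layout below are all hypothetical):

```python
def find_undefined_references(defined_monomers, records):
    """Map each record name to the monomer codes it references
    that are not defined in the library (hypothetical helper)."""
    defined = set(defined_monomers)
    problems = {}
    for name, referenced in records.items():
        unknown = sorted(set(referenced) - defined)
        if unknown:
            problems[name] = unknown
    return problems


# Illustrative data only: record names and contents are made up.
defined = {"ALA", "GLY", "SER"}
links = {
    "EXAMPLE-LINK-A": ["ALA", "GLY"],  # all referenced monomers exist
    "EXAMPLE-LINK-B": ["SER", "XLM"],  # XLM is not defined
}
problems = find_undefined_references(defined, links)
print(problems)
```

An empty result means no link or modification refers to a monomer that was removed.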

keitaroyam commented 1 year ago

It would be helpful. But the situation would be exactly the same as for unknown monomers, and users would simply be encouraged to create restraints. And if the compound is one that has been split into multiple monomers, shouldn't users even be encouraged to use those?