opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Ambiguous drug names mapping to multiple IDs #3217

Open ireneisdoomed opened 9 months ago

ireneisdoomed commented 9 months ago

Describe the bug Multiple entries in ChEMBL share the same name even though they refer to different compounds.

Observed behaviour In the molecule dataset (derived from ChEMBL data) we have 125 chemical names that each map to multiple IDs. Usually an ID in ChEMBL represents a single chemical structure defined by its InChIKey. If we take carbidopa as an example, we have 2 IDs for this entity: CHEMBL1201236 and CHEMBL1200748.

The latter structure is the one commonly referred to as carbidopa, while the former is carbidopa monohydrate. Since these are 2 chemically different structures, the difference should be reflected in the name as well.
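For anyone wanting to reproduce the check, here is a minimal sketch of the grouping logic in plain Python. It assumes each molecule record exposes `id` and `name` fields; the helper name `find_ambiguous_names` and the case/whitespace normalisation are illustrative choices, not necessarily what was used to produce the table in this issue.

```python
from collections import defaultdict


def find_ambiguous_names(molecules):
    """Return {name: sorted list of IDs} for every name shared by more than one ID."""
    ids_by_name = defaultdict(set)
    for mol in molecules:
        # Normalise case and surrounding whitespace (an assumption of this sketch).
        ids_by_name[mol["name"].strip().upper()].add(mol["id"])
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


# Current state, as reported in this issue: both IDs carry the same name.
current = [
    {"id": "CHEMBL1201236", "name": "CARBIDOPA"},
    {"id": "CHEMBL1200748", "name": "CARBIDOPA"},
]

# State after the renaming requested under "Expected behaviour".
expected = [
    {"id": "CHEMBL1201236", "name": "CARBIDOPA MONOHYDRATE"},
    {"id": "CHEMBL1200748", "name": "CARBIDOPA"},
]

print(find_ambiguous_names(current))   # {'CARBIDOPA': ['CHEMBL1200748', 'CHEMBL1201236']}
print(find_ambiguous_names(expected))  # {}
```

Running the same function over the full molecule dataset after a new ChEMBL release would give a quick count of how many ambiguous names remain.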

Full list of cases

```
| name | ids | inchiKeys |
| 552-02 | [CHEMBL4297254, CHEMBL212432] | [NTRKMGDUWYBLMS-UHFFFAOYSA-N, ZYOATBKPEFZNGP-UHFFFAOYSA-N] |
| ADL-5747 | [CHEMBL4297345, CHEMBL561339] | [ALGHKWSXJUQNJJ-UHFFFAOYSA-N, XROFSGVHHSJMMG-UHFFFAOYSA-N] |
| AFIMOXIFENE | [CHEMBL489, CHEMBL10041] | [TXUZVZSFRXZGTL-QPLCGJKRSA-N, DODQJNMQWMSYGS-QPLCGJKRSA-N] |
| ALUMINUM CHLORIDE | [CHEMBL3833401, CHEMBL3833314] | [JGDITNMASUZKPW-UHFFFAOYSA-K, VSCWAEJMTAWNJL-UHFFFAOYSA-K] |
| AMMONIUM TETRATHIOMOLYBDATE | [CHEMBL4301078, CHEMBL3775741] | [] |
| AMMONIUM TRICHLORO(DIOXOETHYLENE-O,O'-)TELLURATE | [CHEMBL3392104, CHEMBL3546168] | [ITUKXRCBQPSCQU-UHFFFAOYSA-N, VSJCJBQENUBFJC-UHFFFAOYSA-N] |
| AMMONIUM TRICHLOROTELLURATE | [CHEMBL4597200, CHEMBL4584965] | [DXZFFLRJVDZCMT-UHFFFAOYSA-N, NTCZXUFXVRQSMX-UHFFFAOYSA-N] |
| ARSENIC TRIOXIDE | [CHEMBL1200978, CHEMBL2362016] | [KTTMEOWBIWLMSE-UHFFFAOYSA-N, IKWTVSLWAPBBKU-UHFFFAOYSA-N] |
| BENIDIPINE | [CHEMBL2074972, CHEMBL2105555] | [QZVNQOLPLYWLHQ-UHFFFAOYSA-N, QZVNQOLPLYWLHQ-ZEQKJWHPSA-N] |
| BESIGLIPTIN TOSYLATE | [CHEMBL4300558, CHEMBL4297430] | [VBMNNLBIPDGFBS-XGXURLCGSA-N, WWRKIQZKFGMESZ-SFUJAXRZSA-N] |
| BGT-226 | [CHEMBL3545096, CHEMBL3218578] | [YUXMAKUNSXIEKN-BTJKTKAUSA-N, BMMXYEBLEBULND-UHFFFAOYSA-N] |
| BMS-823778 | [CHEMBL4301600, CHEMBL4297457] | [PTIFVLOBVCIMKL-UHFFFAOYSA-N, MQBBSMNIRWXALO-UHFFFAOYSA-N] |
| BMS-863233 | [CHEMBL3544944, CHEMBL3544943] | [UNDKJUKLBNARIZ-UHFFFAOYSA-N, JJWLXRKVUJDJKG-UHFFFAOYSA-N] |
| BMS-986202 | [CHEMBL4789639, CHEMBL4803710] | [BBRMAVGRWHNAIG-UHFFFAOYSA-N, BBRMAVGRWHNAIG-FIBGUPNXSA-N] |
| C-1311 | [CHEMBL3545337, CHEMBL338604] | [XJYNBZQTAZDMHZ-UHFFFAOYSA-N, CUNDRHORZHFPLY-UHFFFAOYSA-N] |
| CALCIFEDIOL | [CHEMBL3544909, CHEMBL1040] | [JWUBBDSIWDLEOM-DTOXIADCSA-N, WRLFSJXJGJBFJQ-WPUCQFJDSA-N] |
| CANAGLIFLOZIN | [CHEMBL4594217, CHEMBL2048484] | [RCCZPUWDQVUJAB-FVYJGOGTSA-N, XTNGUQKDFGDXSJ-ZXGKGEBGSA-N] |
| CARBACHOL | [CHEMBL14, CHEMBL965] | [AIXAANGOTKPUOY-UHFFFAOYSA-N, VPJXQGSRWJZDOB-UHFFFAOYSA-O] |
| CARBIDOPA | [CHEMBL1201236, CHEMBL1200748] | [QTAOMKOIBXZKND-PPHPATTJSA-N, TZFNLOMSOLWIDK-JTQLQIEISA-N] |
| CEFACLOR | [CHEMBL680, CHEMBL1201018] | [QYIYFLOTGYLRGG-GPCCPHFNSA-N, WKJGTOYAEQDNIA-IOOZKYRYSA-N] |
| CEFAMANDOLE NAFATE | [CHEMBL1201218, CHEMBL1618] | [ICZOIXFFVKYXOM-YCLOEFEOSA-M, RRJHESVQVSRQEX-SUYBPPKGSA-N] |
| CEFPROZIL | [CHEMBL3301800, CHEMBL3184906] | [WDLWHQDACQUCJR-PBFPGSCMSA-N, ALYUMNAHLSSTOU-PFBPGKLMSA-N] |
| CEFTOBIPROLE MEDOCARIL | [CHEMBL4297101, CHEMBL1652606] | [MFAWUGGPPMTWPU-INRVRKGJSA-M, HFTSMHTWUFCYMJ-FDNJTQOMSA-N] |
| CENISERTIB | [CHEMBL1967878, CHEMBL1614709] | [KSOVGRCOLZZTPF-UHFFFAOYSA-N, KSOVGRCOLZZTPF-QMKUDKLTSA-N] |
| CEPHALEXIN | [CHEMBL1200544, CHEMBL1727] | [AVGYWQBCYZHHPN-CYJZLJNKSA-N, ZAIPMKNFIOOWCQ-UEKVPHQBSA-N] |
| CERIUM OXALATE | [CHEMBL3833363, CHEMBL3990854] | [] |
| CFI-400945 | [CHEMBL3408947, CHEMBL4297462] | [AQCDFVLWUWJREO-WQVJSASDSA-N, DADASRPKWOGKCU-FVTQAUBDSA-N] |
| CHROMIUM PICOLINATE | [CHEMBL3185022, CHEMBL4297201] | [] |
| CISPLATIN | [CHEMBL2068237, CHEMBL11359] | [] |
| CLOFEZONE | [CHEMBL3833391, CHEMBL3833413] | [ZMNFDNGOKBWIAO-UHFFFAOYSA-N, JNHZIFUHHHJTFM-UHFFFAOYSA-N] |
| CLOPENTHIXOL | [CHEMBL3989892, CHEMBL87385] | [WFPIAZLQTJBIFN-BLLMUTORSA-N, WFPIAZLQTJBIFN-UHFFFAOYSA-N] |
| CONTEZOLID ACEFOSAMIL | [CHEMBL3989966, CHEMBL4301914] | [YCRAGJLWFBGKFE-CYBMUJFWSA-N, JANNTEAGZXJITO-BTQNPOSSSA-M] |
| CP-461 | [CHEMBL4243974, CHEMBL4068108] | [KGXPDNOBLLACKL-BWLGBDCWSA-N, NVCAMOJXQVJSOM-XKZIYDEJSA-N] |
| CPP-115 | [CHEMBL146927, CHEMBL4297273] | [CBSRETZPFOBWNG-UCORVYFPSA-N, PGVAKHBXGXBOSB-WINKWTMZSA-N] |
| CYCLOPHOSPHAMIDE | [CHEMBL88, CHEMBL1200796] | [CMSMOCZEIVJLDB-UHFFFAOYSA-N, PWOQRKCAHTVFLB-UHFFFAOYSA-N] |
| DACOMITINIB | [CHEMBL2105719, CHEMBL2110732] | [LVXJQMNHJWSHET-AATRIKPKSA-N, BSPLGGCPNTZPIH-IPZCTEOASA-N] |
| DEUCRAVACITINIB | [CHEMBL4435170, CHEMBL4596392] | [BZZKEPGENYLQSC-UHFFFAOYSA-N, BZZKEPGENYLQSC-FIBGUPNXSA-N] |
| DIMESNA | [CHEMBL2009034, CHEMBL2104318] | [KQYGMURBTJPBPQ-UHFFFAOYSA-L, BYUKOOOZTSTOOH-UHFFFAOYSA-N] |
| DIPYRONE | [CHEMBL3989803, CHEMBL461522] | [LVWZTYCIRDMTEY-UHFFFAOYSA-N, UNZIDPIPYUMVPA-UHFFFAOYSA-M] |
| DOCETAXEL | [CHEMBL3545252, CHEMBL92] | [XCDIRYDKECHIPE-QHEQPUDQSA-N, ZDZOTLJHXYCWBA-VCVYQWHSSA-N] |
| EDETATE CALCIUM DISODIUM | [CHEMBL3989522, CHEMBL1200375] | [JHECKPXUCKQCSH-UHFFFAOYSA-J, SHWNNYZBHZIQQV-UHFFFAOYSA-J] |
| EDETATE DISODIUM | [CHEMBL1749, CHEMBL3989507] | [ZGTMUACCHSMWAC-UHFFFAOYSA-L, OVBJJZOQPCKUOR-UHFFFAOYSA-L] |
| ENALAPRILAT | [CHEMBL3989406, CHEMBL577] | [MZYVOFLIPYDBGD-MLZQUWKJSA-N, LZFZMUMEGBBDTC-QEJZJMRPSA-N] |
| EVT-101 | [CHEMBL3545350, CHEMBL3545349] | [OJBLXSPBJMGZDN-UHFFFAOYSA-N, BOVUHBFXPNLTKF-UHFFFAOYSA-N] |
| FENOBAM | [CHEMBL2103779, CHEMBL239800] | [DWPQODZAOSWNHB-UHFFFAOYSA-N, UNFQKKSADLVQJE-UHFFFAOYSA-N] |
| FERRIC CITRATE | [CHEMBL3991241, CHEMBL3301597] | [] |
| FERROUS GLUCONATE | [CHEMBL3833303, CHEMBL3991443] | [] |
| GOLD SODIUM THIOSULFATE | [CHEMBL3990248, CHEMBL3833379] | [] |
| Hydroxy-Phenyl-Acetic Acid Anion | [CHEMBL58910, CHEMBL292411] | [IWYDHOAUDWTVEP-SSDOTTSWSA-N, IWYDHOAUDWTVEP-ZETCQYMHSA-N] |
| IMIPENEM | [CHEMBL148, CHEMBL43708] | [GSOSVVULSKVSLQ-JJVRHELESA-N, ZSKVGTPCRGIANV-ZXFLCMHBSA-N] |
| INDOCYANINE GREEN | [CHEMBL1615807, CHEMBL1646, CHEMBL1201304] | [BDBMLMBYCXNVMC-UHFFFAOYSA-N, BDBMLMBYCXNVMC-UHFFFAOYSA-O, MOFVSTNWEDAEEK-UHFFFAOYSA-M] |
| INOSINE PRANOBEX | [CHEMBL3833405, CHEMBL3833327] | [JBVWKTQYMFTKMW-MSQVLRTGSA-N, YLDCUKJMEKGGFI-QCSRICIXSA-N] |
| ISOSULFAN BLUE | [CHEMBL1201275, CHEMBL1200859] | [YFKDCGWIINMRQY-UHFFFAOYSA-N, NLUFDZBOHMOBOE-UHFFFAOYSA-M] |
| ITX-5061 | [CHEMBL1208829, CHEMBL3402567] | [UUROSJLZNDSXRF-UHFFFAOYSA-N, ICIJBYYMEBOTQP-UHFFFAOYSA-N] |
| JNJ-17166864 | [CHEMBL4297292, CHEMBL397647] | [BXCVNDLGUWFNBK-UHFFFAOYSA-O, VALRCWBOFWDEDE-UHFFFAOYSA-N] |
| JNJ-18038683 | [CHEMBL4297293, CHEMBL4205783] | [UKJPMZGILXATGT-UHFFFAOYSA-N, DIQZMBPDLFAJLK-UHFFFAOYSA-N] |
| JNJ-61393215 | [CHEMBL4776719, CHEMBL4802293] | [HUKWIAXQBOHZIX-JEBQAFNWSA-N, HUKWIAXQBOHZIX-USKNZQBOSA-N] |
| KW-2478 | [CHEMBL4297298, CHEMBL4300557] | [VFUXSYAXEKYYMB-UHFFFAOYSA-N, CKMGYWHSTADSIG-UHFFFAOYSA-N] |
| L-778123 | [CHEMBL279433, CHEMBL4297396] | [JNUGFGAVPBYSHF-UHFFFAOYSA-N, YNBSQYGTJLIPJS-UHFFFAOYSA-N] |
| LISINOPRIL | [CHEMBL419213, CHEMBL1237] | [CZRQXSDBMCMPNJ-ZUIPZQNBSA-N, RLAWWYSOJDYHDC-BZSNNMDCSA-N] |
| LITHIUM CITRATE | [CHEMBL2103738, CHEMBL1201170] | [WJSIUCDMWSDDCE-UHFFFAOYSA-K, HXGWMCJZLNWEBC-UHFFFAOYSA-K] |
| LORACARBEF | [CHEMBL1200610, CHEMBL1013] | [JAPHQRWPEGVNBT-UTUOFQBUSA-N, GPYKKBAAPVOCIW-HSASPSRMSA-N] |
| MAGNESIUM CHLORIDE | [CHEMBL2219642, CHEMBL3185229] | [TWRXJAOTZQYOKJ-UHFFFAOYSA-L, DHRRIBDTHFBPNG-UHFFFAOYSA-L] |
| MANGANESE CHLORIDE | [CHEMBL1200548, CHEMBL1200693] | [] |
| MANGANESE SULFATE | [CHEMBL2103742, CHEMBL1200557] | [] |
| MERCAPTOPURINE | [CHEMBL1425, CHEMBL1200751] | [WFFQYWAAEWLHJC-UHFFFAOYSA-N, GLVAUDGFNGKCSF-UHFFFAOYSA-N] |
| MESNA | [CHEMBL1098319, CHEMBL975] | [XOGTZOOQQBDUSI-UHFFFAOYSA-M, ZNEWHQLOPFWXOF-UHFFFAOYSA-N] |
| METFORMIN XR | [CHEMBL494397, CHEMBL1187231] | [GWWIHNWMWIJWRT-UHFFFAOYSA-N, LJVNRPAERZRHDF-UHFFFAOYSA-N] |
| METHYLENE BLUE | [CHEMBL191083, CHEMBL550495] | [RBTBFTRPCNLSDE-UHFFFAOYSA-N, XQAXGZLFSSPBMK-UHFFFAOYSA-M] |
| MIFAMURTIDE | [CHEMBL2107354, CHEMBL2111100] | [SARBMGXGWXCXFW-GJHVZSAVSA-M, JMUHBNWAORSSBD-WKYWBUFDSA-N] |
| MK-0941 | [CHEMBL3580737, CHEMBL4297302] | [PIDNRTWDGDJKSQ-UQKRIMTDSA-N, KJSGTWFWVTYPFZ-AWEZNQCLSA-N] |
| MP-412 | [CHEMBL5095492, CHEMBL5095024] | [JZDDKMMYGMSSFH-UHFFFAOYSA-N, BRZCZOWSDKKGGK-UHFFFAOYSA-N] |
| NAVARIXIN | [CHEMBL216981, CHEMBL2103864] | [RXIUEIPPLAFSDF-CYBMUJFWSA-N, AFTCWZSEWTXWTL-BTQNPOSSSA-N] |
| NOX-100 | [CHEMBL4297472, CHEMBL4301787] | [IBVJFULICYLKCE-BDVNFPICSA-N, PLQRBFAACWRSKF-LJTMIZJLSA-M] |
| OCTINOXATE | [CHEMBL1200608, CHEMBL3183184] | [YBGZDTIWKVFICR-JLHYYAGUSA-N, YBGZDTIWKVFICR-UHFFFAOYSA-N] |
| OLEOYL | [CHEMBL189959, CHEMBL191896] | [WRGQSWVCFNIUNZ-MDZDMXLPSA-N, XGRLSUFHELJJAB-RRABGKBLSA-M] |
| OUABAIN | [CHEMBL1889436, CHEMBL222863] | [TYBARJRCFHUHSN-DMJRSANLSA-N, LPMXVESGRSUGHW-HBYQJFLCSA-N] |
| OXYPHENBUTAZONE | [CHEMBL1228, CHEMBL3989676] | [CNDQSXOVEQXJOE-UHFFFAOYSA-N, HFHZKZSRXITVMK-UHFFFAOYSA-N] |
| PAPAVERETUM | [CHEMBL3833334, CHEMBL3833406] | [LGFMXOTUSSVQJV-LYAFVJFSSA-N, IAWXESBTFLIKNX-MMWHECRDSA-N] |
| PD-0166285 | [CHEMBL49120, CHEMBL3545196] | [NADLBPWBFGTESN-UHFFFAOYSA-N, IFPPYSWJNWHOLQ-UHFFFAOYSA-N] |
| PENTAZOCINE | [CHEMBL100116, CHEMBL560] | [VOKSWYLNZZRQPF-UHFFFAOYSA-N, VOKSWYLNZZRQPF-GDIGMMSISA-N] |
| PERAMIVIR | [CHEMBL139367, CHEMBL3989402] | [RFUCJKFZFXNIGB-ZBBHRWOZSA-N, XRQDFNLINLXZLB-CKIKVBCHSA-N] |
| PINACIDIL | [CHEMBL1159, CHEMBL1200338] | [AFJCNBBHEVLGCZ-UHFFFAOYSA-N, IVVNZDGDKPTYHK-UHFFFAOYSA-N] |
| PIPERAZINE CITRATE | [CHEMBL3990694, CHEMBL3989678] | [JDDHUROHDHPVIO-UHFFFAOYSA-N, LWMBPKJYEQGDLN-UHFFFAOYSA-N] |
| PITTSBURGH COMPOUND B | [CHEMBL207456, CHEMBL93124] | [ZQAQXZBSGZUUNL-UHFFFAOYSA-N, ZQAQXZBSGZUUNL-BJUDXGSMSA-N] |
| POTASSIUM CITRATE | [CHEMBL1200458, CHEMBL3989822] | [QEEAPRPFLLJWCF-UHFFFAOYSA-K, PJAHUDTUZRZBKM-UHFFFAOYSA-K] |
| PRX-08066 | [CHEMBL4297322, CHEMBL513994] | [RPYIKXHIQXRXEM-WLHGVMLRSA-N, IENZFHBNCRQMNP-UHFFFAOYSA-N] |
| PYRIDOXAL PHOSPHATE | [CHEMBL82202, CHEMBL3181870] | [NGVDGCNFYWLIFO-UHFFFAOYSA-N, CEEQUQSGVRRXQI-UHFFFAOYSA-N] |
| QUINAGOLIDE | [CHEMBL290962, CHEMBL2218861] | [GDFGTRDCCWFXTG-UHFFFAOYSA-N, GDFGTRDCCWFXTG-ZIFCJYIRSA-N] |
| R-1487 | [CHEMBL1230122, CHEMBL1766582] | [KKKRKRMVJRHDMG-UHFFFAOYSA-N, RQHSAIGGUWVOBG-UHFFFAOYSA-N] |
| R-343 | [CHEMBL2170582, CHEMBL3545340] | [ILIHENOWEIBEQV-UHFFFAOYSA-N, MOXXQFNQDDSJHT-UHFFFAOYSA-N] |
| REGADENOSON | [CHEMBL317052, CHEMBL3989695] | [CDQVVPUXSPZONN-WPPLYIOHSA-N, LZPZPHGJDAGEJZ-AKAIJSEGSA-N] |
| RELEBACTAM | [CHEMBL3301605, CHEMBL3112741] | [SMOBCLHAZXOKDQ-ZJUUUORDSA-N, TWFRCSHLWKJBQH-UXQCFNEQSA-N] |
| RG3487 | [CHEMBL5095032, CHEMBL2151439] | [TXCYUSKWBHUVEP-CYBMUJFWSA-N, CMRLNEYJEPELSM-BTQNPOSSSA-N] |
| RITODRINE | [CHEMBL785, CHEMBL83063] | [IOVGROKTTNBUGK-SJCJKPOMSA-N, IOVGROKTTNBUGK-UHFFFAOYSA-N] |
| SAR-407899 | [CHEMBL1667969, CHEMBL3545341] | [KMNVOGVCCZNVNU-UHFFFAOYSA-N, IPEXHQGMTHOKQV-UHFFFAOYSA-N] |
| SAXAGLIPTIN | [CHEMBL2103745, CHEMBL385517] | [AFNTWHMDBNQQPX-NHKADLRUSA-N, QGJUIPDUBHWZPV-SGTAVMJGSA-N] |
| SILVER DIAMMINE FLUORIDE | [CHEMBL4300152, CHEMBL4297206] | [REYHXKZHIMGNSE-UHFFFAOYSA-M, FJKGRAZQBBWYLG-UHFFFAOYSA-M] |
| SIMMITECAN | [CHEMBL4301491, CHEMBL4297325] | [UWNITDCAZZJJFF-GXUZKUJRSA-N, XPVBLGRILRVSLF-UMSFTDKQSA-N] |
| SITAFLOXACIN | [CHEMBL3989504, CHEMBL108821] | [ANCJYRJLOUSQBW-JJZGMWGRSA-N, PNUZDKCDAWUEGK-CYZMBNFOSA-N] |
| SODIUM PROPIONATE | [CHEMBL3989705, CHEMBL500826] | [JXKPEJDQGNYQSM-UHFFFAOYSA-M, HOAUAOBUGFYWMK-UHFFFAOYSA-M] |
| SODIUM STIBOGLUCONATE | [CHEMBL3991035, CHEMBL2079699] | [] |
| SODIUM SULFATE | [CHEMBL3989856, CHEMBL233406] | [PMZURENOXWZQFD-UHFFFAOYSA-L, RSIJVJUOQBWMIM-UHFFFAOYSA-L] |
| SPINOSAD | [CHEMBL4297065, CHEMBL2040681] | [JFLRKDZMHNBDQS-UCQUSYKYSA-N, JFLRKDZMHNBDQS-SGSTVUCESA-N] |
| SPIRADOLINE | [CHEMBL118865, CHEMBL70586] | [NYKCGQQJNVPOLU-UHFFFAOYSA-N, NYKCGQQJNVPOLU-ONTIZHBOSA-N] |
| SR16234 | [CHEMBL3545211, CHEMBL3545210] | [VOHOCSJONOJOSD-RQIKCBEZSA-N, OHCPNHFLPCVWRG-YQOGLFJVSA-N] |
| STREPTODUOCIN | [CHEMBL3833407, CHEMBL3833338] | [AWBXTNNIECFIHT-XZQQZIICSA-N, CPCMMVSHXFLUOU-HHVGSZDJSA-N] |
| SUCRALFATE | [CHEMBL2367706, CHEMBL3989780] | [WXOMTJVVIMOXJL-BOBFKVMVSA-A, JTZPPHUZZDKEOC-RBQAPOGLSA-A] |
| TACROLIMUS | [CHEMBL269732, CHEMBL3989887] | [NWJQLQGQZSIBAF-MLAUYUEBSA-N, QJJXYPPXXYFBGM-LFZNUXCKSA-N] |
| TALC | [CHEMBL3990276, CHEMBL3989756] | [XBPUDTAATCFDRE-UHFFFAOYSA-N, FPAFDBFIGPHWGO-UHFFFAOYSA-N] |
| TAS-303 | [CHEMBL5095507, CHEMBL5095197] | [FMRVAGKPRZLTRG-KORWVGAPSA-N, FMRVAGKPRZLTRG-UHFFFAOYSA-N] |
| TETRACHLORODECAOXIDE | [CHEMBL4299960, CHEMBL3707387] | [VOWOEBADKMXUBU-UHFFFAOYSA-J, IVFGWWGNVIZDAS-UHFFFAOYSA-N] |
| TEZACITABINE | [CHEMBL2105467, CHEMBL3989496] | [GFFXZLZWLOBBLO-ASKVSEFXSA-N, XPYQFIISZQCINN-QVXDJYSKSA-N] |
| TEZAMPANEL | [CHEMBL14935, CHEMBL3989703] | [ZXFRFPSZAKNPQQ-YTWAJWBKSA-N, LNDYQNTTYXLTNH-RTBBDAMFSA-N] |
| THIAMINE | [CHEMBL1547, CHEMBL1588] | [JZRWCGZRTZMZEH-UHFFFAOYSA-N, MYVIATVLJGTBFV-UHFFFAOYSA-M] |
| TOCOPHEROL ACETATE | [CHEMBL1047, CHEMBL3989859] | [ZAKOWWREFLAJOT-ADUHFSDSSA-N, ZAKOWWREFLAJOT-CEFNRUSXSA-N] |
| TOFOGLIFLOZIN | [CHEMBL2105711, CHEMBL2110731] | [VWVKUNOPTJGDOB-BDHVOXNPSA-N, ZXOCGDDVNPDRIW-NHFZGCSJSA-N] |
| TRANYLCYPROMINE | [CHEMBL3989843, CHEMBL313833] | [IGLYMJRIWWIQQE-QUOODJBBSA-N, AELCINSCMGFISI-UHFFFAOYSA-N] |
| U-50488 METHANE SULFONATE | [CHEMBL482811, CHEMBL441765] | [OJPHNZCUXUUVKU-JAXOOIEVSA-N, VQLPLYSROCPWFF-QZTJIDSGSA-N] |
| UT-231B | [CHEMBL4297332, CHEMBL4300049] | [DZRQMHSNVNTFAQ-IVGJVWKCSA-N, BBIRATBJZBAXFS-ZOBORPQBSA-N] |
| VCH-759 | [CHEMBL1673159, CHEMBL1741078] | [CKVYPTIWERNFSU-UHFFFAOYSA-N, SSERCMQZZYTNBY-UHFFFAOYSA-M] |
| VITAMIN E | [CHEMBL3989727, CHEMBL47] | [GVJHHUAWPYXKBD-IEOSBIPESA-N] |
| YM-543 | [CHEMBL2397450, CHEMBL4297452] | [AGJJCLBOHJQGFA-ZQGJOIPISA-N, UKOOBSDARBTSHN-NGOMLPPMSA-M] |
| ZD-4190 | [CHEMBL3544937, CHEMBL281872] | [YBTGTVGEKMZEQX-UHFFFAOYSA-N, IOFHDGSCWJNVRZ-UHFFFAOYSA-N] |
| ZINC ACETATE | [CHEMBL1200928, CHEMBL3184986] | [DJWUNCQRNNEAKC-UHFFFAOYSA-L, BEAZKUGSCHFXIQ-UHFFFAOYSA-L] |
```

Expected behaviour The name for CHEMBL1201236 should be carbidopa monohydrate and the name for CHEMBL1200748 should be carbidopa. I would expect a similar review to be done for the other 124 names.

Additional context These changes should be made available through ChEMBL. No action on our side.

FionaEBI commented 9 months ago

Hi Irene, Thanks for the comment. Yes, we are aware of these issues. These are typically examples where the salt and parent compounds have been given the same name, often because they are called by the same name in the source document. Careful examination of any chemical structure displayed within the source information (e.g. drug label) will show that the child structure is the hydrated form, the hydrochloride salt, etc., while the parent structure is the anhydrous form, etc. It takes a significant amount of manual curation for each individual case to untangle the detail in DrugBase (& ChEMBL) while at the same time accurately reflecting the original source documents (which may in some cases have conflicting data...). We have similar issues where one synonym or trade name has been mapped to more than one compound.

However, you will be pleased to know that for the next ChEMBL release we have put a huge amount of effort into improving these duplicate name issues, and duplicate synonyms, so you should see a marked improvement in ChEMBL 34.

We can discuss some examples at one of the next monthly meetings.
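As a possible aid for picking examples to discuss, the rough triage sketch below splits the listed cases by whether the InChIKeys under one name share their first block. In a standard InChIKey the first 14 characters hash the molecular skeleton (connectivity), so a shared first block suggests variants of the same skeleton (for example stereoisomers or isotopologues), while differing first blocks are consistent with the salt/hydrate versus parent pattern described above. This is only a heuristic, and the `triage` helper and its labels are hypothetical names for this illustration; it does not replace manual curation.

```python
def skeleton(inchikey):
    """Return the first InChIKey block (14 characters, hash of the connectivity)."""
    return inchikey.split("-")[0]


def triage(inchikeys):
    """Roughly label a group of InChIKeys that share one compound name."""
    if len(inchikeys) < 2:
        # Empty or incomplete key lists cannot be compared automatically.
        return "needs manual review"
    if len({skeleton(key) for key in inchikeys}) == 1:
        return "same skeleton"
    return "different skeleton"


# InChIKey lists copied from the table in this issue.
cases = {
    "BENIDIPINE": ["QZVNQOLPLYWLHQ-UHFFFAOYSA-N", "QZVNQOLPLYWLHQ-ZEQKJWHPSA-N"],
    "CARBIDOPA": ["QTAOMKOIBXZKND-PPHPATTJSA-N", "TZFNLOMSOLWIDK-JTQLQIEISA-N"],
    "CISPLATIN": [],
}

for name, keys in cases.items():
    print(name, "->", triage(keys))
```

On these three rows the sketch labels BENIDIPINE as same skeleton, CARBIDOPA as different skeleton, and CISPLATIN as needing manual review because no InChIKeys are listed for it.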