Closed Yijia-Zhou closed 4 years ago
import pandas as pd
import pymongo
import re
import random
from copy import deepcopy
import requests, json
icd2sct = pd.read_table('Download/Mapping/mapSnomedToIcd.txt')
icd2sct
index | conceptId | 亚目编码 | mapAdvice | term | 亚目名称 |
---|---|---|---|---|---|
0 | 240349003 | A00.0 | ALWAYS A00.0 | Cholera due to Vibrio cholerae O1 Classical bi... | 霍乱,由于01群霍乱弧菌,霍乱生物型所致 |
1 | 240349003 | A00.0 | ALWAYS A00.0 | Cholera caused by Vibrio cholerae O1 Classical... | 霍乱,由于01群霍乱弧菌,霍乱生物型所致 |
2 | 240349003 | A00.0 | ALWAYS A00.0 | Cholera caused by Vibrio cholerae O1 Classical... | 霍乱,由于01群霍乱弧菌,霍乱生物型所致 |
3 | 81020007 | A00.1 | ALWAYS A00.1 | Cholera due to Vibrio cholerae El Tor | 霍乱,由于01群霍乱弧菌,埃尔托生物型所致 |
4 | 81020007 | A00.1 | ALWAYS A00.1 | Cholera - Vibrio cholerae O1 El Tor biotype | 霍乱,由于01群霍乱弧菌,埃尔托生物型所致 |
5 | 81020007 | A00.1 | ALWAYS A00.1 | Cholera caused by Vibrio cholerae El Tor | 霍乱,由于01群霍乱弧菌,埃尔托生物型所致 |
6 | 81020007 | A00.1 | ALWAYS A00.1 | Cholera caused by Vibrio cholerae El Tor (diso... | 霍乱,由于01群霍乱弧菌,埃尔托生物型所致 |
7 | 63650001 | A00.9 | ALWAYS A00.9 | Cholera | 未特指的霍乱 |
8 | 63650001 | A00.9 | ALWAYS A00.9 | Cholera (disorder) | 未特指的霍乱 |
9 | 63650001 | A00.9 | ALWAYS A00.9 | Vibrio cholerae infection | 未特指的霍乱 |
10 | 240351004 | A00.9 | ALWAYS A00.9 | Cholera - O139 group Vibrio cholerae | 未特指的霍乱 |
11 | 240351004 | A00.9 | ALWAYS A00.9 | Cholera - O139 group Vibrio cholerae (disorder) | 未特指的霍乱 |
12 | 240351004 | A00.9 | ALWAYS A00.9 | Cholera due to Vibrio cholerae O139 | 未特指的霍乱 |
13 | 446672004 | A00.9 | ALWAYS A00.9 | Intestinal infection due to Vibrio cholerae no... | 未特指的霍乱 |
14 | 446672004 | A00.9 | ALWAYS A00.9 | Intestinal infection caused by Vibrio cholerae... | 未特指的霍乱 |
15 | 446672004 | A00.9 | ALWAYS A00.9 | Intestinal infection caused by Vibrio cholerae... | 未特指的霍乱 |
16 | 240350003 | A00.9 | ALWAYS A00.9 | Cholera - non-agglutinable vibrio | 未特指的霍乱 |
17 | 240350003 | A00.9 | ALWAYS A00.9 | Cholera - non-O1 group vibrio | 未特指的霍乱 |
18 | 240350003 | A00.9 | ALWAYS A00.9 | Cholera - non-O1 group vibrio (disorder) | 未特指的霍乱 |
19 | 447282003 | A00.9 | ALWAYS A00.9 | Intestinal infection due to Vibrio cholerae O1 | 未特指的霍乱 |
20 | 447282003 | A00.9 | ALWAYS A00.9 | Intestinal infection caused by Vibrio cholerae... | 未特指的霍乱 |
21 | 447282003 | A00.9 | ALWAYS A00.9 | Intestinal infection caused by Vibrio cholerae O1 | 未特指的霍乱 |
22 | 1084791000119106 | A01.0 | ALWAYS A01.0 | Cardiac disorder due to typhoid fever (disorder) | 伤寒 |
23 | 1084791000119106 | A01.0 | ALWAYS A01.0 | Cardiac disorder due to typhoid fever | 伤寒 |
24 | 192648008 | A01.0 | ALWAYS A01.0 | Meningitis due to typhoid fever | 伤寒 |
25 | 192648008 | A01.0 | ALWAYS A01.0 | Meningitis caused by typhoid fever | 伤寒 |
26 | 192648008 | A01.0 | ALWAYS A01.0 | Meningitis caused by typhoid fever (disorder) | 伤寒 |
27 | 402963009 | A01.0 | ALWAYS A01.0 | Typhoid exanthem (disorder) | 伤寒 |
28 | 402963009 | A01.0 | ALWAYS A01.0 | Typhoid exanthem | 伤寒 |
29 | 402963009 | A01.0 | ALWAYS A01.0 | Rose spots in salmonellosis | 伤寒 |
... | ... | ... | ... | ... | ... |
241911 | 443596009 | Z99.2 | ALWAYS Z99.2 | Dependence on peritoneal dialysis | 依赖肾透析 |
241912 | 442566005 | Z99.2 | ALWAYS Z99.2 | Surgically constructed radioulnar arteriovenou... | 依赖肾透析 |
241913 | 442566005 | Z99.2 | ALWAYS Z99.2 | Surgically constructed radioulnar arteriovenou... | 依赖肾透析 |
241914 | 11000731000119102 | Z99.2 | ALWAYS Z99.2 | Dependence on continuous ambulatory peritoneal... | 依赖肾透析 |
241915 | 11000731000119102 | Z99.2 | ALWAYS Z99.2 | Dependence on continuous ambulatory peritoneal... | 依赖肾透析 |
241916 | 105503008 | Z99.3 | ALWAYS Z99.3 | Dependence on wheelchair (finding) | 依赖轮椅 |
241917 | 105503008 | Z99.3 | ALWAYS Z99.3 | Dependence on wheelchair | 依赖轮椅 |
241918 | 371818002 | Z99.8 | ALWAYS Z99.8 | Patient on intra-aortic balloon pump assist (f... | 依赖其他可启动机器和装置 |
241919 | 371818002 | Z99.8 | ALWAYS Z99.8 | Patient on intra-aortic balloon pump assist | 依赖其他可启动机器和装置 |
241920 | 371819005 | Z99.8 | ALWAYS Z99.8 | Patient on circulatory assist (finding) | 依赖其他可启动机器和装置 |
241921 | 371819005 | Z99.8 | ALWAYS Z99.8 | Patient on circulatory assist | 依赖其他可启动机器和装置 |
241922 | 341000124104 | Z99.8 | ALWAYS Z99.8 | Peritoneal dialysis finding | 依赖其他可启动机器和装置 |
241923 | 341000124104 | Z99.8 | ALWAYS Z99.8 | Peritoneal dialysis finding (finding) | 依赖其他可启动机器和装置 |
241924 | 429091008 | Z99.8 | ALWAYS Z99.8 | Dependence on biphasic positive airway pressur... | 依赖其他可启动机器和装置 |
241925 | 429091008 | Z99.8 | ALWAYS Z99.8 | Dependence on biphasic positive airway pressur... | 依赖其他可启动机器和装置 |
241926 | 429091008 | Z99.8 | ALWAYS Z99.8 | Dependence on biphasic positive airway pressur... | 依赖其他可启动机器和装置 |
241927 | 89241000119108 | Z99.8 | ALWAYS Z99.8 | Dependence on nocturnal oxygen therapy | 依赖其他可启动机器和装置 |
241928 | 89241000119108 | Z99.8 | ALWAYS Z99.8 | Dependence on nocturnal oxygen therapy (finding) | 依赖其他可启动机器和装置 |
241929 | 716366009 | Z99.8 | ALWAYS Z99.8 | Requires continuous home oxygen supply (finding) | 依赖其他可启动机器和装置 |
241930 | 716366009 | Z99.8 | ALWAYS Z99.8 | Requires continuous home oxygen supply | 依赖其他可启动机器和装置 |
241931 | 89201000119106 | Z99.8 | ALWAYS Z99.8 | Dependence on supplemental oxygen when ambulating | 依赖其他可启动机器和装置 |
241932 | 89201000119106 | Z99.8 | ALWAYS Z99.8 | Dependence on supplemental oxygen when ambulat... | 依赖其他可启动机器和装置 |
241933 | 931000119107 | Z99.8 | ALWAYS Z99.8 | Dependence on supplemental oxygen | 依赖其他可启动机器和装置 |
241934 | 931000119107 | Z99.8 | ALWAYS Z99.8 | Dependence on supplemental oxygen (finding) | 依赖其他可启动机器和装置 |
241935 | 713655003 | Z99.8 | ALWAYS Z99.8 | Dependence on non-invasive ventilation | 依赖其他可启动机器和装置 |
241936 | 713655003 | Z99.8 | ALWAYS Z99.8 | Dependence on non-invasive ventilation (finding) | 依赖其他可启动机器和装置 |
241937 | 105501005 | Z99.8 | ALWAYS Z99.8 | Dependence on enabling machine or device (find... | 依赖其他可启动机器和装置 |
241938 | 105501005 | Z99.8 | ALWAYS Z99.8 | Dependence on enabling machine or device | 依赖其他可启动机器和装置 |
241939 | 60651000119103 | Z99.8 | ALWAYS Z99.8 | Dependence on continuous supplemental oxygen (... | 依赖其他可启动机器和装置 |
241940 | 60651000119103 | Z99.8 | ALWAYS Z99.8 | Dependence on continuous supplemental oxygen | 依赖其他可启动机器和装置 |
241941 rows × 5 columns
# Sanity check: every conceptId was parsed as an integer.
for line in list(icd2sct['conceptId']):
    assert type(line) == int
rela = pd.read_table('Download/SnomedCT_InternationalRF2_PRODUCTION_20190731T120000Z/Delta/Terminology/sct2_Relationship_Delta_INT_20190731.txt')
rela = rela[rela['active']==1]
client = pymongo.MongoClient(host='localhost', port=27017)
db = client.sym_info.sym_info
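# Pull the SNOMED CT ID out of each symptom's 'bmesh' field, skipping records without one.
# The captured IDs are strings, whereas icd2sct['conceptId'] holds ints.
sym_sctid_list = [re.findall(r'sctid-(\d+)', item['bmesh'])[0] for item in db.find() if len(re.findall(r'sctid-(\d+)', item['bmesh']))>0]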
set(icd2sct['conceptId']).intersection(set(sym_sctid_list))
set()
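When an intersection like the one above comes back empty, it may be worth checking whether the two sides hold values of the same type. The following is a minimal sketch with made-up sample values (`icd_concept_ids`, `bmesh_values`, and `extract_sctids` are hypothetical stand-ins, not the real data or the notebook's code). It shows that regex-captured IDs are strings and never compare equal to integer IDs unless they are cast first.

```python
import re

# Hypothetical sample values standing in for the real data.
icd_concept_ids = {240349003, 81020007}           # ints, like icd2sct['conceptId']
bmesh_values = ["sctid-81020007", "no id here"]   # strings, like item['bmesh']

def extract_sctids(values):
    """Return the first 'sctid-<digits>' capture of each value, as strings."""
    ids = []
    for value in values:
        match = re.search(r"sctid-(\d+)", value)
        if match:
            ids.append(match.group(1))
    return ids

sym_ids_str = extract_sctids(bmesh_values)

# str vs int: 81020007 != '81020007', so nothing overlaps.
overlap_str = icd_concept_ids & set(sym_ids_str)

# After casting to int, the shared ID is found.
overlap_int = icd_concept_ids & {int(s) for s in sym_ids_str}
```

Whether the real symptom data overlaps with the ICD mapping after such a cast is not something this sketch can tell; it only illustrates the comparison.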
icd_id_set = set(icd2sct['conceptId'])
sym_id_set = set(sym_sctid_list)
descri = pd.read_table('Download/SnomedCT_InternationalRF2_PRODUCTION_20190731T120000Z/Delta/Terminology/sct2_Description_Delta-en_INT_20190731.txt')
descri = descri[descri['active']==1]
# Preview: print matches until the first relationship whose source concept maps to ICD.
for row in rela.iterrows():
    sctids = tuple(row[1][['sourceId', 'destinationId']])
    if sctids[0] in icd_id_set and len(descri[descri['conceptId']==sctids[1]]) != 0:
        print('sctids[0] in icd_id_set')
        print(icd2sct[icd2sct['conceptId']==sctids[0]])
        print('-----------------------------')
        print(descri[descri['conceptId']==sctids[1]])
        print(' *****************************\n *****************************\n *****************************\n')
        break
    elif sctids[1] in icd_id_set and len(descri[descri['conceptId']==sctids[0]]) != 0:
        print('sctids[1] in icd_id_set')
        print(icd2sct[icd2sct['conceptId']==sctids[1]])
        print('-----------------------------')
        print(descri[descri['conceptId']==sctids[0]])
        print(' *****************************\n *****************************\n *****************************\n')
sctids[1] in icd_id_set
conceptId 亚目编码 mapAdvice \
195302 129135003 T11.3 ALWAYS T11.3 | POSSIBLE REQUIREMENT FOR AN EXT...
195303 129135003 T11.3 ALWAYS T11.3 | POSSIBLE REQUIREMENT FOR AN EXT...
term 亚目名称
195302 Injury of nerve of upper extremity 上肢未特指神经的损伤,水平未特指
195303 Injury of nerve of upper extremity (disorder) 上肢未特指神经的损伤,水平未特指
-----------------------------
id effectiveTime active moduleId conceptId \
12385 3767137017 20190731 1 900000000000207008 212295006
12386 3767138010 20190731 1 900000000000207008 212295006
languageCode typeId \
12385 en 900000000000003001
12386 en 900000000000013009
term caseSignificanceId
12385 Injury of intercostobrachial nerve (disorder) 900000000000448009
12386 Injury of intercostobrachial nerve 900000000000448009
*****************************
*****************************
*****************************
sctids[0] in icd_id_set
conceptId 亚目编码 mapAdvice term \
167541 232354002 R04.0 ALWAYS R04.0 Anterior epistaxis
167542 232354002 R04.0 ALWAYS R04.0 Epistaxis from Kiesselbach's plexus
167543 232354002 R04.0 ALWAYS R04.0 Epistaxis from anterior nasal septum
167544 232354002 R04.0 ALWAYS R04.0 Epistaxis from Little's area
167545 232354002 R04.0 ALWAYS R04.0 Anterior epistaxis (disorder)
亚目名称
167541 鼻出血
167542 鼻出血
167543 鼻出血
167544 鼻出血
167545 鼻出血
-----------------------------
id effectiveTime active moduleId conceptId \
17352 3773980012 20190731 1 900000000000207008 95433000
17353 3773981011 20190731 1 900000000000207008 95433000
languageCode typeId term \
17352 en 900000000000003001 Disorder of nasal septum (disorder)
17353 en 900000000000013009 Disease of nasal septum
caseSignificanceId
17352 900000000000448009
17353 900000000000448009
*****************************
*****************************
*****************************
data = []
for row in rela.iterrows():
    row_result = {}
    sctids = tuple(row[1][['sourceId', 'destinationId']])
    # Case 1: the source concept maps to ICD and the destination concept has a description.
    if sctids[0] in icd_id_set and len(descri[descri['conceptId']==sctids[1]]) != 0:
        icd_result = icd2sct[icd2sct['conceptId']==sctids[0]]
        row_result['icd_id'] = list(icd_result['亚目编码'])[0]
        row_result['icd_name'] = list(icd_result['亚目名称'])[0]
        row_result['icd_where'] = 'source'
        row_result['another_sctid'] = deepcopy(sctids[1])
        descri_result = descri[descri['conceptId']==sctids[1]]
        row_result['another_sct_term'] = list(descri_result['term'])[0]
        row_result['relation_type_id'] = row[1]['typeId']
        data.append(deepcopy(row_result))
    # Case 2: the destination concept maps to ICD and the source concept has a description.
    elif sctids[1] in icd_id_set and len(descri[descri['conceptId']==sctids[0]]) != 0:
        icd_result = icd2sct[icd2sct['conceptId']==sctids[1]]
        row_result['icd_id'] = list(icd_result['亚目编码'])[0]
        row_result['icd_name'] = list(icd_result['亚目名称'])[0]
        row_result['icd_where'] = 'destination'
        row_result['another_sctid'] = deepcopy(sctids[0])
        descri_result = descri[descri['conceptId']==sctids[0]]
        row_result['another_sct_term'] = list(descri_result['term'])[0]
        row_result['relation_type_id'] = row[1]['typeId']
        data.append(deepcopy(row_result))
data_df = pd.DataFrame(data)[['icd_id', 'icd_name', 'icd_where', 'another_sctid', 'another_sct_term', 'relation_type_id', ]]
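The row-by-row loop above filters `descri` and `icd2sct` once per relationship, which can be slow on large releases. As a rough sketch of an alternative, the same matching could be expressed with `DataFrame.merge`. Everything below runs on toy data: the frames, the `icd_id` column (standing in for the notebook's ICD code column), the helper `match`, and the `rel_row` column are all hypothetical names introduced for illustration, not the notebook's actual code.

```python
import pandas as pd

# Toy stand-ins for icd2sct, descri and rela.
icd = pd.DataFrame({"conceptId": [10, 10, 20],
                    "icd_id": ["A00.0", "A00.0", "B00.1"]})
descri = pd.DataFrame({"conceptId": [30, 30, 10, 20],
                       "term": ["Thirty (disorder)", "Thirty", "Ten", "Twenty"]})
rela = pd.DataFrame({"sourceId": [10, 30, 40, 10],
                     "destinationId": [30, 20, 50, 20],
                     "typeId": [116680003, 42752001, 116680003, 255234002]})

# One row per concept, mirroring the loop's "take the first hit" behaviour.
icd_first = icd.drop_duplicates("conceptId")
descri_first = descri.drop_duplicates("conceptId")
# Remember each relationship's original position so the two cases can be reconciled.
rela_rows = rela.reset_index().rename(columns={"index": "rel_row"})

def match(icd_col, other_col, where):
    """Relationships where `icd_col` maps to ICD and `other_col` has a description."""
    out = (rela_rows
           .merge(icd_first, left_on=icd_col, right_on="conceptId")
           .drop(columns="conceptId")
           .merge(descri_first, left_on=other_col, right_on="conceptId")
           .drop(columns="conceptId"))
    out["icd_where"] = where
    out["another_sctid"] = out[other_col]
    return out.rename(columns={"term": "another_sct_term",
                               "typeId": "relation_type_id"})

as_source = match("sourceId", "destinationId", "source")
as_dest = match("destinationId", "sourceId", "destination")
# The loop uses if/elif, so a row already matched as 'source' is not matched again.
as_dest = as_dest[~as_dest["rel_row"].isin(as_source["rel_row"])]

result = (pd.concat([as_source, as_dest])
          .sort_values("rel_row")
          .reset_index(drop=True))
```

On the toy data, the last relationship qualifies under both cases and is kept only once, as `source`, which matches the precedence of the `if`/`elif` in the loop.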
def caiyunQuery(q):
    # Translate a list of English strings to Chinese with the Caiyun translator API.
    url = "http://api.interpreter.caiyunai.com/v1/translator"
    token = "mql8jp2ciiq0mqjqbrmo"
    payload = {
        "source" : q,
        "trans_type" : "en2zh",
        "request_id" : "demo",
    }
    headers = {
        'content-type': "application/json",
        'x-authorization': "token " + token,
    }
    response = requests.request("POST", url, data=json.dumps(payload), headers=headers)
    #print(response.text[:2500])
    return json.loads(response.text)
caiyunResult = []
terms = list(data_df['another_sct_term'])
# Send the terms to the translator in batches of 50 per request.
for i in range(0, len(terms), 50):
    res = caiyunQuery(terms[i:i+50])
    caiyunResult += res['target']
assert len(caiyunResult) == len(terms)
data_df['another_sct_term_cn'] = caiyunResult
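The batched translation step can be separated from the network call so that it is testable offline. The sketch below is an illustration only: `batches`, `translate_all`, and `fake_translate` are hypothetical helpers, and `fake_translate` merely tags each term instead of calling any real service. With the notebook's function, the `translate` argument would presumably be something like `lambda chunk: caiyunQuery(chunk)['target']`.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def translate_all(terms, translate, size=50):
    """Translate `terms` batch by batch, checking the length of every response."""
    translated = []
    for chunk in batches(terms, size):
        result = translate(chunk)
        if len(result) != len(chunk):
            raise ValueError(
                "expected %d translations, got %d" % (len(chunk), len(result)))
        translated.extend(result)
    return translated

def fake_translate(chunk):
    """Offline stand-in for a translation service: tags each term."""
    return ["zh:" + term for term in chunk]

terms = ["term %d" % i for i in range(120)]
translated = translate_all(terms, fake_translate)
batch_sizes = [len(chunk) for chunk in batches(terms, 50)]
```

Checking the length per batch, rather than once at the end, points to the failing request directly if a response comes back short.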
data_df
index | icd_id | icd_name | icd_where | another_sctid | another_sct_term | relation_type_id | another_sct_term_cn |
---|---|---|---|---|---|---|---|
0 | T11.3 | 上肢未特指神经的损伤,水平未特指 | destination | 212295006 | Injury of intercostobrachial nerve (disorder) | 116680003 | 肋间神经损伤(失调) |
1 | R04.0 | 鼻出血 | source | 95433000 | Disorder of nasal septum (disorder) | 116680003 | 鼻中隔紊乱 |
2 | Q31.9 | 喉未特指的先天性畸形 | destination | 232461002 | Congenital fissure of larynx | 116680003 | 先天性喉裂 |
3 | K26.9 | 十二指肠溃疡未特指为急性或慢性,不伴有出血或穿孔 | source | 196652006 | Acute peptic ulcer of duodenum | 116680003 | 急性十二指肠消化性溃疡 |
4 | L98.9 | 皮肤和皮下组织未特指的疾患 | destination | 238561006 | Chronic vesicular eczema of foot | 116680003 | 足部慢性水疱性湿疹 |
5 | G71.0 | 肌营养不良 | source | 240046001 | Limb-girdle muscular dystrophy | 116680003 | 肢节型肌营养不良症 |
6 | S09.9 | 头部未特指的损伤 | source | 95433000 | Disorder of nasal septum (disorder) | 116680003 | 鼻中隔紊乱 |
7 | T94.1 | 损伤后遗症,未按身体部位特指者 | destination | 285236005 | Disorder due to and following injury of upper ... | 116680003 | 上肢损伤(障碍)所致或继发的功能障碍 |
8 | D10.1 | 舌良性肿瘤 | source | 91975001 | Benign neoplasm of body of tongue | 116680003 | 舌体良性肿瘤 |
9 | D38.5 | 其他呼吸器官动态未定或动态未知的肿瘤 | source | 95433000 | Disorder of nasal septum (disorder) | 116680003 | 鼻中隔紊乱 |
10 | E16.9 | 胰腺内分泌未特指的疾患 | destination | 126864006 | Neoplasm of endocrine pancreas (disorder) | 116680003 | 内分泌胰腺肿瘤(病变) |
11 | E14.8 | NaN | destination | 127014009 | Peripheral angiopathy due to diabetes mellitus... | 116680003 | 糖尿病性周围血管病变 |
12 | K27.7 | 部位未特指的消化性溃疡慢性,不伴有出血或穿孔 | destination | 128286008 | Chronic peptic ulcer of duodenum | 116680003 | 慢性十二指肠消化性溃疡 |
13 | T13.3 | 下肢未特指神经的损伤,水平未特指 | source | 73590005 | Injury of peripheral nerve (disorder) | 116680003 | 周围神经损伤(病变) |
14 | H44.8 | 眼球的其他疾患 | destination | 13937002 | Subretinal hemorrhage (disorder) | 116680003 | 视网膜下出血 |
15 | N48.1 | 龟头包皮炎\t | destination | 360380003 | Candidal balanoposthitis | 116680003 | 念珠菌性龟头皮炎 |
16 | B37.4 | 其他泌尿生殖系部位的念珠菌病 | destination | 360380003 | Candidal balanoposthitis | 116680003 | 念珠菌性龟头皮炎 |
17 | H40.5 | 继发于其他眼部疾患的青光眼 | destination | 15374009 | Secondary glaucoma due to aphakia (disorder) | 116680003 | 无晶状体眼继发性青光眼 |
18 | T94.1 | 损伤后遗症,未按身体部位特指者 | destination | 21835004 | Sequela of burn | 116680003 | 烧伤后遗症 |
19 | Q77.0 | 软骨成长不全 | source | 105986008 | Osteochondrodysplasia | 116680003 | 骨软骨发育不全 |
20 | J34.0 | 鼻的脓肿、疖和痈 | source | 95433000 | Disorder of nasal septum (disorder) | 116680003 | 鼻中隔紊乱 |
21 | H35.9 | 视网膜未特指的疾患 | destination | 28998008 | Retinal hemorrhage (disorder) | 116680003 | 视网膜出血(障碍) |
22 | H44.8 | 眼球的其他疾患 | destination | 28998008 | Retinal hemorrhage (disorder) | 116680003 | 视网膜出血(障碍) |
23 | S30.2 | 外生殖器挫伤 | source | 211495007 | Contusion of genital organ (disorder) | 116680003 | 生殖器挫伤(紊乱) |
24 | J34.0 | 鼻的脓肿、疖和痈 | source | 95433000 | Disorder of nasal septum (disorder) | 116680003 | 鼻中隔紊乱 |
25 | Q78.2 | 骨硬化症 | source | 105986008 | Osteochondrodysplasia | 116680003 | 骨软骨发育不全 |
26 | K27.9 | 部位未特指的消化性溃疡未特指为急性或慢性,不伴有出血或穿孔 | destination | 51868009 | Ulcer of duodenum (disorder) | 116680003 | 十二指肠溃疡 |
27 | Q04.9 | 脑未特指的先天性畸形 | destination | 55999004 | Cephalocele | 116680003 | 脑膨出 |
28 | L03.3 | 躯干蜂窝织炎 | source | 266579006 | Mastitis | 116680003 | 乳腺炎 |
29 | K27.7 | 部位未特指的消化性溃疡慢性,不伴有出血或穿孔 | destination | 95530000 | Chronic peptic ulcer of stomach | 116680003 | 慢性消化性溃疡 |
... | ... | ... | ... | ... | ... | ... | ... |
4575 | Q74.8 | 四肢的其他特指先天性畸形 | source | 785818007 | Structure of joint region | 363698007 | 节理区结构 |
4576 | B02.3 | 带状疱疹眼病 | source | 785832009 | Structure of ophthalmic nerve or left eye | 363698007 | 眼神经或左眼的结构 |
4577 | B02.3 | 带状疱疹眼病 | source | 785833004 | Structure of ophthalmic nerve of right eye (bo... | 363698007 | 右眼眼神经结构(身体结构) |
4578 | B02.3 | 带状疱疹眼病 | source | 38907003 | Chickenpox | 255234002 | 水痘 |
4579 | L90.5 | 皮肤瘢痕情况和纤维化 | destination | 335901000119105 | Cicatricial entropion of right eyelid (disorder) | 42752001 | 瘢痕性右眼睑内翻 |
4580 | M77.1 | 外上髁炎 | source | 783709000 | Structure of enthesis of left elbow region | 363698007 | 左肘区的结构 |
4581 | P07.3 | 其他早产婴儿 | destination | 343781000119104 | Retinopathy of prematurity of bilateral eyes s... | 42752001 | 双眼早产儿视网膜病变0 |
4582 | J63.4 | 铁沉着病 | source | 785340007 | Inhalation of substance | 42752001 | 吸入物质 |
4583 | J62.8 | 其他含硅[矽]粉尘引起的肺尘埃沉着病 | source | 785340007 | Inhalation of substance | 42752001 | 吸入物质 |
4584 | H18.4 | 角膜变性 | source | 785888000 | Structure of peripheral cornea of right eye (b... | 363698007 | 右眼外周角膜结构(体结构) |
4585 | H18.4 | 角膜变性 | source | 785887005 | Structure of peripheral cornea of left eye (bo... | 363698007 | 左眼外周角膜结构(体结构) |
4586 | T14.1 | 身体未特指部位的开放性伤口 | destination | 286613000 | Scar following wound (disorder) | 255234002 | 伤后瘢痕(紊乱) |
4587 | Y88.0 | 在治疗中使用药物、药剂和生物制品引起有害效应的后遗症\t | source | 275385007 | Poisoning caused by biological substance | 42752001 | 生物性物质中毒 |
4588 | Y88.0 | 在治疗中使用药物、药剂和生物制品引起有害效应的后遗症\t | source | 275385007 | Poisoning caused by biological substance | 255234002 | 生物性物质中毒 |
4589 | R29.8 | 累及神经和肌肉骨骼系统其他和未特指的症状和体征 | source | 786850008 | Structure of toe joint region (body structure) | 363698007 | 趾关节区结构(身体结构) |
4590 | T18.1 | 食管内异物 | destination | 217808004 | Respiratory obstruction due to foreign body in... | 42752001 | 食管异物致呼吸阻塞 |
4591 | T18.1 | 食管内异物 | destination | 217809007 | Respiratory compression due to foreign body in... | 42752001 | 食管异物引起的呼吸道压迫 |
4592 | I67.0 | 大脑动脉夹层形成,未破裂 | destination | 783707003 | Cerebral aneurysm due to dissection of cerebra... | 42752001 | 因脑动脉剥离引起的脑内的动脉瘤 |
4593 | L29.2 | 外阴瘙痒(症) | source | 45292006 | External female genital structure | 363698007 | 外部女性生殖器结构 |
4594 | T86.8 | 其他移植器官和组织的失败和排斥 | source | 785829006 | Structure of transplanted cornea of right eye | 363698007 | 右眼移植角膜的结构 |
4595 | R78.3 | 血中发现致幻剂 | source | 785673007 | Measurement of level of substance in blood (pr... | 363714003 | 血液中物质含量的测定(程序) |
4596 | R26.8 | 其他和未特指的异常的步态和移动 | source | 787040000 | Structure of finger joint region | 363698007 | 指关节区结构 |
4597 | R29.8 | 累及神经和肌肉骨骼系统其他和未特指的症状和体征 | source | 787040000 | Structure of finger joint region | 363698007 | 指关节区结构 |
4598 | R73.0 | 葡萄糖耐量试验异常 | source | 782964007 | Genetic disease | 47429007 | 遗传疾病 |
4599 | H33.0 | 视网膜脱离伴视网膜断裂 | source | 785884003 | Retinal tear of right eye (disorder) | 42752001 | 右眼视网膜裂孔(病变) |
4600 | H26.4 | 后发性白内障 | source | 766834007 | After-cataract | 116680003 | 后发性白内障 |
4601 | K57.2 | 大肠憩室病伴有穿孔和脓肿 | source | 14742008 | Structure of large intestine (body structure) | 363698007 | 大肠结构(身体结构) |
4602 | H26.4 | 后发性白内障 | source | 766834007 | After-cataract | 116680003 | 后发性白内障 |
4603 | H26.4 | 后发性白内障 | source | 766834007 | After-cataract | 116680003 | 后发性白内障 |
4604 | K57.1 | 小肠憩室病不伴有穿孔或脓肿 | source | 30315005 | Structure of small intestine (body structure) | 363698007 | 小肠结构(体结构) |
4605 rows × 7 columns
with open('snomed_relationship_type.txt') as fget:
    lines = [line.strip() for line in fget.readlines()]
# Each line is parsed as "<sctid> |<relationship name>|".
relationship_type = [re.search(pattern=r'(\d*)\ \|(.*)\|', string=line).groups() for line in lines]
relationship_type = {pair[0]: pair[1] for pair in relationship_type}
def look_type(l, d=relationship_type):
    # Map a sequence of relationship type IDs to their names.
    return [d[str(item)] for item in l]
data_df['relationship_type'] = look_type(data_df['relation_type_id'])
output = data_df[['icd_id','icd_name','another_sctid','another_sct_term_cn','another_sct_term','relationship_type','icd_where',]]
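To make the lookup step concrete, here is a small sketch of parsing lines of the form `<sctid> |<name>|` into a dictionary and resolving type IDs through it. The three sample lines are an assumption about what `snomed_relationship_type.txt` contains, inferred from the regular expression and from type IDs and names that appear elsewhere in this analysis. `look_up` is a hypothetical variant of `look_type`.

```python
import re

# Assumed sample content; the real file may differ.
sample_lines = [
    "116680003 |Is a (attribute)|",
    "42752001 |Due to|",
    "255234002 |After|",
]

pattern = re.compile(r"(\d+) \|(.*)\|")

relationship_type = {}
for line in sample_lines:
    match = pattern.search(line.strip())
    if match is None:
        continue  # skip lines that do not follow the expected shape
    sctid, name = match.groups()
    relationship_type[sctid] = name

def look_up(type_ids, table=relationship_type):
    """Map integer or string type IDs to relationship names."""
    return [table[str(type_id)] for type_id in type_ids]

names = look_up([42752001, 116680003, "255234002"])
```

The keys are kept as strings because they come out of the regular expression as strings, so `str(type_id)` is needed when looking up the integer IDs stored in the DataFrame.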
Most of these are classification or body-site relationships; 'Due to' and 'After' are potentially usable.
For details, see README.md in the Snomed_relationship_analysis directory.
Relationships
Delta/Terminology/sct2_Relationship_Delta_INT_20190731.txt in the Snomed CT Release
Snomed CT Term Description
Delta/Terminology/sct2_Description_Delta-en_INT_20190731.txt in the Snomed CT Release
Mapping between ICD-10 and Snomed CT
mapSnomedToIcd.txt in Mapping (submitted by Shannon)
Existing symptom data in the backend
Database dump file provided by @fushang318
See snomed_relationship.ipynb
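Relationships in which one of the two linked concepts has a corresponding ICD code and the other has a Description available in Snomed CT
4605 in total; see [Snomed CT 中疾病相关关系(仅包含有term可查的)](./Snomed CT 中疾病相关关系(仅包含有term可查的).tsv). The columns are described in the table below:
Column | Content |
---|---|
icd_id | ICD code (four-character subcategory) |
icd_name | ICD subcategory name |
another_sctid | sctid of the other concept |
another_sct_term_cn | Name of the other concept (translated into Chinese with Caiyun) |
another_sct_term | Name of the other concept |
relationship_type | Relationship type |
icd_where | Whether the ICD-coded concept is the source or the destination of the relationship (source/destination) |
Counts per relationship type: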
type | count |
---|---|
Interprets | 114 |
Associated with | 14 |
Associated morphology | 309 |
Associated procedure | 14 |
Has focus | 13 |
Causative agent | 87 |
Associated finding | 51 |
Due to | 214 |
After | 76 |
Is a (attribute) | 3047 |
Finding site | 666 |
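A per-type tally like the one above can be produced with `Series.value_counts`. The snippet below is a minimal sketch on a toy frame; `toy_output` is a hypothetical stand-in for the `output` DataFrame, and its rows are invented.

```python
import pandas as pd

# Toy stand-in for the notebook's `output` DataFrame.
toy_output = pd.DataFrame({
    "relationship_type": ["Due to", "After", "Due to", "Is a (attribute)"],
})

# Count rows per relationship type and shape the result as a two-column table.
type_counts = (toy_output["relationship_type"]
               .value_counts()
               .rename_axis("type")
               .reset_index(name="count"))

counts_by_type = dict(zip(type_counts["type"], type_counts["count"]))
```

As a consistency check, the counts should add up to the number of rows in the frame.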
Relationships in which both concepts have a corresponding ICD code
7132 in total; see [Snomed CT 中 ICD_terms 相关关系](./Snomed CT 中 ICD_terms 相关关系.tsv). Counts per relationship type:
type | count |
---|---|
Associated with | 243 |
After | 231 |
Due to | 981 |
Is a (attribute) | 5633 |
Associated finding | 44 |
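When I started this analysis, the plan was to look for relationships between diseases and symptoms, hoping to find the symptoms associated with each disease so that asking about and recording diseases could be automated. However, the Snomed relationship data turned out to contain no relationships of the form [disease covered by ICD <-> symptom already in our database]. I then looked for relationships of the form [disease covered by ICD <-> disease covered by ICD] and found quite a few. Among them, 'Associated with', 'After', 'Due to', and 'Associated finding' all have potential practical value (for example, to begin with they could all be treated as complications). Some example relationships: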
source_icd_id | source_icd_name | destination_icd_id | destination_icd_name | relationship_type | relationship_id |
---|---|---|---|---|---|
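G12.2 | Motor neuron disease | D48.9 | Neoplasm of uncertain or unknown behaviour, unspecified | Due to | 2571445020 |
G25.3 | Myoclonus | G93.1 | Anoxic brain damage, not elsewhere classified | Due to | 2571452022 |
E53.9 | Vitamin B deficiency, unspecified | E56.9 | Vitamin deficiency, unspecified | Due to | 2571513029 |
H53.0 | Amblyopia ex anopsia | H52.3 | Anisometropia and aniseikonia | Due to | 2572883025 |
Q85.1 | Tuberous sclerosis | Q85.9 | Phakomatosis, unspecified | Due to | 2573623021 |
D51.1 | Vitamin B12 deficiency anaemia due to selective vitamin B12 malabsorption with proteinuria | E53.8 | Deficiency of other specified B group vitamins | Due to | 2573993029 |
E85.4 | Organ-limited amyloidosis | E85.9 | Amyloidosis, unspecified | Due to | 2574729020 |
E01.1 | Iodine-deficiency-related multinodular (endemic) goitre | E63.9 | Nutritional deficiency, unspecified | Due to | 2575179022 |
I87.8 | Other specified disorders of veins | I87.2 | Venous insufficiency (chronic) (peripheral) | Due to | 2575378024 |
D53.8 | Other specified nutritional anaemias | E61.0 | Copper deficiency | Due to | 2576479020 |
D55.9 | Anaemia due to enzyme disorder, unspecified | E88.9 | Metabolic disorder, unspecified | Due to | 2576480023 |
D55.8 | Other anaemias due to enzyme disorders | E88.9 | Metabolic disorder, unspecified | Due to | 2576481022 |
D55.1 | Anaemia due to other disorders of glutathione metabolism | E88.0 | Disorders of plasma-protein metabolism, not elsewhere classified | Due to | 2576482026 |
E01.2 | Iodine-deficiency-related (endemic) goitre, unspecified | E63.9 | Nutritional deficiency, unspecified | Due to | 2580207023 |
I83.1 | Varicose veins of lower extremities with inflammation | I87.8 | Other specified disorders of veins | Due to | 2580326028 |
B90.9 | Sequelae of respiratory and unspecified tuberculosis | A16.9 | Respiratory tuberculosis unspecified, without mention of bacteriological or histological confirmation | After | 2568360028 |
B90.0 | Sequelae of central nervous system tuberculosis | A16.9 | Respiratory tuberculosis unspecified, without mention of bacteriological or histological confirmation | After | 2568361029 |
B90.1 | Sequelae of genitourinary tuberculosis | A16.9 | Respiratory tuberculosis unspecified, without mention of bacteriological or histological confirmation | After | 2568362020 |
B90.2 | Sequelae of tuberculosis of bones and joints | A16.9 | Respiratory tuberculosis unspecified, without mention of bacteriological or histological confirmation | After | 2568363026 |
B94.2 | Sequelae of viral hepatitis | B19.9 | Unspecified viral hepatitis without hepatic coma | After | 2568365022 |
E64.9 | Sequelae of unspecified nutritional deficiency | E63.9 | Nutritional deficiency, unspecified | After | 2568495029 |
E64.1 | Sequelae of vitamin A deficiency | E50.9 | Vitamin A deficiency, unspecified | After | 2568497021 |
I01.2 | Acute rheumatic myocarditis | A49.1 | Streptococcal infection, unspecified site | After | 2568652020 |
I09.0 | Rheumatic myocarditis | A49.1 | Streptococcal infection, unspecified site | After | 2568653026 |
I69.8 | Sequelae of other and unspecified cerebrovascular diseases | I67.9 | Cerebrovascular disease, unspecified | After | 2568661020 |
I69.0 | Sequelae of subarachnoid haemorrhage | I60.9 | Subarachnoid haemorrhage, unspecified | After | 2568662029 |
T90.0 | Sequelae of superficial injury of head | S00.9 | Superficial injury of head, part unspecified | After | 2569509022 |
T92.0 | Sequelae of open wound of upper limb | T11.1 | Open wound of upper limb, level unspecified | After | 2569512020 |
T92.5 | Sequelae of injury of muscle and tendon of upper limb | T11.9 | Unspecified injury of upper limb, level unspecified | After | 2569515022 |
T92.6 | Sequelae of crushing injury and traumatic amputation of upper limb | T11.9 | Unspecified injury of upper limb, level unspecified | After | 2569516023 |
T93.9 | Sequelae of unspecified injury of lower limb | T13.9 | Unspecified injury of lower limb, level unspecified | After | 2569517025 |
T93.0 | Sequelae of open wound of lower limb | T13.1 | Open wound of lower limb, level unspecified | After | 2569518024 |
T93.5 | Sequelae of injury of muscle and tendon of lower limb | T13.9 | Unspecified injury of lower limb, level unspecified | After | 2569520022 |
T93.6 | Sequelae of crushing injury and traumatic amputation of lower limb | T13.9 | Unspecified injury of lower limb, level unspecified | After | 2569521021 |
K08.8 | Other specified disorders of teeth and supporting structures | K00.9 | Disorder of tooth development, unspecified | After | 2570753021 |
O35.0 | Maternal care for (suspected) central nervous system malformation in fetus | Q03.9 | Congenital hydrocephalus, unspecified | Associated finding | 3419499021 |
O35.0 | Maternal care for (suspected) central nervous system malformation in fetus | Q05.9 | Spina bifida, unspecified | Associated finding | 3437473025 |
Z80.4 | Family history of malignant neoplasm of genital organs | D40.9 | Neoplasm of uncertain or unknown behaviour of male genital organ, unspecified | Associated finding | 4739112028 |
Z85.4 | Personal history of malignant neoplasm of genital organs | C63.9 | Malignant neoplasm of male genital organ, unspecified | Associated finding | 4742430027 |
Z92.4 | Personal history of major surgery, not elsewhere classified | Q24.9 | Congenital malformation of heart, unspecified | Associated finding | 4776861021 |
Z87.5 | Personal history of complications of pregnancy, childbirth and the puerperium | P08.0 | Exceptionally large baby | Associated finding | 6490664023 |
Z86.6 | Personal history of diseases of the nervous system and sense organs | H50.9 | Strabismus, unspecified | Associated finding | 11225968026 |
Z86.2 | Personal history of diseases of the blood and blood-forming organs and certain disorders involving the immune mechanism | D69.6 | Thrombocytopenia, unspecified | Associated finding | 11226121026 |
Z86.0 | Personal history of other neoplasms | D36.1 | Benign neoplasm of peripheral nerves and autonomic nervous system | Associated finding | 11418430021 |
Z86.6 | Personal history of diseases of the nervous system and sense organs | H47.0 | Disorders of optic nerve, not elsewhere classified | Associated finding | 11433832029 |
Z84.8 | Family history of other specified conditions | O75.2 | Pyrexia during labour, not elsewhere classified | Associated finding | 11438634029 |
Z86.6 | Personal history of diseases of the nervous system and sense organs | H43.8 | Other disorders of vitreous body | Associated finding | 11438719027 |
Z86.7 | Personal history of diseases of the circulatory system | I67.1 | Cerebral aneurysm, nonruptured | Associated finding | 11533854020 |
Z86.7 | Personal history of diseases of the circulatory system | I62.9 | Intracranial haemorrhage (nontraumatic), unspecified | Associated finding | 11533865026 |
Z86.7 | Personal history of diseases of the circulatory system | I74.9 | Embolism and thrombosis of unspecified artery | Associated finding | 11533877020 |
Z82.4 | Family history of ischaemic heart disease and other diseases of the circulatory system | I63.9 | Cerebral infarction, unspecified | Associated finding | 11533896024 |
Z87.1 | Personal history of diseases of the digestive system | K22.2 | Oesophageal obstruction | Associated finding | 11533904029 |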
In addition, if we later adopt a procedure coding scheme derived from ICD9-CM3, 'Associated procedure' would also be a direction worth exploring.
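The code that produced the ICD-to-ICD relationship list is not shown in this issue. One way it might be done, sketched here on toy data, is to join the relationship table against the ICD mapping twice, once on the source concept and once on the destination concept, and then keep the relationship types of interest. The frames, the `icd_id` column, the `type_names` dictionary, and the concept IDs 1, 2, 3, 9 are all hypothetical.

```python
import pandas as pd

# Toy stand-ins: an ICD mapping and a relationship table.
icd = pd.DataFrame({"conceptId": [1, 2, 3],
                    "icd_id": ["B90.9", "A16.9", "E63.9"]})
rela = pd.DataFrame({"id": [100, 101, 102],
                     "sourceId": [1, 3, 1],
                     "destinationId": [2, 9, 3],
                     "typeId": [255234002, 42752001, 116680003]})
type_names = {255234002: "After", 42752001: "Due to",
              116680003: "Is a (attribute)"}

# One ICD code per concept, renamed for each side of the relationship.
icd_first = icd.drop_duplicates("conceptId")
src = icd_first.rename(columns={"conceptId": "sourceId",
                                "icd_id": "source_icd_id"})
dst = icd_first.rename(columns={"conceptId": "destinationId",
                                "icd_id": "destination_icd_id"})

# Inner joins keep only relationships whose two ends both map to ICD.
both = (rela.merge(src, on="sourceId")
            .merge(dst, on="destinationId")
            .rename(columns={"id": "relationship_id"})
            .sort_values("relationship_id")
            .reset_index(drop=True))
both["relationship_type"] = both["typeId"].map(type_names)

# Keep the relationship types singled out as potentially useful.
useful_types = ["Associated with", "After", "Due to", "Associated finding"]
candidates = both[both["relationship_type"].isin(useful_types)]
```

In the toy data, relationship 101 is dropped because its destination has no ICD code, and relationship 102 survives the join but is filtered out as an 'Is a' relationship.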
development request deadline: 2020-05-20 size:2 @bqx619
snomed_relationship_type.txt