wanghaisheng / OHDSI-Research

对OHDSI的研究
12 stars 0 forks source link

官方wiki—CDM数据模型 #4

Open wanghaisheng opened 9 years ago

wanghaisheng commented 9 years ago

OMOP上的常见查询示例 普通表的 http://cdmqueries.omop.org 针对术语的 http://vocabqueries.omop.org/home

wanghaisheng commented 9 years ago

OpenFurther

Common Data Model

License
Background
The Role of the Common Data Model
    Design Principles
    Details of the Model
    Data Model Conventions
    Changes from Version 4.0
    Glossary of Terms
Standardized Vocabularies
    CONCEPT
    VOCABULARY
    DOMAIN
    CONCEPT_CLASS
    CONCEPT_RELATIONSHIP
    RELATIONSHIP
    CONCEPT_SYNONYM
    CONCEPT_ANCESTOR
    SOURCE_TO_CONCEPT_MAP
    DRUG_STRENGTH
    COHORT_DEFINITION
    ATTRIBUTE_DEFINITION 2015-03-06日进行至此
Standardized meta-data
    CDM_SOURCE
Standardized Clinical Data Tables
    PERSON
    OBSERVATION_PERIOD
    SPECIMEN
    DEATH
    VISIT_OCCURRENCE
    PROCEDURE_OCCURRENCE
    DRUG_EXPOSURE
    DEVICE_EXPOSURE
    CONDITION_OCCURRENCE
    MEASUREMENT
    NOTE
    OBSERVATION
    FACT_RELATIONSHIP
Standardized Health System Data Tables
    LOCATION
    CARE_SITE
    PROVIDER
Standardized Health Economics Data Tables
    PAYER_PLAN_PERIOD
    VISIT_COST
    PROCEDURE_COST
    DRUG_COST
    DEVICE_COST
Standardized Derived Elements
    COHORT
    COHORT_ATTRIBUTE
    DRUG_ERA
    DOSE_ERA
    CONDITION_ERA
wanghaisheng commented 9 years ago

数据模型V5 使用的术语说明

术语Abbr.描述
Ancestor 层级关系中位于高层的概念 需要注意的是ancestors and descendants 之间可能会隔了好多层.
Average Wholesale PriceAWP药房和医务人员整体采购时生产厂家为处方药定的价格.
Centers for Disease Control and PreventionCDC美国的疾控中心,主管公共卫生和安全相关.
Common Data ModelCDMCDM数据模型旨在简化异构数据库的观察性研究 定义了各种实体的数据结构 (e.g., Persons, Visit Occurrence, Drug Exposure, Condition Occurrence, Observation, Procedure Occurrence, etc.).包括与药物治疗和症状出现相关的数据项,既有标化字典,Standardized Vocabularies of terms 也有 entity domain tables.
Concept 概念也就是最小的信息单元。每个概念可能被划分到某个领域内,每个概念在每个领域内是唯一的,有固定 的标识和名称。 vertical relationships consist of “is a” statements that form a logical hierarchy. 一般而言 concepts above a given concept are referred to as ancestors and those below as descendants.
Conceptual Data Model A conceptual data model is a map of concepts and their relationships. This describes the semantics of an organization and represents a series of assertions about its nature. Specifically, it describes the things of significance to an organization (entity classes), about which it is inclined to collect information, and characteristics of (attributes) and associations between pairs of those things of significance (relationships).
Data mapping 不同数据模型、术语、概念之间数据项的映射 过程,它是数据集成的第一步
Demographics 人口学信息,包含了性别、民族、出生日期、居住地址等
Descendant 层级关系中的低级概念 ancestors and descendants 可能相隔多层.
Design Principle It identifies core principles and best practices to assist developers to produce software. Thoroughly understanding the goals of stakeholders and designing systems with those goals in mind are the best approaches to successfully deliver results.
Electronic Health RecordEHR电子健康档案 It may be made up of electronic medical records from many locations and/or sources. The EHR is a longitudinal electronic record of person health information generated by one or more encounters in any care delivery setting. Included in this information are person demographics, progress notes, problems, medications, vital signs, past medical history, immunizations, laboratory data and radiology reports.
Electronic Medical RecordEMRAn electronic medical record is a computerized medical record created in an organization that delivers care, such as a hospital or outpatient setting. Electronic medical records tend to be a part of a local stand-alone health information system that allows storage, retrieval and manipulation of records. This document will reference EHR moving forward even if specific data source might internally use EMR definition.
Extract Transform LoadETLProcess of getting data out of one data store (Extract), modifying it (Transform), and inserting it into a different data store (Load).
Health Insurance Portability and Accountability ActHIPAAA federal law that was designed to allow portability of health insurance between jobs. In addition, it required the creation of a federal law to protect personally identifiable health information; if that did not occur by a specific date (which it did not), HIPAA directed the Department of Health and Human Services (DHHS) to issue federal regulations with the same purpose. DHHS has issued HIPAA privacy regulations (the HIPAA Privacy Rule) as well as other regulations under HIPAA.
Logical Data Model Logical data models are graphical representation of the business requirements. They describe the things of importance to an organization and how they relate to one another, as well as business definitions and examples. The logical data model can be validated and approved by a business representative, and can be the basis of physical database design.
Primary Care ProviderPCP全科医生A health care provider designated as responsible to provide general medical care to a patient, including evaluation and treatment as well as referral to specialists.
Protected Health InformationPHIProtected health information under HIPAA includes any individually identifiable health information. Identifiable refers not only to data that is explicitly linked to a particular individual (that's identified information). It also includes health information with data items which reasonably could be expected to allow individual identification. De-identified information is that from which all potentially identifying information has been removed.
Terminology Technical or special terms used in a business or special subject area.
Vocabulary A computerized list (as of items of data or words) used for reference (as for information retrieval or word processing).
wanghaisheng commented 9 years ago

OMOP Common Data Model V5.0

授权协议

V5.0是基于Observational Medical Outcomes Partnership (OMOP)的,它的授权协议在此疑似已废弃 所有从OMOP CDM v4衍生出来的内容遵循CC0, Observational Health Data Sciences and Informatics (OHDSI)放弃了所有著作权和相关权利。具体请参考说明

背景介绍

OMOP观察性临床疗效联盟是一个公私合作的联盟,旨在利用观察性医疗数据库来研究医疗用品的效果。在5年的时间里通过社区来自业界、政府、学术界的的研究人员积极参与实现了如下目标:

在各种刊物上发表了多篇论文,在学术会议上进行了演讲,特别是http://omop.org/2013Symposium

整个社区利用OMOP的数据模型进行着各种各样的科学研究,这些工具的维护将一直进行下去,所产出的成果也将对大众开放。

OMOP研究室,旨在研究方法论,已经被转移到Reagan-Udall Foundation for the FDA 的 Innovation in Medical Evidence Development and Surveillance (IMEDS) Program,项目下面,改名为IMEDS Lab

OHDSI是一个跨机构、跨学科的合作组织,旨在开发能够通过大规模数据分析挖掘出观察性医疗数据价值的开源解决方案。OHDSI吸收了之前OMOP的研究人员,在OMOP数据模型的基础上进一步开发一些工具。

OMOP数据模型仍然会是开源的观察性医疗数据的社区标准。模型标准和相关的工作成果遵循public domain协议,鼓励研究人员使用这些工具来完成自己的科研项目。

统一的数据模型存在的价值

没有一个单一的观察性医疗数据源能够提供某个病人接受治疗的全景数据,因此,不足以满足我们想要完成的疗效分析的需求。在多个数据源之间并发的进行评估和分析就 需要一个统一的数据模型-OMOP CDM。

CDM设计之初,是为了满足对医疗干预(药物、手术、政策变化)与疗效(出现某种病情、手术、使用了某种药物)之间关联关系的甄别和评估的科研需要。疗效可以是有益或有害的。通常情况下,需要利用临床事件(诊断、观察、手术等)给治疗手段或疗效定义一些特殊的病人队列。 利用CDM和标准化术语能够保证系统地应用研究方法来得到有意义的可对比可重复的科研结果。

设计的原则

CDM中要包含所有与分析场景相关的观察性医疗数据项(包括患者就医时的体验)来得到有关疾病演变史、医疗服务、干预的效果和甄别人口统计学信息、干预手段与疗效关系的可信的科学/临床证据。

为了保存这样的数据,遵循以下设计原则:

数据模型介绍

CDM中总共有39张表,除了标准化医疗系统中的数据表/ standardized health system data tables(地点、医务人员、科室)(该表直接与不同领域的事件相关联)之外,每张表至少都有一个外键指向Person表,有一个日期字段。

表名说明
标准化术语/标化字典
CONCEPT该表包含从其他字典中衍生得到的能够准确的确定表达医疗信息的基本元素含义的所有记录。这些来自不同字典的术语,能够利用原始字典中的编码和描述信息表达不同领域的医学信息,诸如病情、症状、药品、手术。其中一些概念称之为标化概念,也就是说可以在OMOP数据模型和分析过程中来表达这些概念,每个标化概念都有一个主要的领域,领域确定了这个概念在OMOP数据模型中最可能出现的位置
VOCABULARY该表包含了从各处收集得到的字典,或者是由OMOP社区所自主创建的字典。每个字典对应着一条记录,每条记录中包含了名称和其他字典相关的属性.
DOMAIN该表包含了OMOP所定义的所有领域,每个标化术语一定从属于其中某一个领域。领域为每个标化字段定义了允许使用的概念集合。每个领域对应着一条记录,每条记录都有一个字段叫名称.
CONCEPT_CLASS该表包含了用于在某个字典中区分不同概念的所有分类,概念类型是由原始字典所定义的,可能会根据OMOP CDM表结构的约束稍作调整。每个概念类型对应着一条记录,每条记录都有一个字段叫名称.
CONCEPT_RELATIONSHIP该表包含了任意两个概念之间直接关系的所有记录,关系的类型在Relationship表中.
RELATIONSHIP该表包含了可以用在 CONCEPT_RELATIONSHIP表中关联任意两个概念的可使用的关系类型.
CONCEPT_SYNONYM该表用于存储同一个概念的不同的名称或者描述信息.
CONCEPT_ANCESTOR该表记录的是所有标化概念之间的推理用的层次关系。该表衍生自CONCEPT表,CONCEPT_RELATIONSHIP表和RELATIONSHIP表,旨在通过概念间完整的层级关系来简化分析。.
SOURCE_TO_CONCEPT_MAP该表可以在ETL过程中维护本地原始编码和概念的映射关系.
DRUG_STRENGTH该表包含了某种药品内某个成分的量或者浓度以及单位 ,用在药物使用的标准化分析过程中.
COHORT_DEFINITION该表包含了通过描述信息和特定于法定义队列的所有记录,也可以保存从OMOP数据模型直接生成队列的代码片段.
ATTRIBUTE_DEFINITION包含了定义属性的记录,每个属性是通过描述信息和特定语法来定义的。这里说的属性,也就是在一个队列中对象可以选择的或者计算的属性、字段.
标化元数据
CDM_SOURCE该表包含了原始数据库的详细信息,以及将数据转换成OMOP数据模型的过程。如果原始数据库是从多个数据来源得到的,需要ETL文档中记录这些异构数据源的集成过程。
标化医疗数据
PERSON该表记录了原始系统中 uniquely identify each patient in the source data who has time at-risk to have clinical events recorded within the source systems.
OBSERVATION_PERIOD该表中记录了某个人可能会产生一些记录在原始系统中的医疗事件的时间区间,即使是原生系统中并没有记录任何数据的情况(身体健康的人).
SPECIMEN该表记录了从个体身上采集的生物样本.
DEATH该表记录了个体何时以及如何死亡的医疗事件。如果原始系统中存在死亡相关数据,每个个体最多有一条记录。
VISIT_OCCURRENCE该表包含了某个个体连续不断的在某个医疗机构从一到多个医务人员接受医疗服务的时间区间。 就诊可以分为四大类:门诊、住院、急诊和长期护理。在一段诊疗过程中患者可能会在不同的科室间/医院转诊.
PROCEDURE_OCCURRENCE医务人员为了治疗、诊断患者所下的手术操作的医嘱,所执行的手术操作.
DRUG_EXPOSURE该表包含了所有摄入到人体内期望达到治疗效果的物质或其他引入到人体内的物品的使用记录。药品的花包括了处方药和非处方药、疫苗和大分子生物疗法.
DEVICE_EXPOSURE该表记录了个体使用外部的物体、设备的信息,以非生化反应的机制,用于辅助诊断或治疗。此类设备包括植入设备(支架、心脏起搏器、人工关节)、耐用的医疗设备和材料如(绷带、拐杖、注射器) 和其他设备 (缝线和除颤器).
CONDITION_OCCURRENCE记录患者出现了某种病情、疾病(某个诊断、症状、体征)的记录,可以是医生记录的,也可以是患者陈述的.
MEASUREMENT该表记录了所有测量值,诸如对某个个体或其样本进行系统的标准化检查检验过程中得到的结构化值(数值型或分类型) 该表既有实验室检验、生命体征和病理学报告中的定量数据的结果,也有各种医嘱
NOTE记录了医务人员或患者某天所记录的非结构化信息.
OBSERVATION该表记录了在检查、询问或者手术操作过程中记录的有关患者的医学事实。任意的数据都不属于其他任何领域,诸如社交和生活方式、医疗史、家族史等信息.
FACT_RELATIONSHIP记录了任意一个CDM的表中的记录中的事实间的关系。可以是同一个领域内不同事实间的关系,也可以是不同领域之间的。比如,患者之间的关系-父子、科室之间的关系-医疗机构内的层次化的组织架构、适应症的关系-药物与相关症状、用途之间的关系-某个手术操作中会用到的设备、从某个事实中衍生出的事实-从一个样本中得到的测量值.
标化的医疗系统数据
LOCATION
CARE_SITE包含了医疗服务实际上所发生的场所,物理上或者组织上。如科室、病区、医院、诊所等 .
PROVIDER唯一标识的医务人员的列表,包括医师、护士、理疗师、助产士.
标化的卫生经济学数据
PAYER_PLAN_PERIOD该表记录了某段时间内个体及其家庭人员连续不断的从某个保险商处的某个险种下登记的信息.
VISIT_COST该表记录的是患者的就诊费用,但没有具体到手术、药品、设备的明细信息.
PROCEDURE_COST该表记录的是某个个体是手术费用信息。The information about the cost is only derived from the amount of money paid for the Procedure.
DRUG_COST该表记录的是某个个体是药物费用信息. The information about the cost is defined by the amount of money paid by the person and payer for the drug, as well as the charged cost of the drug.
DEVICE_COST该表记录的是某个个体所使用的设备或材料的费用信息. The information about the cost is only derived from the amount of money paid for the device.
标化的衍生数据
COHORT包含了满足COHORT_DEFINITION表中的时间区间内某些准入条件的患者集合。队列可以由患者/个体、医务人员、就诊过程构成.
COHORT_ATTRIBUTE该表记录的是与队列中的每个对象相关的属性,队列属性的定义位于ATTRIBUTE_DEFINITION表.
DRUG_ERA假设个体暴露在某个药品成分的时间区间,比如在某些情况下药物使用的多个疗程就形成了一个连续的drug-era. successive periods of Drug Exposures combined under certain rules to produce continuous Drug Eras.
DOSE_ERA假设个体暴露在固定剂量的某个药品成分的时间区间 .
CONDITION_ERA假设个体出现某种病情的时间区间.
wanghaisheng commented 9 years ago

数据模型的约定

在CDM中采用了大量的隐式显式的约定做法。开发人员需要理解这样的一些约定。

数据模型的一般约定

CDM是与平台无关的,数据类型是通过ANSI SQL数据类型(VARCHAR, INTEGER, FLOAT, DATE, TIME, CLOB)来定义的。只有VARCHAR类型才指定了精度。这个精度只是说字符串的最小长度,在CDM中具体实现中可以对其进行扩展。CDM中对日期时间格式并不约束。对于CDM的标准化查询可能因CDM的具体实现、日期时间的配置各异。

大多数情况下,每个表中都有一个 “_id”字段, 可以在其他表中用作外键。

字段的一般约定

不同表中的不同字段名称遵循以下规则:

标记描述
<entity>_SOURCE_VALUE原始数据中的原始信息,旨在ETL过程中与CONCEPT_ID的对应。Verbatim information from the source data, typically used in ETL to map to CONCEPT_ID, and not to be used by any standard analytics. 例如 理赔数据中诊断编码是ICD-9,condition_source_value = ‘787.02’
<entity>_ID实体的唯一标识符,用作外键,构成不同实体间的关系。比如person_id 能够唯一的标识每个个体, visit_occurrence_id 能够唯一的标识某个人在某事某地的就诊活动.
<entity>_CONCEPT_ID标准化字典、术语的外键(如果是标准化概念的话,standard_concept字段值为true),主要用作标准化分析,例如,SNOMED中的概念‘Nausea’ 恶心,它的condition_concept_id为31967。Foreign key into the Standardized Vocabularies (i.e. the standard_concept attribute for the corresponding term is true), which serves as the primary basis for all standardized analytics For example, condition_concept_id = 31967 contains reference value for SNOMED concept of ‘Nausea’
<entity>_SOURCE_CONCEPT_ID原始数据中用到的概念和术语,标准化字典的外键,比如,在MedDRA字典中,表示恶心概念的condition_source_concept_id值为35708202,而 在SNOMED-CT(大多数临床所见和诊断的标准化术语/字典) 中condition_concept_id为31967。 Foreign key into the Standardized Vocabularies representing the concept and terminology used in the source data, when applicable For example, condition_source_concept_id = 35708202 denotes the concept of ‘Nausea’ in the MedDRA terminology; the analogous condition_concept_id might be 31967, since SNOMED-CT is the Standardized Vocabularies for most clinical diagnoses and findings.
<entity>_TYPE_CONCEPT_ID描述了在标准化字典中原始信息的来源,比如,研究人员可以利用drug_type_concept_id来区分 ‘Pharmacy dispensing’ and ‘Prescription written’(处方药和非处方药??) Delineates the origin of the source information, standardized within the Standardized Vocabularies For example, drug_type_concept_id can allow analysts to discriminate between ‘Pharmacy dispensing’ and ‘Prescription written’
wanghaisheng commented 9 years ago

使用概念Concepts来表达内容

在CDM数据表中,每条记录的内容的含义是用Concepts一个个概念来表达的。每个概念都是用一个_CONCEPT_ID 存储的,作为指向标准化字典的外键,也就是CONCEPT表中的 concept_id字段。如果不存在标准化概念或者是确定不了的情况,这个Concept的_CONCEPT_ID值为0,表示不存在该概念或者是对应不上。 the meaning of the content of each record is represented using Concepts. Concepts are stored with their concept_id as foreign keys to the CONCEPT table in the Standardized Vocabularies, which contains Concepts necessary to describe the healthcare experience of a patient. If a Standard Concept does not exist or cannot be identified, the Concept with the concept_id 0 is used, representing a non-existing or unmappable concept.

CONCEPT中的每条记录都记录了名称、关系、类型等详细信息,Concepts, Concept Relationships和其他信息都记录在标准化术语/标化字典类型的表中。 Records in the CONCEPT table contain all the detailed information about it (name, relationships, types etc.). Concepts, Concept Relationships and other information relating to Concepts contained in the tables of the Standardized Vocabularies..

Concept IDs 与 Source Values之间的区别

很多表中同样的信息都会重复出现:比如 Source Value, a Source Concept and as a Standard Concept.

只是出于方便和QA的目的才提供Source Values , Source Values 和 Source Concepts都是可选的,但 Standard Concepts是必须要有的. Source Values中包含的信息可能只是在某个数据源那里才会有意义。

general Concepts 通用概念与某个类型的概念之间的区别 Type Concepts

很多表中都出现了Type Concepts (ending in _type_concept_id) and general Concepts (ending in _concept_id) . 前者指的是这些数据是从哪个数据源拿到的特殊概念。比如可以使用Type Concept 的字段来区分一条DRUG_EXPOSURE 记录到底是 pharmacy-dispensing claim还是从ielectronic health record (EHR)中的处方(按照现在我们国内的体制,医药未分离,可直接根据药房是否从属于医院、医疗机构就可以进行判断).

可用数据的时间区间Time span of available data

存放医疗数据的数据表包括了一个日期戳(ending in _date, _start_date or _end_date), 表示医疗事件发生人日期时间,因此,每条记录都应该属于一个有效的 OBSERVATION_PERIOD 时间段内. 与前一个OBSERVATION_PERIOD相关的医疗信息,会记录在OBSERVATION表中, 'Medical history' (concept_id = 43054928), observation_date为前一个 observation_period_start_date of that patient, and the value_as_concept_id 则为对应的concept_id,要么是 condition/drug/procedure that occurred in the past. 任何发生在上一个 observation_period_end_date之后的记录都不是有效记录。

每张表的内容

CDM中大多数的表都只限于记录某个领域的概念。比如 CONDITION_OCCURRENCE 表只存储与病情、症状、诊断、体征相关的数据,不记录手术相关信息.但并非所有source coding schemes都遵循这样的规则。比如说 ICD-9-CM 编码, 其中包含了大量的疾病诊断编码,也包含了患者是否接受手术操作状态的编码。 V25.5 “Encounter for insertion of implantable subdermal contraceptive” 是一个手术操作,因此保存在 PROCEDURE_OCCURRENCE 表中.

source values, source concept ids, 与 standard concept ids的区别

每张表都包含source values, source concept ids, and standard concept ids这些字段

以下介绍每个领域内使用这三个字段来处理原始数据的约定:

当原始数据(source value)是自由文本,或者是引用了标准化字典还未收录了某个coding scheme :

当原始数据(source value)引用的是 标准化字典已经收录了的某个coding scheme:

每个standard concept_id 字段都是由一些可允许的concept_id 值的集合组成的. 这些可允许值则是由概念所属的领域所决定的。比如,性别这个领域概念而言,实际情况下只有2种标化概念(8507- ‘Male’, 8532- ‘Female’) 和一种通用概念(表示未知)(concept_id = 0).

对于 source_concept_id 字段而言,没有对concept_ids做出限制.

自定义的 source_to_concept_maps

如果原始数据中用到了眼下 Standardized Vocabularies (e.g. ICPC codes for diagnoses)中不存在的字典,就需要将原始编码source codes与标化概念的Standard Concepts对应关系存储到SOURCE_TO_CONCEPT_MAP 表中. 原始数据中的编码记录在source_value 字段中, 但没有source_concept_id字段

自定义的编码不允许映射到标化概念上,我们将其标记成invalid.

wanghaisheng commented 9 years ago

第一类、标化字典、标化术语

这类表中记录的是CDM事实表(fact table)中 用到的概念的详细信息。在实施CDM时并不会对修改标化字典,而是社区统一维护一套。 标化字典类数据表设计的前提条件:

这种做法的优势在于 The advantage of this approach lies in the preservation of codes and relationships between them without adherence to the multiple different source data structures, a simple design for standardized access, and the optimization of performance for analysis. Navigation among Standard Concepts does not require knowledge of the source vocabulary. Finally, the approach is scalable and future vocabularies can be integrated easily. On the other hand, extensive transformation of source data to the Vocabulary is required and not every source data structure and original source hierarchy can be retained.

OMOP Common Data Model中标化术语类E-R关系图:

CONCEPT表

该表包含从其他字典中衍生得到的能够准确的确定表达医疗信息的基本元素含义的所有记录。这些来自不同字典的术语,能够利用原始字典中的编码和描述信息表达不同领域的医学信息,诸如病情、症状、药品、手术。其中一些概念称之为标化概念,也就是说可以在OMOP数据模型和分析过程中来表达这些概念,每个标化概念都有一个主要的领域,领域确定了这个概念在OMOP数据模型中最可能出现的位置。

概念可以是宽泛的概念如心血管疾病Cardiovascular disease,也可以是某个具体的数据元素如前壁心肌梗死Myocardial infarction of the anterolateral wall,也可以是修饰不同层次概念的特征或属性如疾病严重程度、相关的形态学等等

这些概念都来源于国家层面、国际层面的术语标准如 SNOMED-CT, RxNorm, and LOINC和为了能够满足各种观察性数据分析的要求而自定义的概念。具体要了解对术语的规划请参考术语规范

FieldRequiredTypeDescription
concept_idYesintegerA unique identifier for each Concept across all domains.
concept_nameYesvarchar(255)An unambiguous, meaningful and descriptive name for the Concept.
domain_idYesvarchar(20)A foreign key to the DOMAIN table the Concept belongs to.
vocabulary_idYesvarchar(20)A foreign key to the VOCABULARY table indicating from which source the Concept has been adapted.
concept_class_idYesvarchar(20)The attribute or concept class of the Concept. Examples are “Clinical Drug”, “Ingredient”, “Clinical Finding” etc.
standard_conceptNovarchar(1)This flag determines where a Concept is a Standard Concept, i.e. is used in the data, a Classification Concept, or a non-standard Source Concept. The allowables values are 'S' (Standard Concept) and 'C' (Classification Concept), otherwise the content is NULL.
concept_codeYesvarchar(50)The concept code represents the identifier of the Concept in the source vocabulary, such as SNOMED-CT concept IDs, RxNorm RXCUIs etc. Note that concept codes are not unique across vocabularies.
valid_start_dateYesdateThe date when the Concept was first recorded. The default value is 1-Jan-1970, meaning, the Concept has no (known) date of inception.
valid_end_dateYesdateThe date when the Concept became invalid because it was deleted or superseded (updated) by a new concept. The default value is 31-Dec-2099, meaning, the Concept is valid until it becomes deprecated.
invalid_reasonNovarchar(1)Reason the Concept was invalidated. Possible values are D (deleted), U (replaced with an update) or NULL when valid_end_date has the default value.

约定:

不管是从外部的标化字典中还是自定义的概念,都必须遵循如下规则:

wanghaisheng commented 9 years ago

VOCABULARY 表

VOCABULARY表中罗列了从各处收集而来的字典,亦或是OMOP自己构造的。This reference table is populated with a single record for each Vocabulary source and includes a descriptive name and other associated attributes for the Vocabulary.

FieldRequiredTypeDescription
vocabulary_idYesvarchar(20)字典的唯一标识符ICD9CM, SNOMED, Visit.
vocabulary_nameYesvarchar(255)字典名称 如“International Classification of Diseases, Ninth Revision, Clinical Modification, Volume 1 and 2 (NCHS)” etc.
vocabulary_referenceYesvarchar(255)External reference to documentation or available download of the about the vocabulary.
vocabulary_versionYesvarchar(255)版本信息.
vocabulary_concept_idYesinteger外键,指向CONCEPT表,这条VOCABULARY记录所对应的标化概念标识符A foreign key that refers to a standard concept identifier in the CONCEPT table for the Vocabulary the VOCABULARY record belongs to.

约定

vocabulary_id previouslyvocabulary_id Version 5
0[None](/web/wiki/doku.php?id=documentation:vocabulary:none "documentation:vocabulary:none")
1[SNOMED](/web/wiki/doku.php?id=documentation:vocabulary:snomed "documentation:vocabulary:snomed")
2[ICD9CM](/web/wiki/doku.php?id=documentation:vocabulary:icd9cm "documentation:vocabulary:icd9cm")
3[ICD9Proc](/web/wiki/doku.php?id=documentation:vocabulary:icd9proc "documentation:vocabulary:icd9proc")
4[CPT4](/web/wiki/doku.php?id=documentation:vocabulary:cpt4 "documentation:vocabulary:cpt4")
5[HCPCS](/web/wiki/doku.php?id=documentation:vocabulary:hcpcs "documentation:vocabulary:hcpcs")
6[LOINC](/web/wiki/doku.php?id=documentation:vocabulary:loinc "documentation:vocabulary:loinc")
7[NDFRT](/web/wiki/doku.php?id=documentation:vocabulary:ndfrt "documentation:vocabulary:ndfrt")
8[RxNorm](/web/wiki/doku.php?id=documentation:vocabulary:rxnorm "documentation:vocabulary:rxnorm")
9[NDC](/web/wiki/doku.php?id=documentation:vocabulary:ndc "documentation:vocabulary:ndc")
10[GPI](/web/wiki/doku.php?id=documentation:vocabulary:gpi "documentation:vocabulary:gpi")
11[UCUM](/web/wiki/doku.php?id=documentation:vocabulary:ucum "documentation:vocabulary:ucum")
12[Gender](/web/wiki/doku.php?id=documentation:vocabulary:gender "documentation:vocabulary:gender")
13[Race](/web/wiki/doku.php?id=documentation:vocabulary:race "documentation:vocabulary:race")
14[Place of Service](/web/wiki/doku.php?id=documentation:vocabulary:place_of_service "documentation:vocabulary:place_of_service")
15[MedDRA](/web/wiki/doku.php?id=documentation:vocabulary:meddra "documentation:vocabulary:meddra")
16[Multum](/web/wiki/doku.php?id=documentation:vocabulary:multum "documentation:vocabulary:multum")
17[Read](/web/wiki/doku.php?id=documentation:vocabulary:read "documentation:vocabulary:read")
18[OXMIS](/web/wiki/doku.php?id=documentation:vocabulary:oxmis "documentation:vocabulary:oxmis")
19[Indication](/web/wiki/doku.php?id=documentation:vocabulary:indication "documentation:vocabulary:indication")
20[ETC](/web/wiki/doku.php?id=documentation:vocabulary:etc "documentation:vocabulary:etc")
21[ATC](/web/wiki/doku.php?id=documentation:vocabulary:atc "documentation:vocabulary:atc")
22[Multilex](/web/wiki/doku.php?id=documentation:vocabulary:multilex "documentation:vocabulary:multilex")
24[Visit](/web/wiki/doku.php?id=documentation:vocabulary:visit "documentation:vocabulary:visit")
28[VA Product](/web/wiki/doku.php?id=documentation:vocabulary:va_product "documentation:vocabulary:va_product")
31[SMQ](/web/wiki/doku.php?id=documentation:vocabulary:smq "documentation:vocabulary:smq")
32[VA Class](/web/wiki/doku.php?id=documentation:vocabulary:va_class "documentation:vocabulary:va_class")
33[Cohort](/web/wiki/doku.php?id=documentation:vocabulary:cohort "documentation:vocabulary:cohort")
34[ICD10](/web/wiki/doku.php?id=documentation:vocabulary:icd10 "documentation:vocabulary:icd10")
35[ICD10PCS](/web/wiki/doku.php?id=documentation:vocabulary:icd10pcs "documentation:vocabulary:icd10pcs")
36[Drug Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
37[Condition Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
38[Procedure Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
39[Observation Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
40[DRG](/web/wiki/doku.php?id=documentation:vocabulary:drg "documentation:vocabulary:drg")
41[MDC](/web/wiki/doku.php?id=documentation:vocabulary:mdc "documentation:vocabulary:mdc")
42[APC](/web/wiki/doku.php?id=documentation:vocabulary:apc "documentation:vocabulary:apc")
43[Revenue Code](/web/wiki/doku.php?id=documentation:vocabulary:revenue_code "documentation:vocabulary:revenue_code")
44[Ethnicity](/web/wiki/doku.php?id=documentation:vocabulary:ethnicity "documentation:vocabulary:ethnicity")
45[Death Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
46[Mesh](/web/wiki/doku.php?id=documentation:vocabulary:mesh "documentation:vocabulary:mesh")
47[NUCC](/web/wiki/doku.php?id=documentation:vocabulary:nucc "documentation:vocabulary:nucc")
48[Specialty](/web/wiki/doku.php?id=documentation:vocabulary:specialty "documentation:vocabulary:specialty")
49[LOINC](/web/wiki/doku.php?id=documentation:vocabulary:loinc "documentation:vocabulary:loinc")
50[SPL](/web/wiki/doku.php?id=documentation:vocabulary:spl "documentation:vocabulary:spl")
53[Genseqno](/web/wiki/doku.php?id=documentation:vocabulary:genseqno "documentation:vocabulary:genseqno")
54[CCS](/web/wiki/doku.php?id=documentation:vocabulary:ccs "documentation:vocabulary:ccs")
55[OPCS4](/web/wiki/doku.php?id=documentation:vocabulary:opcs4 "documentation:vocabulary:opcs4")
56[Gemscript](/web/wiki/doku.php?id=documentation:vocabulary:gemscript "documentation:vocabulary:gemscript")
57[HES Specialty](/web/wiki/doku.php?id=documentation:vocabulary:hes_specialty "documentation:vocabulary:hes_specialty")
58[Note Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
59[Domain](/web/wiki/doku.php?id=documentation:vocabulary:domain "documentation:vocabulary:domain")
60[PCORNet](/web/wiki/doku.php?id=documentation:vocabulary:pcornet "documentation:vocabulary:pcornet")
61[Obs Period Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
62[Visit Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
63[Device Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
64[Meas Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
65[Currency](/web/wiki/doku.php?id=documentation:vocabulary:currency "documentation:vocabulary:currency")
66[Relationship](/web/wiki/doku.php?id=documentation:vocabulary:relationship "documentation:vocabulary:relationship")
67[Vocabulary](/web/wiki/doku.php?id=documentation:vocabulary:vocabulary "documentation:vocabulary:vocabulary")
68[Concept Class](/web/wiki/doku.php?id=documentation:vocabulary:concept_class "documentation:vocabulary:concept_class")
69[Cohort Type](/web/wiki/doku.php?id=documentation:vocabulary:concept_type_vocabularies "documentation:vocabulary:concept_type_vocabularies")
70[ICD10CM](/web/wiki/doku.php?id=documentation:vocabulary:icd10cm "documentation:vocabulary:icd10cm")
wanghaisheng commented 9 years ago

DOMAIN 表

该表包含了标化字典中的概念所归属的领域的集合。Domain领域定义了在CDM表格的字段之中能够使用的允许概念的集合。比如,“Condition” 这个领域包含了能够描述患者病情的概念,而这些概念只能存在 CONDITION_OCCURRENCECONDITION_ERA 表的condition_concept_id 字段. 该表里每个领域对应着一条记录,其中还包括了领域的名称

FieldRequiredTypeDescription
domain_idYesvarchar(20)领域的唯一标识符.
domain_nameYesvarchar(255)领域的名称 e.g. “Condition”, “Procedure”, “Measurement” etc.
domain_concept_idYesinteger外键,指向 [CONCEPT](/web/wiki/doku.php?id=documentation:cdm:concept "documentation:cdm:concept") 表中的这条领域记录所对应的唯一的领域概念.

约定

wanghaisheng commented 9 years ago

CONCEPT_CLASS table

概念类型相当于卫生信息数据元目录中所涉及到的带类别的所有数据元,而领域相当于卫生信息数据元目录中的17大类

CONCEPT_CLASS表也是一个reference table, 其中包含了在一个字典内区分概念的分类集合。该表中一个概念类型对应着一条记录

FieldRequiredTypeDescription
concept_class_idYesvarchar(20)每种类型的唯一标识符.
concept_class_nameYesvarchar(255)概念类型的名称.例如 “临床所见Clinical Finding”, “成分Ingredient”, etc.
concept_class_concept_idYesinteger外键,该概念类型在 [CONCEPT](/web/wiki/doku.php?id=documentation:cdm:concept "documentation:cdm:concept") 表中的标识.

约定

concept_class previouslyconcept_class_id Version 5
Administrative conceptAdmin Concept
Admitting SourceAdmitting Source
Anatomical Therapeutic Chemical ClassificationATC
Anatomical Therapeutic Chemical ClassificationATC
APCProcedure
AttributeAttribute
Biobank FlagBiobank Flag
Biological functionBiological Function
Body structureBody Structure
Brand NameBrand Name
Branded DrugBranded Drug
Branded Drug ComponentBranded Drug Comp
Branded Drug FormBranded Drug Form
Branded PackBranded Pack
CCS_DIAGNOSISCondition
CCS_PROCEDURESProcedure
Chart AvailabilityChart Availability
Chemical StructureChemical Structure
Clinical DrugClinical Drug
Clinical Drug ComponentClinical Drug Comp
Clinical Drug FormClinical Drug Form
Clinical findingClinical Finding
Clinical PackClinical Pack
Concept RelationshipConcept Relationship
Condition Occurrence TypeCondition Occur Type
Context-dependent categoryContext-dependent
CPT-4Procedure
CurrencyCurrency
Death TypeDeath Type
Device TypeDevice Type
Discharge DispositionDischarge Dispo
Discharge StatusDischarge Status
DomainDomain
Dose FormDose Form
DRGDiagnostic Category
Drug Exposure TypeDrug Exposure Type
Drug InteractionDrug Interaction
Encounter TypeEncounter Type
Enhanced Therapeutic ClassificationETC
Enrollment BasisEnrollment Basis
Environment or geographical locationLocation
EthnicityEthnicity
EventEvent
GenderGender
HCPCSProcedure
Health Care Provider SpecialtyProvider Specialty
HES specialtyProvider Specialty
High Level Group TermHLGT
High Level TermHLT
HispanicHispanic
ICD-9-ProcedureProcedure
Indication or Contra-indicationInd / CI
IngredientIngredient
LOINC CodeMeasurement
LOINC Multidimensional ClassificationMeas Class
Lowest Level TermLLT
MDCDiagnostic Category
Measurement TypeMeas Type
Mechanism of ActionMechanism of Action
Model componentModel Comp
Morphologic abnormalityMorph Abnormality
MS-DRGDiagnostic Category
Namespace conceptNamespace Concept
Note TypeNote Type
Observable entityObservable Entity
Observation Period TypeObs Period Type
Observation TypeObservation Type
OMOP DOI cohortDrug Cohort
OMOP HOI cohortCondition Cohort
OPCS-4Procedure
OrganismOrganism
Patient StatusPatient Status
Pharmaceutical / biologic productPharma/Biol Product
Pharmaceutical PreparationsPharma Preparation
PharmacokineticsPK
Pharmacologic ClassPharmacologic Class
Physical forcePhysical Force
Physical objectPhysical Object
Physiologic EffectPhysiologic Effect
Place of ServicePlace of Service
Preferred TermPT
ProcedureProcedure
Procedure Occurrence TypeProcedure Occur Type
Qualifier valueQualifier Value
RaceRace
Record artifactRecord Artifact
Revenue CodeRevenue Code
SexGender
Social contextSocial Context
Special conceptSpecial Concept
SpecimenSpecimen
Staging and scalesStaging / Scales
Standardized MedDRA QuerySMQ
SubstanceSubstance
System Organ ClassSOC
Therapeutic ClassTherapeutic Class
UCUMUnit
UCUM CanonicalCanonical Unit
UCUM CustomUnit
UCUM StandardUnit
UndefinedUndefined
UNKNOWNUndefined
VA ClassDrug Class
VA Drug InteractionDrug Interaction
VA ProductDrug Product
VisitVisit
Visit TypeVisit Type
wanghaisheng commented 9 years ago

CONCEPT_RELATIONSHIP table

CONCEPT_RELATIONSHIP该表记录了任意两个Concepts概念之间的直接关系和这种关系的类型或者本质。[RELATIONSHIP]()表中定义了每种类型的关系

FieldRequiredTypeDescription
concept_id_1Yesinteger指向Concept表中的concept的外键.关系是方向性的,该字段表示的是源概念定义
concept_id_2Yesinteger指向Concept表中的concept的外键.关系是方向性的,该字段表示的是目标概念定义.
relationship_idYesvarchar(20)定义在[RELATIONSHIP]( )表中的某种类型的关系的唯一标识符 .
valid_start_dateYesdate这个概念关系记录之初的日期 .
valid_end_dateYesdate由于该概念关系被删除或被新的取代而导致无效的日期,默认值为 31-Dec-2099.
invalid_reasonNovarchar(1)记录该关系无效的原因.允许值为 'D' (deleted), 'U' (replaced with an update) or NULL when valid_end_date has the default value.

约束

  • 同一个relationship_id,同样的概念,只有一条概念关系的记录 There is one record for each Concept Relationship connecting the same Concepts with the same relationship_id.
  • 鉴于所有概念关系都是镜像存在的(concept_id_1和concept_id_2互换,relationship_id用RELATIONSHIP表中相反的id替代),没有必要同时查询concept_id_1和concept_id_2两个字段
  • 概念关系定义的是概念间的直接关系,间接关系不在此表中。 CONCEPT_ANCESTOR table 表中通过“generations” of direct relationships记录了层级关系
  • 之前版本的CDM中,relationship_id是数值型标识符。参考[RELATIONSHIP]() 表.
  • wanghaisheng commented 9 years ago

    RELATIONSHIP 表

    RELATIONSHIP表中是所有类型的关系的参考列表,这些关系用于关联CONCEPT_RELATIONSHP表中的任意2个概念

    FieldRequiredTypeDescription
    relationship_idYesvarchar(20)这条记录中所表示的关系的类型.
    relationship_nameYesvarchar(255) 描述关系类型的文本
    is_hierarchicalYesvarchar(1)标记该关系是否是层级关系或类型关系。Values are 1 for hierarchical relationship or 0 if not.
    defines_ancestryYesvarchar(1)Defines whether a hierarchical relationship contributes to the concept_ancestor table. These are subsets of the hierarchical relationships. Valid values are 1 or 0.
    reverse_relationship_idYesvarchar(20)该概念关系方向的概念关系的标识符.
    relationship_concept_idYesintegerA foreign key that refers to an identifier in the CONCEPT table for the unique relationship concept.
    ### Conventions
    *
    There is one record for each Relationship.
    *
    Relationships are classified as hierarchical (parent-child) or non-hierarchical (lateral)
    *
    They are used to determine which concept relationship records should be included in the computation of the CONCEPT_ANCESTOR table.
    *
    The relationship_id field contains an alphanumerical identifier, that can also be used as the abbreviation of the Relationship.
    *
    The relationship_name field contains the unabbreviated names of the Relationship.
    *
    Relationships all exist symmetrically, i.e. in both direction. The relationship_id of the opposite Relationship is provided in the reverse_relationship_id field.
    *
    Each Relationship also has an equivalent entry in the Concept table, which is recorded in the relationship_concept_id field. This is for purposes of creating a closed Information Model, where all entities in the OMOP CDM are covered by unique Concepts.
    *
    Hierarchical Relationships are used to build a hierarchical tree out of the Concepts, which is recorded in the CONCEPT_ANCESTOR table. For example, “has_ingredient” is a Relationship between Concepst of the Concept Class 'Clinical Drug' and those of 'Ingredient', and all Ingredients can be classified as the “parental” hierarchical Concepts for the drug products they are part of. All 'Is a' Relationships are hierarchical.
    *
    Relationships, also hierarchical, can be between Concepts within the same Vocabulary or those adopted from different Vocabulary sources.
    *
    In past versions of the RELATIONSHIP table, the relationship_id used to be a numerical value. A conversion table between these old and new IDs is given below:
    relationship_id previouslyrelationship_id Version 5
    1LOINC replaced by
    2Has precise ing
    3Has tradename
    4RxNorm has dose form
    5Has form
    6RxNorm has ing
    7Constitutes
    8Contains
    9Reformulation of
    10Subsumes
    11NDFRT has dose form
    12Induces
    13May diagnose
    14Has physio effect
    15Has CI physio effect
    16NDFRT has ing
    17Has CI chem class
    18Has MoA
    19Has CI MoA
    20Has PK
    21May treat
    22CI to
    23May prevent
    24Has metabolites
    25Has metabolism
    26May be inhibited by
    27Has chem structure
    28NDFRT - RxNorm eq
    29Has recipient cat
    30Has proc site
    31Has priority
    32Has pathology
    33Has part of
    34Has severity
    35Has revision status
    36Has access
    37Has occurrence
    38Has method
    39Has laterality
    40Has interprets
    41Has indir morph
    42Has indir device
    43Has specimen
    44Has interpretation
    45Has intent
    46Has focus
    47Has manifestation
    48Has active ing
    49Has finding site
    50Has episodicity
    51Has dir subst
    52Has dir morph
    53Has dir device
    54Has component
    55Has causative agent
    56Has asso morph
    57Has asso finding
    58Has measurement
    59Has property
    60Has scale type
    61Has time aspect
    62Has specimen proc
    63Has specimen source
    64Has specimen morph
    65Has specimen topo
    66Has specimen subst
    67Has due to
    68Has relat context
    69Has dose form
    70Occurs after
    71Has asso proc
    72Has dir proc site
    73Has indir proc site
    74Has proc device
    75Has proc morph
    76Has finding context
    77Has proc context
    78Has temporal context
    79Findinga sso with
    80Has surgical appr
    81Using device
    82Using energy
    83Using subst
    84Using acc device
    85Has clinical course
    86Has route of admin
    87Using finding method
    88Using finding inform
    92ICD9P - SNOMED eq
    93CPT4 - SNOMED cat
    94CPT4 - SNOMED eq
    125MedDRA - SNOMED eq
    126Has FDA-appr ind
    127Has off-label ind
    129Has CI
    130ETC - RxNorm
    131ATC - RxNorm
    132SMQ - MedDRA
    135LOINC replaces
    136Precise ing of
    137Tradename of
    138RxNorm dose form of
    139Form of
    140RxNorm ing of
    141Consists of
    142Contained in
    143Reformulated in
    144Is a
    145NDFRT dose form of
    146Induced by
    147Diagnosed through
    148Physiol effect by
    149CI physiol effect by
    150NDFRT ing of
    151CI chem class of
    152MoA of
    153CI MoA of
    154PK of
    155May be treated by
    156CI by
    157May be prevented by
    158Metabolite of
    159Metabolism of
    160Inhibits effect
    161Chem structure of
    162RxNorm - NDFRT eq
    163Recipient cat of
    164Proc site of
    165Priority of
    166Pathology of
    167Part of
    168Severity of
    169Revision status of
    170Access of
    171Occurrence of
    172Method of
    173Laterality of
    174Interprets of
    175Indir morph of
    176Indir device of
    177Specimen of
    178Interpretation of
    179Intent of
    180Focus of
    181Manifestation of
    182Active ing of
    183Finding site of
    184Episodicity of
    185Dir subst of
    186Dir morph of
    187Dir device of
    188Component of
    189Causative agent of
    190Asso morph of
    191Asso finding of
    192Measurement of
    193Property of
    194Scale type of
    195Time aspect of
    196Specimen proc of
    197Specimen identity of
    198Specimen morph of
    199Specimen topo of
    200Specimen subst of
    201Due to of
    202Relat context of
    203Dose form of
    204Occurs before
    205Asso proc of
    206Dir proc site of
    207Indir proc site of
    208Proc device of
    209Proc morph of
    210Finding context of
    211Proc context of
    212Temporal context of
    213Asso with finding
    214Surgical appr of
    215Device used by
    216Energy used by
    217subst used by
    218Acc device used by
    219Clinical course of
    220Route of admin of
    221Finding method of
    222Finding inform of
    226SNOMED - ICD9P eq
    227SNOMED cat - CPT4
    228SNOMED - CPT4 eq
    239SNOMED - MedDRA eq
    240Is FDA-appr ind of
    241Is off-label ind of
    243Is CI of
    244RxNorm - ETC
    245RxNorm - ATC
    246MedDRA - SMQ
    247Ind/CI - SNOMED
    248SNOMED - ind/CI
    275Has therap class
    276Therap class of
    277Drug-drug inter for
    278Has drug-drug inter
    279Has pharma prep
    280Pharma prep in
    281Inferred class of
    282Has inferred class
    283SNOMED proc - HCPCS
    284HCPCS - SNOMED proc
    285RxNorm - NDFRT name
    286NDFRT - RxNorm name
    287ETC - RxNorm name
    288RxNorm - ETC name
    289ATC - RxNorm name
    290RxNorm - ATC name
    291HOI - SNOMED
    292SNOMED - HOI
    293DOI - RxNorm
    294RxNorm - DOI
    295HOI - MedDRA
    296MedDRA - HOI
    297NUCC - CMS Specialty
    298CMS Specialty - NUCC
    299DRG - MS-DRG eq
    300MS-DRG - DRG eq
    301DRG - MDC cat
    302MDC cat - DRG
    303Visit cat - PoS
    304PoS - Visit cat
    305VAProd - NDFRT
    306NDFRT - VAProd
    307VAProd - RxNorm eq
    308RxNorm - VAProd eq
    309RxNorm replaced by
    310RxNorm replaces
    311SNOMED replaced by
    312SNOMED replaces
    313ICD9P replaced by
    314ICD9P replaces
    315Multilex has ing
    316Multilex ing of
    317RxNorm - Multilex eq
    318Multilex - RxNorm eq
    319Multilex ing - class
    320Class - Multilex ing
    321Maps to
    322Mapped from
    325Map includes child
    326Included in map from
    327Map excludes child
    328Excluded in map from
    345UCUM replaced by
    346UCUM replaces
    347Concept replaced by
    348Concept replaces
    349Concept same_as to
    350Concept same_as from
    351Concept alt_to to
    352Concept alt_to from
    353Concept poss_eq to
    354Concept poss_eq from
    355Concept was_a to
    356Concept was_a from
    357SNOMED meas - HCPCS
    358HCPCS - SNOMED meas
    359Domain subsumes
    360Is domain
    wanghaisheng commented 9 years ago

    CONCEPT_SYNONYM 表

    该表用于存储概念的其他名称和描述。

    FieldRequiredTypeDescription
    concept_idYesInteger CONCEPT表中概念的外键.
    concept_synonym_nameYesvarchar(1000)概念的其他名称.
    language_concept_idYesinteger表示语言的概念的外键 .
    ### Conventions - concept_name字段包含了概念的同义词,其中也包括了concept_name自身的描述,比如每个概念至少在CONCEPT_SYNONYM表中都有一个Synonym同义词.例如,对于SNOMED-CT概念,如果CONCEPT表中的concept_name字段中存储的是完整的名称,该概念对应的 Preferred Term and Synonyms就会存到 CONCEPT_SYNONYM 表. - 只有在用的同义词才会存到CONCEPT_SYNONYM表中。同义词的历史版本和废弃同义词与现用同义词的映射不在标准化术语的讨论范围之内 - 目前,只包含了英文的同义词
    wanghaisheng commented 9 years ago

    CONCEPT_ANCESTOR 表

    设计该表是为了简化观察性分析,通过提供概念间完整的层级关系。CONCEPT_RELATIONSHIP表中存储的是直接的父子型概念关系。为了确定高层的父子联系,需要在分析时浏览所有的直接关系。CONCEPT_ANCESTOR表中记录了所有的父子型关系,曾父-曾孙关系,和他们间的层级关系。通过该表可以查询某个层级概念的所有后代。比如 drug ingredients and drug products are all descendants of a drug class ancestor.

    该表整个是从 CONCEPT, CONCEPT_RELATIONSHIP and RELATIONSHIP 表衍生而来.

    FieldRequiredTypeDescription
    ancestor_concept_idYesintegerA foreign key to the concept in the concept table for the higher-level concept that forms the ancestor in the relationship.
    descendant_concept_idYesintegerA foreign key to the concept in the concept table for the lower-level concept that forms the descendant in the relationship.
    min_levels_of_separationYesintegerThe minimum separation in number of levels of hierarchy between ancestor and descendant concepts. This is an attribute that is used to simplify hierarchic analysis.
    max_levels_of_separationYesintegerThe maximum separation in number of levels of hierarchy between ancestor and descendant concepts. This is an attribute that is used to simplify hierarchic analysis.

    Conventions

    wanghaisheng commented 9 years ago

    SOURCE_TO_CONCEPT_MAP 表

    该表是OMOP数据模型中的残留数据结构,主要用在ETL过程中,维护那些在标准化术语中不存在的本地原始编码,建立每个原始编码与标准化概念之间的映射关系。向社区所发布的标准化术语中该表中将不包含任何内容

    FieldRequiredTypeDescription
    source_codeYesvarchar(50)The source code being translated into a Standard Concept.
    source_concept_idYesintegerA foreign key to the Source Concept that is being translated into a Standard Concept.
    source_vocabulary_idNovarchar(20)A foreign key to the VOCABULARY table defining the vocabulary of the source code that is being translated to a Standard Concept.
    source_code_descriptionYesvarchar(255)An optional description for the source code. This is included as a convenience to compare the description of the source code to the name of the concept.
    target_concept_idYesintegerA foreign key to the target Concept to which the source code is being mapped.
    target_vocabulary_idYesvarchar(20)A foreign key to the VOCABULARY table defining the vocabulary of the target Concept.
    valid_start_dateYesdateThe date when the mapping instance was first recorded.
    valid_end_dateYesdateThe date when the mapping instance became invalid because it was deleted or superseded (updated) by a new relationship. Default value is 31-Dec-2099.
    invalid_reasonNovarchar(1)Reason the mapping instance was invalidated. Possible values are D (deleted), U (replaced with an update) or NULL when valid_end_date has the default value.
    ### Conventions - 以后将不使用该表来表达原始编码与标准化概念之间的映射关系。相反,现在使用CONCEPT_RELATIONSHIP来表示映射关系,relationship_id的值为'Maps to' - 但是,该表仍可以用于本地编码和标化概念的转换 - *_Note: *_该表不能用于原始编码和原始概念间的转换。原始概念的原始编码记录在concept_code字段中,如果某个数据库中原始编码不符合格式,ETL需要进行转换。比如,ICD-9-CM编码没有点号,ETL需要查询以确认正确的原始编码 - source_concept_id或者 source_code 和 the source_vocabulary_id组合起来能够唯一标识源信息。它等同于与CONCEPT_RELATIONSHIP表中的concept_id_1 - 如果没有source_concept_id, If there is no source_concept_id available because the source codes are local and not supported by the Standard Vocabulary, the content of the field is 0 (zero, not null) encoding an undefined concept. However, local Source Concepts are established (concept_id values above 2,000,000,000). - source_code_description可以包含对原始编码的描述 - target_concept_id包含了原始编码所对应的概念,等同于CONCEPT_RELATIONSHIP表中的 concept_id_2 - target_vocabulary_id包含了目标概念的vocabulary_id。它是对目标概念的CONCPEPT记录的信息的复制 - valid_start_date, valid_end_date and invalid_reason三个字段用于定义映射信息的生命周期. Invalid mapping records should not be used for mapping information
    wanghaisheng commented 9 years ago

    DRUG_STRENGTH 表

    该表记录的是某个药品之中的某种成分的含量、浓度以及单位。这些信息是用于支持对药品利用进行标准化分析的辅助信息

    FieldRequiredTypeDescription
    drug_concept_idYesintegerCONCEPT表之中表示商品药或临床药物概念的外键.
    ingredient_concept_idYesintegerCONCEPT表之中表示药品中药物成分概念的外键.
    amount_valueNofloat药品中所含的有效成分的量 数值.
    amount_unit_concept_idNointegerCONCEPT表之中表示药品有效成分含量单位概念的外键.
    numerator_valueNofloat药品中有效成分浓度的分子数值
    numerator_unit_concept_idNointegerCONCEPT表之中表示有效成分浓度的分子单位概念的外键.
    denominator_unit_concept_idNointegerCONCEPT表之中表示有效成分浓度的分母单位概念的外键.
    valid_start_dateYesdateThe date when the Concept was first recorded. The default value is 1-Jan-1970.
    valid_end_dateYesdateThe date when the concept became invalid because it was deleted or superseded (updated) by a new Concept. The default value is 31-Dec-2099.
    invalid_reasonNovarchar(1)Reason the concept was invalidated. Possible values are 'D' (deleted), 'U' (replaced with an update) or NULL when valid_end_date has the default value.

    约束

  • 目前,drug_strength中只包含了RxNorm所罗列的药物
  • wanghaisheng commented 9 years ago

    COHORT_DEFINITION 表

    该表中包含了如何通过描述和语法定义由数据衍生而来的队列Cohort,算法的执行结果则保存在COHORT 表之中。一个队列也就是在一段时间内满足特定准入条件的试验对象的集合。该表中使用标准化结构来维护将试验对象纳入队列的规则,也可保存程序的编码通过OMOP数据模型来实例化队列。

    FieldRequiredTypeDescription
    cohort_definition_idYesinteger每个队列定义的唯一标识符
    cohort_definition_nameYesvarchar(255)队列定义的名称
    cohort_definition_descriptionNoCLOB队列定义的完整描述
    definition_type_concept_idYesinteger队列定义的类型,以及该语法该如何执行
    cohort_definition_syntaxNoCLOB实例化队列定义的语法或可执行的编码
    subject_concept_idYesinteger试验对象领域(e.g., Person, Provider, Visit)概念的外键.
    cohort_instantiation_dateNoDateA date to indicate when the Cohort was instantiated in the COHORT table

    约束

    wanghaisheng commented 9 years ago

    ATTRIBUTE_DEFINITION 表

    该表包含了通过描述、语法来定义队列中成员的属性、协变量的记录,算法执行结果保存在COHORT_ATTRIBUTE表之中。Attributes也就是衍生而来的元素,对于一个队列中的对象,可以选择、计算这些元素值。也可保存通过OMOP数据模型利用可执行的程序代码来实例化特定队列的这些属性。

    FieldRequiredTypeDescription
    attribute_definition_idYesintegerA unique identifier for each Attribute.
    attribute_nameYesvarchar(255)A short description of the Attribute.
    attribute_descriptionNoCLOBA complete description of the Attribute definition
    attribute_type_concept_idYesintegerType defining what kind of Attribute Definition the record represents and how the syntax may be executed
    attribute_syntaxNoCLOBSyntax or code to operationalize the Attribute definition

    约束

    wanghaisheng commented 9 years ago

    第二类、标准化元数据

    在ETL过程中可以从数据本身衍生而成的数据的元数据。只有一张表

    CDM_SOURCE 表

    该表记录了原始数据库以及将原始数据转化成OMOP 数据模型的过程信息

    FieldRequiredTypeDescription
    cdm_source_nameYesvarchar(255)The full name of the source
    cdm_source_abbrevationNovarchar(25)An abbreviation of the name
    cdm_holderNovarchar(255)负责开发CDM的组织机构名称
    source_descriptionNoCLOBA description of the source data origin and purpose for collection. The description may contain a summary of the period of time that is expected to be covered by this dataset.
    source_documentation_referenceNovarchar(255)URL or other external reference to location of source documentation
    cdm_etl _referenceNovarchar(255)URL or other external reference to location of ETL specification documentation and ETL source code
    source_release_dateNodateThe date for which the source data are most current, such as the last day of data capture
    cdm_release_dateNodateThe date when the CDM was instantiated
    cdm_versionNovarchar(10)The version of CDM used
    vocabulary_versionNovarchar(20)The version of the vocabulary used

    约束

    wanghaisheng commented 9 years ago

    第三类、标准化临床数据

    每个人的人口学信息以及长期以来的每个观察期内的医疗事件相关的信息。

    wanghaisheng commented 9 years ago

    PERSON table

    原始系统中存在观察记录的每个患者

    FieldRequiredTypeDescription
    person_idYesinteger患者的唯一标识符.
    gender_concept_idYesintegerCONCEPT表中表示患者性别的概念的外键.
    year_of_birth Yesinteger患者的出生年份,精度只到年。The year of birth of the person. For data sources with date of birth, the year is extracted. For data sources where the year of birth is not available, the approximate year of birth is derived based on any age group categorization available.
    month_of_birthNointeger患者的出生月份. For data sources that provide the precise date of birth, the month is extracted and stored in this field.
    day_of_birthNointeger患者的出生日期. For data sources that provide the precise date of birth, the day is extracted and stored in this field.
    time_of_birthNotime患者的出生时间
    race_concept_idYesintegerA foreign key that refers to an identifier in the CONCEPT table for the unique race of the person.
    ethnicity_concept_idYesintegerA foreign key that refers to the standard concept identifier in the Standardized Vocabularies for the ethnicity of the person.
    location_idNointegerA foreign key to the place of residency for the person in the location table, where the detailed address information is stored.
    provider_idNointegerprovider表中患者正在看的医生的外键.
    care_site_idNointegerA foreign key to the site of primary care in the care_site table, where the details of the care site are stored.
    person_source_valueNovarchar(50)对源数据库中患者标识进行加密处理得到的字符串。有些情况下,我们需要将患者关联到原始的数据库当中。An (encrypted) key derived from the person identifier in the source data. This is necessary when a use case requires a link back to the person data at the source dataset.
    gender_source_valueNovarchar(50)原始数据库之中患者的性别编码。The source code for the gender of the person as it appears in the source data. The person’s gender is mapped to a standard gender concept in the Standardized Vocabularies; the original value is stored here for reference.
    gender_source_concept_idNoIntegerA foreign key to the gender concept that refers to the code used in the source.
    race_source_valueNovarchar(50)The source code for the race of the person as it appears in the source data. The person race is mapped to a standard race concept in the Standardized Vocabularies and the original value is stored here for reference.
    race_source_concept_idNoIntegerA foreign key to the race concept that refers to the code used in the source.
    ethnicity_source_valueNovarchar(50)The source code for the ethnicity of the person as it appears in the source data. The person ethnicity is mapped to a standard ethnicity concept in the Standardized Vocabularies and the original code is, stored here for reference.
    ethnicity_source_concept_idNoIntegerA foreign key to the ethnicity concept that refers to the code used in the source.

    Conventions

    wanghaisheng commented 9 years ago

    OBSERVATION_PERIOD 表

    OBSERVATION_PERIOD 表记录了时间范围/段,该时间段内 原始的信息系统中有可能会存在患者所发生的医疗服务的记录,即使这段时间内实际上没有任何记录(这说明该患者处于健康状态)。

    FieldRequiredTypeDescription
    observation_period_idYesinteger该观察时间段的唯一标识符.
    person_idYesinteger患者的唯一标识符的外键.与患者相关的人口学信息记录在person表之中.
    observation_period_start_dateYesdate原始系统中这段时间段内开始有数据的开始日期.
    observation_period_end_dateYesdate结束日期.
    period_type_concept_idYesInteger标准词汇表中预先定义好的能够表示观察时间段信息来源的概念的外键A foreign key identifier to the predefined concept in the Standardized Vocabularies reflecting the source of the observation period information

    Conventions

  • Each Person can have more than one valid OBSERVATION_PERIOD record, but no two observation periods can overlap in time for a given person.
  • As a general assumption, during an Observation Period any clinical event that happens to the patient is expected to be recorded. Conversely, the absence of data indicates that no clinical events occurred to the patient.
  • No clinical data are valid outside an active Observation Period. Clinical data that refer to a time outside (diagnoses of previous conditions such as “Old MI” or medical history) of an active Observation Period are recorded as Observations. The date of the Observation is the first day of the first Observation Period of a patient.
  • For claims data, observation periods are inferred from the enrollment periods to a health benefit plan.
  • For EHR data, the observation period cannot be determined explicitly, because patients usually do not announce their departure from a certain healthcare provider. The ETL will have to apply some heuristic to make a reasonable guess on what the observation_period should be. Refer to the ETL documentation for details.
  • wanghaisheng commented 9 years ago

    specimen 表

    The specimen domain contains the records identifying biological samples from a person.

    FieldRequiredTypeDescription
    specimen_idYesintegerA unique identifier for each specimen.
    person_idYesintegerA foreign key identifier to the Person for whom the Specimen is recorded.
    specimen_concept_idYesintegerA foreign key referring to a Standard Concept identifier in the Standardized Vocabularies for the Specimen.
    specimen_type_concept_idYesintegerA foreign key referring to the Concept identifier in the Standardized Vocabularies reflecting the system of record from which the Specimen was represented in the source data.
    specimen_dateYesdateThe date the specimen was obtained from the Person.
    specimen_timeNotimeThe time on the date when the Specimen was obtained from the person.
    quantityNofloatThe amount of specimen collection from the person during the sampling procedure.
    unit_concept_idNointegerA foreign key to a Standard Concept identifier for the Unit associated with the numeric quantity of the Specimen collection.
    anatomic_site_concept_idNointegerA foreign key to a Standard Concept identifier for the anatomic location of specimen collection.
    disease_status_concept_idNointegerA foreign key to a Standard Concept identifier for the Disease Status of specimen collection.
    specimen_source_idNovarchar(50)The Specimen identifier as it appears in the source data.
    specimen_source_valueNovarchar(50)The Specimen value as it appears in the source data. This value is mapped to a Standard Concept in the Standardized Vocabularies and the original code is, stored here for reference.
    unit_source_valueNovarchar(50)The information about the Unit as detailed in the source.
    anatomic_site_source_valueNovarchar(50)The information about the anatomic site as detailed in the source.
    disease_status_source_valueNovarchar(50)The information about the disease status as detailed in the source.

    Conventions

    *
    Anatomic site is coded at the most specific level of granularity possible, such that higher level classifications can be derived using the Standardized Vocabularies.

    wanghaisheng commented 9 years ago

    DEATH table

    The death domain contains the clinical event for how and when a Person dies. A person can have up to one record if the source system contains evidence about the Death, such as:

    Conventions

    *
    Living patients should not contain any information in the DEATH table.

    wanghaisheng commented 9 years ago

    VISIT_OCCURRENCE table

    The VISIT_OCCURRENCE table contains the spans of time a Person continuously receives medical services from one or more providers at a Care Site in a given setting within the health care system. Visits are classified into 4 settings: outpatient care, inpatient confinement, emergency room, and long-term care. Persons may transition between these settings over the course of an episode of care (for example, treatment of a disease onset).
    FieldRequiredTypeDescription
    visit_occurrence_idYesintegerA unique identifier for each Person's visit or encounter at a healthcare provider.
    person_idYesintegerA foreign key identifier to the Person for whom the visit is recorded. The demographic details of that Person are stored in the PERSON table.
    visit_concept_idYesintegerA foreign key that refers to a visit Concept identifier in the Standardized Vocabularies.
    visit_start_dateYesdateThe start date of the visit.
    visit_start_timeNotimeThe time the visit started.
    visit_end_dateYesdateThe end date of the visit. If this is a one-day visit the end date should match the start date.
    visit_end_timeNotimeThe time the visit ended.
    visit_type_concept_idYesIntegerA foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the type of source data from which the visit record is derived.
    provider_idNointegerA foreign key to the provider in the provider table who was associated with the visit.
    care_site_idNointegerA foreign key to the care site in the care site table that was visited.
    visit_source_valueNoVarchar(50)The source code for the visit as it appears in the source data.
    visit_source_concept_idNoIntegerA foreign key to a Concept that refers to the code used in the source.

    Conventions

    *
    A Visit Occurrence is recorded for each visit to a healthcare facility.
    • Valid Visit Concepts belong to the “Visit” domain.
    • Standard Visit Concepts are defined as Inpatient Visit, Outpatient Visit, Emergency Room Visit and Long Term Care Visit. Source Concepts from place of service vocabularies are mapped into these standard visit Concepts in the Standardized Vocabularies. Each Visit is standardized by assigning a corresponding Concept Identifier based on the type of facility visited and the type of services rendered.
    • At any one day, there could be more than one visit.
    • One visit may involve multiple providers, in which case the ETL must specify how a single provider id is selected or leave the provider_id field null.
    • One visit may involve multiple Care Sites, in which case the ETL must specify how a single care_site id is selected or leave the care_site_id field null.
    • Visits are recorded in various data sources in different forms with varying levels of standardization. For example:
      • Medical Claims include Inpatient Admissions, Outpatient Services, and Emergency Room visits.
      • Electronic Health Records may capture Person visits as part of the activities recorded depending wether the EHR system is used at the different Care Sites.
    • If applicable, the sequential relationships between Visits within an episode of care may be represented in the FACT_RELATIONSHIP table.
    wanghaisheng commented 9 years ago

    PROCEDURE_OCCURRENCE table

    The PROCEDURE_OCCURRENCE tabe contains records of activities or processes ordered by, or carried out by, a healthcare provider on the patient to have a diagnostic or therapeutic purpose. Procedures are present in various data sources in different forms with varying levels of standardization. For example: *
    Medical Claims include procedure codes that are submitted as part of a claim for health services rendered, including procedures performed.
    • Electronic Health Records that capture procedures as orders.
      FieldRequiredTypeDescription
      procedure_occurrence_idYesintegerA system-generated unique identifier for each Procedure Occurrence.
      person_idYesintegerA foreign key identifier to the Person who is subjected to the Procedure. The demographic details of that Person are stored in the PERSON table.
      procedure_concept_idYesintegerA foreign key that refers to a standard procedure Concept identifier in the Standardized Vocabularies.
      procedure_dateYesdateThe date on which the Procedure was performed.
      procedure_type_concept_idYesintegerA foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the type of source data from which the procedure record is derived.
      modifier_concept_idNointegerA foreign key to a Standard Concept identifier for a modifier to the Procedure (e.g. bilateral)
      quantityNointegerThe quantity of procedures ordered or administered.
      provider_idNointegerA foreign key to the provider in the provider table who was responsible for carrying out the procedure.
      visit_occurrence_idNointegerA foreign key to the visit in the visit table during which the Procedure was carried out.
      procedure_source_valueNovarchar(50)The source code for the Procedure as it appears in the source data. This code is mapped to a standard procedure Concept in the Standardized Vocabularies and the original code is, stored here for reference. Procedure source codes are typically ICD-9-Proc, CPT-4, HCPCS or OPCS-4 codes.
      procedure_source_concept_idNointegerA foreign key to a Procedure Concept that refers to the code used in the source.
      qualifier_source_valueNovarchar(50)The source code for the qualifier as it appears in the source data.

    Conventions

    *
    Valid Procedure Concepts belong to the “Procedure” domain. Procedure Concepts are based on a variety of vocabularies: SNOMED-CT, ICD-9-Proc, CPT-4, HCPCS and OPCS-4, but also atypical Vocabularies such as ICD-9-CM or MedDRA.
    • Procedures are expected to be carried out within one day and therefore have no end date.
    • Procedures could involve the application of a drug, in which case the procedural component is recorded in the procedure table and simultaneously the administered drug in the drug exposure table when both the procedural component and drug are identifiable.
    • If the quantity value is omitted, a single procedure is assumed.
    • The Procedure Type defines from where the Procedure Occurrence is drawn or inferred. For administrative claims records the type indicates whether a Procedure was primary or secondary and their relative positioning within a claim.
    • The Visit during which the procedure was performed is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
    • The Provider carrying out the procedure is recorded through a reference to the PROVIDER table. This information is not always available.

    wanghaisheng commented 9 years ago

    DEVICE_EXPOSURE table

    The device exposure domain captures information about a person’s exposure to a foreign physical object or instrument that which is used for diagnostic or therapeutic purposes through a mechanism beyond chemical action. Devices include implantable objects (e.g. pacemakers, stents, artificial joints), medical equipment and supplies (e.g. bandages, crutches, syringes), other instruments used in medical procedures (e.g. sutures, defibrillators) and material used in clinical care (e.g. adhesives, body material, dental material, surgical material).
    FieldRequiredTypeDescription
    device_exposure_idYesintegerA system-generated unique identifier for each Device Exposure.
    person_idYesintegerA foreign key identifier to the Person who is subjected to the Device. The demographic details of that person are stored in the Person table.
    device_concept_idYesintegerA foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the Device concept.
    device_exposure_start_dateYesdateThe date the Device or supply was applied or used.
    device_exposure_end_dateNodateThe date the Device or supply was removed from use.
    device_type_concept_idYesintegerA foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the type of Device Exposure recorded. It indicates how the Device Exposure was represented in the source data.
    unique_device_id Novarchar(50)A UDI or equivalent identifying the instance of the Device used in the Person.
    quantityNointegerThe number of individual Devices used for the exposure.
    provider_idNointegerA foreign key to the provider in the PROVIDER table who initiated of administered the Device.
    visit_occurrence_idNointegerA foreign key to the visit in the VISIT table during which the device was used.
    device_source_valueNovarchar(50)The source code for the Device as it appears in the source data. This code is mapped to a standard Device Concept in the Standardized Vocabularies and the original code is stored here for reference.
    device_source_ concept_idNointegerA foreign key to a Device Concept that refers to the code used in the source.

    Conventions

    *
    The distinction between Devices or supplies and procedures are sometimes blurry, but the former are physical objects while the latter are actions, often to apply a Device or supply.
    • For medical devices that are regulated by the FDA, if a Unique Device Identification (UDI) is provided if available in the data source, and is recorded in the unique_device_id field.
    • Valid Device Concepts belong to the “Device” domain. The Concepts of this domain are derived from the DI portion of a UDI or based on other source vocabularies, like HCPCS.
    • A Device Type is assigned to each Device Exposure to track from what source the information was drawn or inferred. The valid domain_id for these Concepts is “Device Type”.
    • The Visit during which the Device was first used is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
    • The Provider exposing the patient to the Device is recorded through a reference to the PROVIDER table. This information is not always available.
    wanghaisheng commented 9 years ago

    CONDITION_OCCURRENCE table

    Conditions are records of a Person suggesting the presence of a disease or medical condition stated as a diagnosis, a sign or a symptom, which is either observed by a Provider or reported by the patient. Conditions are recorded in different sources and levels of standardization, for example: *
    Medical claims data include diagnoses coded in ICD-9-CM that are submitted as part of a reimbursement claim for health services and
    • EHRs may capture Person Conditions in the form of diagnosis codes or symptoms.
      FieldRequiredTypeDescription
      condition_occurrence_idYesintegerA unique identifier for each Condition Occurrence event.
      person_idYesintegerA foreign key identifier to the Person who is experiencing the condition. The demographic details of that Person are stored in the PERSON table.
      condition_concept_idYesintegerA foreign key that refers to a Standard Condition Concept identifier in the Standardized Vocabularies.
      condition_start_dateYesdateThe date when the instance of the Condition is recorded.
      condition_end_dateNodateThe date when the instance of the Condition is considered to have ended.
      condition_type_concept_idYesintegerA foreign key to the predefined Concept identifier in the Standardized Vocabularies reflecting the source data from which the condition was recorded, the level of standardization, and the type of occurrence.
      stop_reasonNovarchar(20)The reason that the condition was no longer present, as indicated in the source data.
      provider_idNointegerA foreign key to the Provider in the PROVIDER table who was responsible for capturing (diagnosing) the Condition.
      visit_occurrence_idNointegerA foreign key to the visit in the VISIT table during which the Condition was determined (diagnosed).
      condition_source_valueNovarchar(50)The source code for the condition as it appears in the source data. This code is mapped to a standard condition concept in the Standardized Vocabularies and the original code is stored here for reference.
      condition_source_concept_idNointegerA foreign key to a Condition Concept that refers to the code used in the source.

    Conventions

    *
    Valid Condition Concepts belong to the “Condition” domain.
    • Condition records are typically inferred from diagnostic codes recorded in the source data. Such code system, like ICD-9-CM, ICD-10-CM, Read etc., provide a comprehensive coverage of conditions. However, if the diagnostic code in the source does not define a condition, but rather an observation or a procedure, then such information is not stored in the CONDITION_OCCURRENCE table, but in the respective tables instead.
    • Source Condition identifiers are mapped to Standard Concepts for Conditions in the Standardized Vocabularies. When the source code cannot be translated into a Standard Concept, a CONDITION_OCCURRENCE entry is stored with only the corresponding source_concept_id and source_value, while the condition_concept_id is set to 0.
    • Family history and past diagnoses (“history of”) are not recorded in the CONDITION_OCCURRENCE table. Instead, they are listed in the OBSERVATION table.
    • Codes written in the process of establishing the diagnosis, such as “question of” of and “rule out”, are not represented here. Instead, they are listed in the OBSERVATION table, if they are used for analyses.
    • A Condition Occurrence Type is assigned based on the data source and type of condition attribute, for example:
      • ICD-9-CM Primary Diagnosis from inpatient and outpatient Claims
        • ICD-9-CM Secondary Diagnoses from inpatient and outpatient Claims
        • Diagnoses or problems recorded in an EHR.
    • The Stop Reason indicates why a Condition is no longer valid with respect to the purpose within the source data. Typical values include “Discharged”, “Resolved”, etc. Note that a Stop Reason does not necessarily imply that the condition is no longer occurring.
    • Condition source codes are typically ICD-9-CM, Read or ICD-10 diagnosis codes from medical claims or discharge status/visit diagnosis codes from EHRs.

    wanghaisheng commented 9 years ago

    MEASUREMENT table

    The MEASUREMENT table contains records of Measurement, i.e. structured values (numerical or categorical) obtained through systematic and standardized examination or testing of a Person or Person's sample. The MEASUREMENT table contains both orders and results of such Measurements as laboratory tests, vital signs, quantitative findings from pathology reports, etc.
    FieldRequiredTypeDescription
    measurement_idYesintegerA unique identifier for each Measurement.
    person_idYesintegerA foreign key identifier to the Person about whom the measurement was recorded. The demographic details of that Person are stored in the PERSON table.
    measurement_concept_idYesintegerA foreign key to the standard measurement concept identifier in the Standardized Vocabularies.
    measurement_dateYesdateThe date of the Measurement.
    measurement_timeNotimeThe time of the Measurement.
    measurement_type_concept_idYesintegerA foreign key to the predefined Concept in the Standardized Vocabularies reflecting the provenance from where the Measurement record was recorded.
    operator_concept_idNointegerA foreign key identifier to the predefined Concept in the Standardized Vocabularies reflecting the mathematical operator that is applied to the value_as_number. Operators are <, ≤, =, ≥, >.
    value_as_numberNofloatA Measurement result where the result is expressed as a numeric value.
    value_as_concept_idNointegerA foreign key to a Measurement result represented as a Concept from the Standardized Vocabularies (e.g., positive/negative, present/absent, low/high, etc.).
    unit_concept_idNointegerA foreign key to a Standard Concept ID of Measurement Units in the Standardized Vocabularies.
    range_lowNofloatThe lower limit of the normal range of the Measurement result. The lower range is assumed to be of the same unit of measure as the Measurement value.
    range_highNofloatThe upper limit of the normal range of the Measurement. The upper range is assumed to be of the same unit of measure as the Measurement value.
    provider_idNointegerA foreign key to the provider in the PROVIDER table who was responsible for initiating or obtaining the measurement.
    visit_occurrence_idNointegerA foreign key to the Visit in the VISIT_OCCURRENCE table during which the Measurement was recorded.
    measurement_source_valueNovarchar(50)The Measurement name as it appears in the source data. This code is mapped to a Standard Concept in the Standardized Vocabularies and the original code is stored here for reference.
    measurement_source_concept_idNointegerA foreign key to a Concept in the Standard Vocabularies that refers to the code used in the source.
    unit_source_valueNovarchar(50)The source code for the unit as it appears in the source data. This code is mapped to a standard unit concept in the Standardized Vocabularies and the original code is stored here for reference.
    value_source_valueNovarchar(50)The source value associated with the content of the value_as_number or value_as_concept as stored in the source data.

    Conventions

    *
    Measurements differ from Observations in that they require a standardized test or some other activity to generate a quantitative or qualitative result. For example, LOINC 1755-8 concept_id 3027035 'Albumin [Mass/time] in 24 hour Urine' is the lab test to measure a certain chemical in a urine sample.
    • Even though each Measurement requires to have a result, the fields value_as_number and value_as_concept_id are not mandatory. When the result is missing, the Measurement record might represent an order or the result was not captured in the source data (missing data).
    • Valid Measurement Concepts (measurement_concept_id) belong to the 'Measurement' domain, but could overlap with the 'Observation' domain. This is due to the fact that there is a continuum between systematic examination or testing (Measurement) and a simple determination of fact (Observation).
    • Measurements are stored as attribute value pairs, with the attribute as the Measurement Concept and the value representing the result. The value can be a Concept (stored in value_as_concept), or a numerical value (value_as_number) with a Unit (unit_concept_id).
    • Valid Concepts for the value_as_concept field belong to the 'Meas Value' domain.
    • For some Measurement Concepts, the result is included in the test. For example, SNOMED-CT 71628007 concept_id 4325125 'Abnormal presence of albumin' indicates a Measurement (albumin in sample) and the result (abnormal). In future releases of the Standardized Vocabularies, these will be split into the correct Measurement and result.
    • The operator_concept_id is optionally given for relative Measurements where the precise value is not available but its relation to a certain benchmarking value is. For example, this can be used for minimal detection thresholds of a test.
    • The meaning of Concept 4172703 for '=' is identical to omission of a operator_concept_id value. Since the use of this field is rare, it's important when devising analyses to not to forget testing for the content of this field for values different from =.
    • Valid Concepts for the operator_concept_id field belong to the 'Meas Value Operator' domain.
    • The Unit is optional even if a value_as_number is provided.
    • If reference ranges for upper and lower limit of normal as provided (typically by a laboratory) these are stored in the range_high and range_low fields. Ranges have the same unit as the value_as_number.
    • The Visit during which the observation was made is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
    • The Provider making the observation is recorded through a reference to the PROVIDER table. This information is not always available.

    wanghaisheng commented 9 years ago

    NOTE table

    The NOTE table captures unstructured information that was recorded by a provider about a patient in free text notes on a given date.
    FieldRequiredTypeDescription
    note_idYesintegerA unique identifier for each note.
    person_idYesintegerA foreign key identifier to the Person about whom the Note was recorded. The demographic details of that Person are stored in the PERSON table.
    note_date YesdateThe date the note was recorded.
    note_timeNotimeThe time the note was recorded.
    note_type_concept_idYesintegerA foreign key to the predefined Concept in the Standardized Vocabularies reflecting the type, origin or provenance of the Note.
    note_textYesCLOBThe content of the Note.
    provider_idNointegerA foreign key to the Provider in the PROVIDER table who took the Note.
    note_source_valueNovarchar(50)The source value associated with the origin of the Note, as standardized using the note_tyype_concept_id
    visit_occurrence_idNointegerForeign key to the Visit in the VISIT_OCCURRENCE table when the Note was taken.

    Conventions

    *
    The NOTE table contains free text (in ASCII, or preferably in UTF8 format) taken by a healthcare Provider.
    • The Visit during which the note was written is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
    • The Provider making the note is recorded through a reference to the PROVIDER table. This information is not always available.

                    <!-- wikipage stop -->
                                    </div>
    wanghaisheng commented 9 years ago

    OBSERVATION table

    The OBSERVATION table captures clinical facts about a Person obtained in the context of examination, questioning or a procedure. Any data that cannot be represented by any other domains, such as social and lifestyle facts, medical history, family history, etc. are recorded here.
    FieldRequiredTypeDescription
    observation_idYesintegerA unique identifier for each observation.
    person_idYesintegerA foreign key identifier to the Person about whom the observation was recorded. The demographic details of that Person are stored in the PERSON table.
    observation_concept_idYesintegerA foreign key to the standard observation concept identifier in the Standardized Vocabularies.
    observation_dateYesdateThe date of the observation.
    observation_timeNotimeThe time of the observation.
    observation_type_concept_idYesintegerA foreign key to the predefined concept identifier in the Standardized Vocabularies reflecting the type of the observation.
    value_as_numberNofloatThe observation result stored as a number. This is applicable to observations where the result is expressed as a numeric value.
    value_as_stringNovarchar(60)The observation result stored as a string. This is applicable to observations where the result is expressed as verbatim text.
    value_as_concept_idNoIntegerA foreign key to an observation result stored as a Concept ID. This is applicable to observations where the result can be expressed as a Standard Concept from the Standardized Vocabularies (e.g., positive/negative, present/absent, low/high, etc.).
    qualifier_concept_idNointegerA foreign key to a Standard Concept ID for a qualifier (e.g., severity of drug-drug interaction alert)
    unit_concept_idNointegerA foreign key to a Standard Concept ID of measurement units in the Standardized Vocabularies.
    provider_idNointegerA foreign key to the provider in the PROVIDER table who was responsible for making the observation.
    visit_occurrence_idNointegerA foreign key to the visit in the VISIT_OCCURRENCE table during which the observation was recorded.
    observation_source_valueNovarchar(50)The observation code as it appears in the source data. This code is mapped to a Standard Concept in the Standardized Vocabularies and the original code is, stored here for reference.
    observation_source_concept_idNointegerA foreign key to a Concept that refers to the code used in the source.
    unit_source_valueNovarchar(50)The source code for the unit as it appears in the source data. This code is mapped to a standard unit concept in the Standardized Vocabularies and the original code is, stored here for reference.
    qualifier_source_valueNovarchar(50)The source value associated with a qualifier to characterize the observation

    Conventions

    *
    Observations differ from Measurements in that they do not require a standardized test or some other activity to generate clinical fact. Typical observations are medical history, family history, the stated need for certain treatment, social circumstances, lifestyle choices, healthcare utilization patterns, etc. If the generation clinical facts requires a standardized testing such as lab testing or imaging and leeds to a standardized result, the data item is recorded in the MEASUREMENT table. If the clinical fact observed determines a sign, symptom, diagnosis of a disease or other medical condition, it is recorded in the CONDITION_OCCURRENCE table.
    • Valid Observation Concepts are not enforced to be from any domain. They still should be Standard Concepts, and they typically belong to the 'Observation' or sometimes 'Measurement' domain.
    • Observation can be stored as attribute value pairs, with the attribute as the Observation Concept and the value representing the clinical fact. This fact can be a Concept (stored in value_as_concept), a numerical value (value_as_number) or a verbatim string (value_as_string). Even though Observations do not have an explicit result, the clinical fact can be stated separately from the type of Observation in the value_as_ fields.
    • It is recommended for observations that are suggestive statements of positive assertion should have a value of 'Yes' (concept_id=4188539), recorded, even though the null value is the equivalent.
    • Valid Concepts of the value_as_concept field are not enforced, but typically belong to the 'Meas Value' domain.
    • For numerical facts a Unit can be provided in the unit_concept_id.
    • For facts represented as Concepts no domain membership is enforced.
    • The qualifier_concept_id field contains all attributes specifying the clinical fact further, such as as degrees, severities, drug-drug interaction alerts etc.
    • The Visit during which the observation was made is recorded through a reference to the VISIT_OCCURRENCE table. This information is not always available.
    • The Provider making the observation is recorded through a reference to the PROVIDER table. This information is not always available.
    wanghaisheng commented 9 years ago

    FACT_RELATIONSHIP table

    The FACT_RELATIONSHIP table contains records about the relationships between facts stored as records in any table of the CDM. Relationships can be defined between facts from the same domain (table), or different domains. Examples of Fact Relationships include: Person relationships (parent-child), care site relationships (hierarchical organizational structure of facilities within a health system), indication relationship (between drug exposures and associated conditions), usage relationships (of devices during the course of an associated procedure), or facts derived from one another (measurements derived from an associated specimen).
    FieldRequiredTypeDescription
    domain_concept _id_1YesintegerThe concept representing the domain of fact one, from which the corresponding table can be inferred.
    fact_id_1YesintegerThe unique identifier in the table corresponding to the domain of fact one.
    domain_concept_id_2YesintegerThe concept representing the domain of fact two, from which the corresponding table can be inferred.
    fact_id_2YesintegerThe unique identifier in the table corresponding to the domain of fact two.
    relationship_concept_id YesintegerA foreign key to a Standard Concept ID of relationship in the Standardized Vocabularies.

    Conventions

    *
    All relationships are directional, and each relationship is represented twice symmetrically within the FACT_RELATIONSHIP table. For example, two persons if person_id = 1 is the mother of person_id = 2 two records are in the FACT_RELATIONSHIP table (all strings in fact concept_id records in the Concept table:
    *   <div class="li"> Person, 1, Person, 2, parent of</div>
    
        *   <div class="li"> Person, 2, Person, 1, child of</div>
    wanghaisheng commented 9 years ago

    第四类 Standardized Health System Data Tables

    wanghaisheng commented 9 years ago

    LOCATION table

    The LOCATION table represents a generic way to capture physical location or address informationof Persons and Care Sites.
    FieldRequiredTypeDescription
    location_idYesintegerA unique identifier for each geographic location.
    address_1Novarchar(50)The address field 1, typically used for the street address, as it appears in the source data.
    address_2Novarchar(50)The address field 2, typically used for additional detail such as buildings, suites, floors, as it appears in the source data.
    city Novarchar(50)The city field as it appears in the source data.
    stateNovarchar(2)The state field as it appears in the source data.
    zipNovarchar(9)The zip or postal code.
    countyNovarchar(20)The county.
    location_source_valueNovarchar(50)The verbatim information that is used to uniquely identify the location as it appears in the source data.

    Conventions

    *
    Each address or Location is unique and is present only once in the table.
    • Locations do not contain names, such as the name of a hospital. In order to construct a full address that can be used in the postal service, the address information from the Location needs to be combined with information from the Care Site. The PERSON table does not contain name information at all.
    • All fields in the Location tables contain the verbatim data in the source, no mapping or normalization takes place. None of the fields are mandatory. If the source data have no Location information at all, all Locations are represented by a single record. Typically, source data contain full or partial zip or postal codes or county or census district information.
    • Zip codes are handled as strings of up to 9 characters length. For US addresses, these represent either a 3-digit abbreviated Zip code as provided by many sources for patient protection reasons, the full 5-digit Zip or the 9-digit (ZIP + 4) codes. Unless for specific reasons analytical methods should expect and utilize only the first 3 digits. For international addresses, different rules apply.
    • The county information can be provided and is not redundant with information from the zip codes as not all of these have an unambiguous county designation.
    • No country information is expected as source data are always collected within a single country.

    wanghaisheng commented 9 years ago

    CARE_SITE table

    The CARE_SITE table contains a list of uniquely identified institutional (physical or organizational) units where healthcare delivery is practiced (offices, wards, hospitals, clinics, etc.).
    FieldRequiredTypeDescription
    care_site_idYesintegerA unique identifier for each Care Site.
    care_site_nameNovarchar(255)The description or name of the Care Site
    place_of_service_concept_idNointegerA foreign key that refers to a Place of Service Concept ID in the Standardized Vocabularies.
    location_idNointegerA foreign key to the geographic Location of the Care Site in the LOCATION table, where the detailed address information is stored.
    care_site_source_valueNovarchar(50)The identifier for the Care Site in the source data, stored here for reference.
    place_of_service_source_valueNovarchar(50)The source code for the Place of Service as it appears in the source data, stored here for reference.

    Conventions

    *
    Many sources do not make a distinction between individual and institutional providers. The CARE_SITE table contains the institutional providers.
    • If the source, instead of uniquely identifying individual Care Sites, only provides limited information such as Place of Service, generic or “pooled” Care Site records are listed in the CARE_SITE table.
    • There are hierarchical and business relationships between Care Sites. For example,wards can belong to clinics or departments, which can in turn belong to hospitals, which in turn can belong to hospital systems, which in turn can belong to HMOs.
    • The relationships between Care Sites are defined in the FACT_RELATIONSHIP table.
    • The Care Site Source Value typically contains the name of the Care Site.
    • The Place of Service Concepts belongs to the Domain 'Place of Service'.

    wanghaisheng commented 9 years ago

    PROVIDER table

    The PROVIDER table contains a list of uniquely identified healthcare providers. These are individuals providing hands-on healthcare to patients, such as physicians, nurses, midwives, physical therapists etc.
    FieldRequiredTypeDescription
    provider_idYesintegerA unique identifier for each Provider.
    provider_nameNovarchar(50)A description of the Provider.
    npiNovarchar(20)The National Provider Identifier (NPI) of the provider.
    deaNovarchar(20)The Drug Enforcement Administration (DEA) number of the provider.
    specialty_concept_idNointegerA foreign key to a Standard Specialty Concept ID in the Standardized Vocabularies.
    care_site_idNointegerA foreign key to the main Care Site where the provider is practicing.
    year_of_birthNointegerThe year of birth of the Provider.
    gender_concept_idNointegerThe gender of the Provider.
    provider_source_valueNovarchar(50)The identifier used for the Provider in the source data, stored here for reference.
    specialty_source_valueNovarchar(50)The source code for the Provider specialty as it appears in the source data, stored here for reference.
    specialty_source_concept_idNointegerA foreign key to a Concept that refers to the code used in the source.
    gender_source_valueNovarchar(50)The gender code for the Provider as it appears in the source data, stored here for reference.
    gender_source_concept_idNointegerA foreign key to a Concept that refers to the code used in the source.

    Conventions

    *
    Many sources do not make a distinction between individual and institutional providers. The PROVIDER table contains the individual providers.
    • If the source, instead of uniquely identifying individual providers, only provides limited information such as specialty, generic or “pooled” Provider records are listed in the PROVIDER table.
    • A single Provider cannot be listed twice (be duplicated) in the table. If a Provider has more than one Specialty, the main or most often exerted specialty should be recorded.
    • Valid Specialty Concepts belong to the 'Specialty' domain.
    • The care_site_id represent a fixed relationship between a Provider and her main Care Site. Providers are also linked to Care Sites through Condition, Procedure and Visit records.

    wanghaisheng commented 9 years ago

    第五类 Standardized Health Economics Data Tables

    wanghaisheng commented 9 years ago

    PAYER_PLAN_PERIOD table

    The PAYER_PLAN_PERIOD table captures details of the period of time that a Person is continuously enrolled under a specific health Plan benefit structure from a given Payer. Each Person receiving healthcare is typically covered by a health benefit plan, which pays for (fully or partially), or directly provides, the care. These benefit plans are provided by payers, such as health insurances or state or government agencies. In each plan the details of the health benefits are defined for the Person or her family, and the health benefit Plan might change over time typically with increasing utilization (reaching certain cost thresholds such as deductibles), plan availability and purchasing choices of the Person. The unique combinations of Payer organizations, health benefit Plans and time periods in which they are valid for a Person are recorded in this table.
    FieldRequiredTypeDescription
    payer_plan_period_idYesintegerA identifier for each unique combination of payer, plan, family code and time span.
    person_idYesintegerA foreign key identifier to the Person covered by the payer. The demographic details of that Person are stored in the PERSON table.
    payer_plan_period_start_dateYesdateThe start date of the payer plan period.
    payer_plan_period_end_dateYesdateThe end date of the payer plan period.
    payer_source_valueNovarchar(50)The source code for the payer as it appears in the source data.
    plan_source_valueNovarchar(50)The source code for the Person's health benefit plan as it appears in the source data.
    family_source_valueNovarchar(50)The source code for the Person's family as it appears in the source data.

    Conventions

    *
    Different Payers have different designs for their health benefit Plans. The PAYER_PLAN_PERIOD table does not capture all details of the plan design or the relationship between Plans or the cost of healthcare triggering a change from one Plan to another. However, it allows identifying the unique combination of Payer (insurer), Plan (determining healthcare benefits and limits) and Person. Typically, depending on healthcare utilization, a Person may have one or many subsequent Plans during coverage by a single Payer.
    • Payer or Plan information is not normalized or recorded as part of the Standard Vocabularies. Instead, they are referred to only by their source_value.
    • Typically, family members are covered under the same Plan as the Person. In those cases, the payer_source_value, plan_source_value and family_source_value are identical.

    wanghaisheng commented 9 years ago

    VISIT_COST table

    The VISIT_COST table captures the cost of a Visit of a Person not itemized to specific procedures, drugs, or devices used during the Visit.
    FieldRequiredTypeDescription
    visit_cost_idYesintegerA unique identifier for each procedure cost record.
    visit_occurrence_idYesintegerA foreign key identifier to the procedure record for which cost data are recorded.
    currency_concept_idNointegerA concept representing the 3-letter code used to delineate international currencies, such as USD for US Dollar.
    paid_copayNofloatThe amount paid by the Person as a fixed contribution to the expenses. Copay does not contribute to the out_of_pocket expenses.
    paid_coinsuranceNofloatThe amount paid by the Person as a joint assumption of risk. Typically, this is a percentage of the expenses defined by the Health benefit Plan after the person's deductible is exceeded.
    paid_toward_ deductibleNofloatThe amount paid by the Person that is counted toward the deductible defined by the health benefit Plan.
    paid_by_payerNofloatThe amount paid by the Payer (insurer). If there is more than one Payer, several VISIT_COST records indicate that fact.
    paid_by_coordination_benefitsNofloatThe amount paid by a secondary Payer through the coordination of benefits.
    total_out_of_pocketNofloatThe total amount paid by the Person as a share of the expenses, excluding the copay.
    total_paidNofloatThe total amount paid for the expenses of the procedure.
    payer_plan_period_idNointegerA foreign key to the PAYER_PLAN_PERIOD table, where the details of the Payer, Plan and Family are stored.

    Conventions

    *
    The cost of the Visit may contain just board and food, but could also include the entire cost of everything that was happening to the patient during the Visit.
    • All other conventions apply as in the PROCEDURE_COST table.

    wanghaisheng commented 9 years ago

    PROCEDURE_COST table

    The PROCEDURE_COST table captures the cost of a Procedure performed on a Person. The information about the cost is only derived from the amount of money paid for the Procedure. This is in contrast to the Drug Cost data which also contain information about true amount charged by the distributor. In addition, Revenue codes are captured.
    FieldRequiredTypeDescription
    procedure_cost_idYesintegerA unique identifier for each procedure cost record.
    procedure_occurrence_idYesintegerA foreign key identifier to the procedure record for which cost data are recorded.
    currency_concept_idNointegerA concept representing the 3-letter code used to delineate international currencies, such as USD for US Dollar.
    paid_copayNofloatThe amount paid by the Person as a fixed contribution to the expenses. Copay does not contribute to the out_of_pocket expenses.
    paid_coinsuranceNofloatThe amount paid by the Person as a joint assumption of risk. Typically, this is a percentage of the expenses defined by the health benefit Plan after the Person's deductible is exceeded.
    paid_toward_deductibleNofloatThe amount paid by the Person that is counted toward the deductible defined by the health benefit Plan.
    paid_by_payerNofloatThe amount paid by the Payer. If there is more than one Payer, several PROCEDURE_COST records indicate that fact.
    paid_by_coordination_benefitsNofloatThe amount paid by a secondary Payer through the coordination of benefits.
    total_out_of_pocketNofloatThe total amount paid by the Person as a share of the expenses
    total_paidNofloatThe total amount paid for the expenses of the Procedure.
    revenue_code_concept_idNointegerA foreign key referring to a Standard Concept ID in the Standardized Vocabularies for Revenue codes.
    payer_plan_period_idNointegerA foreign key to the PAYER_PLAN_PERIOD table, where the details of the payer, plan and family are stored.
    revenue_code_source_valueNovarchar(50)The source code for the Revenue code as it appears in the source data, stored here for reference.

    Conventions

    *
    Each Procedure Occurrence may have any number of corresponding records in the PROCEDURE_COST table, but often it is none (cost data not captured) or one (one payment per Procedure). They are linked directly through the Procedure Occurrence ID field.
    • If Procedures payments are bundled and the cost of such a bundle might be represented in only one of the component Procedures. The FACT_RELATIONSHIP table contains the relationship between the charged Procedure and the Procedure belonging to the bundle.
    • The amounts paid are:
      • Copay – a fixed amount to be paid by the Person
        • Coinsurance – a relative amount of the total paid by the Person
        • Deductible – an amount of money paid by the Person before the Payer starts contributing
        • Primary Payer – the amount the primary Payer pays towards the total
        • Coordination of Benefits – the amount a secondary Payer or Family Plan pays towards the total
        • Out of Pocket = Copay + Coinsurance + Deductible
        • Total – the total amount paid for the Procedure
    • The amounts in various payment components should equal the total, so Copay + Coinsurance + Deductible + Primary Payer + COB = Total Paid. In reality, this is not always reflected in the source data. It is up to the ETL to determine how to deal with quality problems in the data.
    • The revenue_code_concept_id determines what service within a provider is charging for the service

    wanghaisheng commented 9 years ago

    DRUG_COST table

    The DRUG_COST table captures records containing the cost of a Drug Exposure. The information about the cost is defined by the amount of money paid by the Person and Payer for the Drug, as well as the charged cost of the Drug. In addition, a reference to the health plan information in the PAYER_PLAN_PERIOD table is stored in the record that is responsible for the determination of the cost as well as some of the Payments.
    FieldRequiredTypeDescription
    drug_cost_idYesintegerA unique identifier for each DRUG_COST record.
    drug_exposure_idYesintegerA foreign key identifier to the Drug record for which cost data are recorded.
    currency_concept_idNointegerA concept representing the 3-letter code used to delineate international currencies, such as USD for US Dollar.
    paid_copayNofloatThe amount paid by the Person as a fixed contribution to the expenses. Copay does not contribute to the out of pocket expenses.
    paid_coinsuranceNofloatThe amount paid by the Person as a joint assumption of risk. Typically, this is a percentage of the expenses defined by the Payer Plan after the Person's deductible is exceeded.
    paid_toward_deductibleNofloatThe amount paid by the Person that is counted toward the deductible defined by the Payer Plan.
    paid_by_payerNofloatThe amount paid by the Payer. If there is more than one Payer, several DRUG_COST records indicate that fact.
    paid_by_coordination_benefitsNofloatThe amount paid by a secondary Payer through the coordination of benefits.
    total_out_of_pocketNofloatThe total amount paid by the Person as a share of the expenses.
    total_paidNofloatThe total amount paid for the expenses of drug exposure.
    ingredient_costNofloatThe portion of the drug expenses due to the cost charged by the manufacturer for the drug, typically a percentage of the Average Wholesale Price.
    dispensing_feeNofloatThe portion of the drug expenses due to the dispensing fee charged by the pharmacy, typically a fixed amount.
    average_wholesale_priceNofloatList price of a Drug set by the manufacturer.
    payer_plan_period_idNointegerA foreign key to the PAYER_PLAN_PERIOD table, where the details of the Payer, Plan and Family are stored.

    Conventions

    *
    Each Drug Exposure may have any number of corresponding records in the DRUG_COST table, but usually it is none (no cost data recorded) or one. They are linked directly through the drug_exposure_id field.
    • The amounts paid are:
      • Copay – a fixed amount to be paid by the Person
        • Coinsurance – a relative amount of the total paid by the Person
        • Deductible – an amount of money paid by the Person before the Payer starts contributing
        • Primary Payer – the amount the primary Payer pays towards the total
        • Coordination of Benefits – the amount a secondary Payer or Family Plan pays towards the total
        • Out of Pocket = Copay + Coinsurance + Deductible
        • Total – the total amount paid for the Drug Exposure
    • Drug costs are:
      • Ingredient Cost – the amount charged by the wholesale distributor or manufacturer
        • Dispensing Fee – the amount charged by the pharmacy
        • Sales Tax. This is usually very small and typically not provided by most source data, and therefore not included in the CDM
    • The amount paid should equal the cost, so Copay + Coinsurance + Deductible + Primary Payer + Coordination of Benefits = Total Paid = Ingredient Cost + Dispensing Fee. In reality, this is not always reflected in the source data. It is up to the ETL to determine how to deal with quality problems in the data.
    • The Average Wholesale Price is the list price of the drug, but not the price actually charged or paid.

    wanghaisheng commented 9 years ago

    DEVICE_COST table

    The DEVICE_COST table captures the cost of a medical Device or supply used on a Person. The information about the cost is only derived from the amounts paid for the device.
    FieldRequiredTypeDescription
    device_cost_idYesintegerA unique identifier for each DEVICE_COST record.
    device_exposure_ idYesintegerA foreign key identifier to the DEVICE_EXPOSURE record for which cost data are recorded.
    currency_concept_idNointegerA concept representing the 3-letter code used to delineate international currencies, such as USD for US Dollar.
    paid_copayNofloatThe amount paid by the Person as a fixed contribution to the expenses. Copay does not contribute to the out_of_pocket expenses.
    paid_coinsuranceNofloatThe amount paid by the Person as a joint assumption of risk. Typically, this is a percentage of the expenses defined by the Payer Plan after the person's deductible is exceeded.
    paid_toward_ deductibleNofloatThe amount paid by the Person that is counted toward the deductible defined by the Payer Plan.
    paid_by_payerNofloatThe amount paid by the Payer. If there is more than one payer, several procedure_cost records indicate that fact.
    paid_by_coordination_benefitsNofloatThe amount paid by a secondary payer through the coordination of benefits.
    total_out_of_pocketNofloatThe total amount paid by the Person as a share of the expenses, excluding the copay.
    total_paidNofloatThe total amount paid for the expenses of the procedure.
    payer_plan_period_idNointegerA foreign key to the payer_plan_period table, where the details of the payer, plan and family are stored.

    Conventions

    *
    If the Device is derived from a Procedure record, all conventions apply to the field’s equivalent to the procedure_cost (see above).
    wanghaisheng commented 9 years ago

    第六类:Standardized Derived Elements

    wanghaisheng commented 9 years ago

    COHORT table

    The COHORT table contains records of subjects that satisfy a given set of criteria for a duration of time. The definition of the cohort is contained within the COHORT_DEFINITION table. Cohorts can be constructed of patients (Persons), Providers or Visits.
    FieldRequiredTypeDescription
    cohort_definition_idYesintegerA foreign key to a record in the COHORT_DEFINITION table containing relevant Cohort Definition information.
    subject_idYesintegerA foreign key to the subject in the cohort. These could be referring to records in the PERSON, PROVIDER, VISIT_OCCURRENCE table.
    cohort_start_dateYesdateThe date when the Cohort Definition criteria for the Person, Provider or Visit first match.
    cohort_end_dateYesdateThe date when the Cohort Definition criteria for the Person, Provider or Visit no longer match or the Cohort membership was terminated.

    Conventions

    *
    The core of a Cohort is the unifying definition or feature of the Cohort. This is captured in the cohort_definition_id. For example, Cohorts can include patients diagnosed with a specific condition, patients exposed to a particular drug, or Providers who have performed a specific Procedure.
    • Cohort records must have a Start Date
    • Cohort records must have an End Date, but may be set to Start Date or could have applied a censored date using the Observation Period Start Date.
    • Cohort records must contain a Subject Id, which can refer to the Person, Provider, or Visit record. The Cohort Definition will define the type of subject through the subject concept id.

    wanghaisheng commented 9 years ago

    COHORT_ATTRIBUTE table

    The COHORT_ATTRIBUTE table contains attributes associated with each subject within a cohort, as defined by a given set of criteria for a duration of time. The definition of the Cohort Attribute is contained in the ATTRIBUTE_DEFINITION table.
    FieldRequiredTypeDescription
    cohort_definition_idYesintegerA foreign key to a record in the [COHORT_DEFINITION](/web/wiki/doku.php?id=documentation:cdm:cohort_definition "documentation:cdm:cohort_definition") table containing relevant Cohort Definition information.
    subject_idYesintegerA foreign key to the subject in the Cohort. These could be referring to records in the PERSON, PROVIDER, VISIT_OCCURRENCE table.
    cohort_start_dateYesdateThe date when the Cohort Definition criteria for the Person, Provider or Visit first match.
    cohort_end_dateYesdateThe date when the Cohort Definition criteria for the Person, Provider or Visit no longer match or the Cohort membership was terminated.
    attribute_definition_idYesintegerA foreign key to a record in the [ATTRIBUTE_DEFINITION](/web/wiki/doku.php?id=documentation:cdm:attribute_definition "documentation:cdm:attribute_definition") table containing relevant Attribute Definition information.
    value_as_numberNofloatThe attribute result stored as a number. This is applicable to attributes where the result is expressed as a numeric value.
    value_as_concept_idNointegerThe attribute result stored as a Concept ID. This is applicable to attributes where the result is expressed as a categorical value.

    Conventions

    *
    Each record in the COHORT_ATTRIBUTE table is linked to a specific record in the COHORT table, identified by matching cohort_definition_id, subject_id, cohort_start_date and cohort_end_date fields.
    • It adds to the Cohort records calculated co-variates (for example age, BMI) or composite scales (for example Charleson index).
    • The unifying definition or feature of the Cohort Attribute is captured in the attribute_definition_id referring to a record in the ATTRIBUTE_DEFINITION table.
    • The actual result or value of the Cohort Attribute (co-variate, index value) is captured in the value_as_number (if the value is numberic) or the value_as_concept_id (if the value is a concept) fields.

    wanghaisheng commented 9 years ago

    DRUG_ERA table

    A Drug Era is defined as a span of time when the Person is assumed to be exposed to a particular active ingredient. A Drug Era is not the same as a Drug Exposure: Exposures are individual records corresponding to the source when Drug was delivered to the Person, while successive periods of Drug Exposures are combined under certain rules to produce continuous Drug Eras.
    FieldRequiredTypeDescription
    drug_era_idYesintegerA unique identifier for each Drug Era.
    person_idYesintegerA foreign key identifier to the Person who is subjected to the Drug during the fDrug Era. The demographic details of that Person are stored in the PERSON table.
    drug_concept_idYesintegerA foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the Ingredient Concept.
    drug_era_start_dateYesdateThe start date for the Drug Era constructed from the individual instances of Drug Exposures. It is the start date of the very first chronologically recorded instance of conutilization of a Drug.
    drug_era_end_dateYesdateThe end date for the drug era constructed from the individual instance of drug exposures. It is the end date of the final continuously recorded instance of utilization of a drug.
    drug_exposure_countNointegerThe number of individual Drug Exposure occurrences used to construct the Drug Era.
    gap_daysNointegerThe number of observed days of gap between Drug Exposure records which was tolerated in the building of the Drug Era record.

    Conventions

    *
    Drug Eras are derived from records in the DRUG_EXPOSURE table using a standardized algorithm.
    • Each Drug Era corresponds to one or many Drug Exposures that form a continuous interval and contain the same Drug Ingredient (active compound).
    • The drug_concept_id field only contains Concepts that have the concept_class 'Ingredient'. The Ingredient is derived from the Drugs in the DRUG_EXPOSURE table that are aggregated into the Drug Era record.
    • The Drug Era Start Date is the start date of the first Drug Exposure.
    • The Drug Era End Date is the end date of the last Drug Exposure. The End Date of each Drug Exposure is either taken from the field Drug Exposure End Date or, as it is typically not available, inferred using the following rules:
      • For pharmacy prescriptions claims, the date when the drug was dispensed and the number of days of supply are used to extrapolate the End Date for the Drug Exposure.
        • For Procedure Drugs, usually the drug is administered on a single date (i.e., the administration date).
    • The Gap Days determine how much drug-free days are tolerated between two Drug Exposures, without interrupting a Drug Era. Typically, this is set to 30, but could be different for Drugs that are very acute or chronic in their administration or for whom the daily dosing is variable.

    wanghaisheng commented 9 years ago

    DOSE_ERA table

    A Dose Era is defined as a span of time when the Person is assumed to be exposed to a constant dose of a specific active ingredient.
    FieldRequiredTypeDescription
    dose_era_idYesintegerA unique identifier for each Dose Era.
    person_idYesintegerA foreign key identifier to the Person who is subjected to the drug during the drug era. The demographic details of that Person are stored in the PERSON table.
    drug_concept_idYesintegerA foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the active Ingredient Concept.
    unit_concept_idYesintegerA foreign key that refers to a Standard Concept identifier in the Standardized Vocabularies for the unit concept.
    dose_valueYesfloatThe numeric value of the dose.
    dose_era_start_dateYesdateThe start date for the drug era constructed from the individual instances of drug exposures. It is the start date of the very first chronologically recorded instance of utilization of a drug.
    dose_era_end_dateYesdateThe end date for the drug era constructed from the individual instance of drug exposures. It is the end date of the final continuously recorded instance of utilization of a drug.

    Conventions

    *
    Dose Eras will be derived from records in the DRUG_EXPOSURE table and the Dose information from the DRUG_DOSE table using a standardized algorithm.
    • Each Dose Era corresponds to one or many Drug Exposures that form a continuous interval and contain the same Drug Ingredient (active compound) at the same effective dose.
    • Dose Form information is not taken into account. So, if the patient changes between different formuations, or different manufacturers with the same formuation, the Dose Era is still spanning the entire time of exposure to the Ingredient.
    wanghaisheng commented 9 years ago

    CONDITION_ERA table

    A Condition Era is defined as a span of time when the Person is assumed to have a given condition. Similar to Drug Eras, Condition Eras are chronological periods of Condition Occurrence. Combining individual Condition Occurrences into a single Condition Era serves two purposes: *
    It allows aggregation of chronic conditions that require frequent ongoing care, instead of treating each Condition Occurrence as an independent event.
    • It allows aggregation of multiple, closely timed doctor visits for the same Condition to avoid double-counting the Condition Occurrences.

      For example, consider a Person who visits her Primary Care Physician (PCP) and who is referred to a specialist. At a later time, the Person visits the specialist, who confirms the PCP’s original diagnosis and provides the appropriate treatment to resolve the condition. These two independent doctor visits should be aggregated into one Condition Era.

      FieldRequiredTypeDescription
      condition_era_idYesintegerA unique identifier for each Condition Era.
      person_idYesintegerA foreign key identifier to the Person who is experiencing the Condition during the Condition Era. The demographic details of that Person are stored in the PERSON table.
      condition_concept_idYesintegerA foreign key that refers to a standard Condition Concept identifier in the Standardized Vocabularies.
      condition_era_start_dateYesdateThe start date for the Condition Era constructed from the individual instances of Condition Occurrences. It is the start date of the very first chronologically recorded instance of the condition.
      condition_era_end_dateYesdateThe end date for the Condition Era constructed from the individual instances of Condition Occurrences. It is the end date of the final continuously recorded instance of the Condition.
      condition_occurrence_countNointegerThe number of individual Condition Occurrences used to construct the condition era.

    Conventions

    *
    Condition Era records will be derived from the records in the CONDITION_OCCURRENCE table using a standardized algorithm.
    • Each Condition Era corresponds to one or many Condition Occurrence records that form a continuous interval.

      The condition_concept_id field contains Concepts that are identical to those of the CONDITION_OCCURRENCE table records that make up the Condition Era. In contrast to Drug Eras, Condition Eras are not aggregated to contain Conditions of different hierarchical layers. The Condition Era Start Date is the start date of the first Condition Occurrence. The Condition Era End Date is the end date of the last Condition Occurrence.