wangpinggl / TREQS

Text-to-SQL Generation for Question Answering on Electronic Medical Records
MIT License

Citation

Ping Wang, Tian Shi, and Chandan K. Reddy. "Text-to-SQL Generation for Question Answering on Electronic Medical Records." In Proceedings of The Web Conference 2020 (WWW’20), pp. 350-361, 2020.

@inproceedings{wang2020text,
  title={Text-to-SQL Generation for Question Answering on Electronic Medical Records},
  author={Wang, Ping and Shi, Tian and Reddy, Chandan K},
  booktitle={Proceedings of The Web Conference 2020},
  pages={350--361},
  year={2020}
}

Dataset

MIMICSQL is built on the publicly available, real-world, de-identified Medical Information Mart for Intensive Care III (MIMIC III) dataset. To generate more realistic questions, each patient is randomly assigned a synthetic name, which should not be used to identify any patient.

{
  "key": "a81dae5ff42498734e857c5b7dc46deb",
  "format": {
    "table": [
      0,
      2
    ],
    "cond": [
      [
        0,
        6,
        0,
        "F"
      ],
      [
        2,
        3,
        0,
        "Abdomen artery incision"
      ]
    ],
    "agg_col": [
      [
        0,
        0
      ]
    ],
    "sel": 1
  },
  "question_refine": "how many female patients underwent the procedure of abdomen artery incision?",
  "sql": "SELECT COUNT ( DISTINCT DEMOGRAPHIC.\"SUBJECT_ID\" ) FROM DEMOGRAPHIC INNER JOIN PROCEDURES on DEMOGRAPHIC.HADM_ID = PROCEDURES.HADM_ID WHERE DEMOGRAPHIC.\"GENDER\" = \"F\" AND PROCEDURES.\"SHORT_TITLE\" = \"Abdomen artery incision\"",
  "question_refine_tok": [],
  "sql_tok": []
}
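
A minimal sketch of reading such a record in Python, assuming records are stored as JSON text. The record below is an abbreviated copy of the example above, and the helper `whitespace_tokens` is hypothetical: the source does not say how `question_refine_tok` and `sql_tok` are produced.

```python
import json

# Abbreviated copy of the example record above (the "format" field is omitted).
# A raw string keeps the JSON escapes (\") intact for json.loads.
record_text = r'''
{
  "key": "a81dae5ff42498734e857c5b7dc46deb",
  "question_refine": "how many female patients underwent the procedure of abdomen artery incision?",
  "sql": "SELECT COUNT ( DISTINCT DEMOGRAPHIC.\"SUBJECT_ID\" ) FROM DEMOGRAPHIC INNER JOIN PROCEDURES on DEMOGRAPHIC.HADM_ID = PROCEDURES.HADM_ID WHERE DEMOGRAPHIC.\"GENDER\" = \"F\" AND PROCEDURES.\"SHORT_TITLE\" = \"Abdomen artery incision\"",
  "question_refine_tok": [],
  "sql_tok": []
}
'''


def whitespace_tokens(text):
    """Hypothetical helper: split on whitespace only.

    This is an assumption for illustration, not the dataset's tokenizer.
    """
    return text.split()


record = json.loads(record_text)
question = record["question_refine"]
sql = record["sql"]

print(question)
print(whitespace_tokens(sql)[:4])
```

The question and SQL query are plain strings, so each record can serve directly as a (question, query) pair.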

The meaning of each element is as follows:

Usage

Evaluation

The evaluation code is provided in the evaluation folder. You can evaluate the generated queries with the following steps.
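
As a rough illustration, separate from the repository's own scripts, the two overall metrics can be sketched as follows. This assumes Acc_LF is an exact string match between the generated and reference queries (after collapsing whitespace) and Acc_EX compares the rows each query returns. The toy table, its rows, and the queries are hypothetical, and the function names are not from the repository.

```python
import sqlite3


def normalize(sql):
    """Collapse runs of whitespace so spacing differences are ignored."""
    return " ".join(sql.split())


def logic_form_match(pred, gold):
    """Assumed Acc_LF criterion: the query strings are identical."""
    return normalize(pred) == normalize(gold)


def execution_match(conn, pred, gold):
    """Assumed Acc_EX criterion: both queries return the same rows."""
    try:
        pred_rows = conn.execute(pred).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to run counts as wrong
    gold_rows = conn.execute(gold).fetchall()
    return sorted(pred_rows, key=repr) == sorted(gold_rows, key=repr)


# Hypothetical toy database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DEMOGRAPHIC (SUBJECT_ID INTEGER, GENDER TEXT)")
conn.executemany(
    "INSERT INTO DEMOGRAPHIC VALUES (?, ?)",
    [(1, "F"), (2, "F"), (3, "M")],
)

gold = "SELECT COUNT ( DISTINCT SUBJECT_ID ) FROM DEMOGRAPHIC WHERE GENDER = 'F'"
predictions = [
    # Same query with different spacing: matches on both criteria.
    "SELECT COUNT ( DISTINCT SUBJECT_ID )  FROM DEMOGRAPHIC  WHERE GENDER = 'F'",
    # Different text, same result: matches on execution only.
    "SELECT COUNT ( * ) FROM ( SELECT DISTINCT SUBJECT_ID FROM DEMOGRAPHIC WHERE GENDER = 'F' )",
    # Wrong condition value: matches on neither.
    "SELECT COUNT ( DISTINCT SUBJECT_ID ) FROM DEMOGRAPHIC WHERE GENDER = 'M'",
]

lf_hits = sum(logic_form_match(p, gold) for p in predictions)
ex_hits = sum(execution_match(conn, p, gold) for p in predictions)
print("Acc_LF:", lf_hits / len(predictions))
print("Acc_EX:", ex_hits / len(predictions))
```

Under these assumptions, execution accuracy is at least as high as logic form accuracy, because differently written queries can still return the same result.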

Results

Here we provide the results on the new version of natural language questions provided in mimicsql_data/mimicsql_natual_v2.

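Acc_LF and Acc_EX are the overall evaluation; the remaining columns are the breakdown evaluation.

| Dataset | Acc_LF | Acc_EX | Agg_op | Agg_col | Table | Con_col+op | Con_val | Average |
|---|---|---|---|---|---|---|---|---|
| Testing | 0.482 | 0.611 | 0.993 | 0.970 | 0.954 | 0.857 | 0.630 | 0.881 |
| Testing+recover | 0.547 | 0.690 | 0.992 | 0.969 | 0.953 | 0.863 | 0.729 | 0.901 |
| Development | 0.432 | 0.636 | 0.997 | 0.988 | 0.956 | 0.845 | 0.524 | 0.862 |
| Development+recover | 0.526 | 0.741 | 0.997 | 0.988 | 0.956 | 0.837 | 0.639 | 0.883 |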
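
The Average column appears to be the unweighted mean of the five breakdown scores (Agg_op, Agg_col, Table, Con_col+op, Con_val). This is inferred from the reported numbers rather than stated in the source. A quick check in Python, with the values copied from the results above:

```python
# Each row: Agg_op, Agg_col, Table, Con_col+op, Con_val, then the reported Average.
results = {
    "Testing": [0.993, 0.970, 0.954, 0.857, 0.630, 0.881],
    "Testing+recover": [0.992, 0.969, 0.953, 0.863, 0.729, 0.901],
    "Development": [0.997, 0.988, 0.956, 0.845, 0.524, 0.862],
    "Development+recover": [0.997, 0.988, 0.956, 0.837, 0.639, 0.883],
}

checks = {}
for name, (*breakdown, reported) in results.items():
    mean = sum(breakdown) / len(breakdown)
    # Compare at the three-decimal precision used in the results.
    checks[name] = round(mean, 3) == reported
    print(f"{name}: mean={mean:.4f} reported={reported:.3f} match={checks[name]}")
```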