ResearchSoftwareInstitute / greendatatranslator

Green Team Data Translator Software Engineering and Development
BSD 3-Clause "New" or "Revised" License

Develop smartAPI for ICEES #121

Open stevencox opened 6 years ago

stevencox commented 6 years ago

Design and develop a smartAPI for the EBCR service in support of the overall clinical feature vector hackathon goal.

@karafecho @lstillwe @cbizon

karafecho commented 6 years ago

Security Features:

  1. Clinical feature tables are de-identified after the clinical data are integrated at the patient- and visit-level with several socioenvironmental data sources
  2. Clinical feature variables are binned or recoded
  3. Only aggregated counts of patients or visits are returned to users
  4. HIPAA-defined PHI elements are excluded from the clinical feature tables
  5. An error message is returned if the input feature variables identify a cohort of ≤10 patients (functionality 1)
  6. Users are not informed why the input features are invalid
  7. Cell sizes of ≤10 patients are returned as such (output for functionalities 3 and 4)
  8. Missing data points add “noise” to sample sizes, and the service capitalizes on this to further obscure exact counts
  9. Server is secure and housed at UNC/RENCI
  10. Clinical feature tables are encrypted via SSL
  11. Requests are restricted to ≤10 per second
  12. The text of DUA-like terms and conditions is returned to the machine submitting the request:

"The Translator Data-Driven Clinical Regrouping (DDCR) Service is providing you with Data that have been de-identified in accordance with 45 C.F.R. §§ 164.514(a) and (b) and that UNC Health Care System (UNCHCS) is permitted to provide under 45 C.F.R. § 164.502(d)(2). Recipient agrees to notify UNCHCS via NC TraCS in the event that Recipient receives any identifiable data in error and to take such measures to return the identifiable data and/or destroy it at the direction of UNCHCS.

Restrictions on Recipient’s Use of Data. Recipient further agrees to use the data exclusively for the purposes and functionalities provided by the DDCR Service: cohort discovery; feature-rich cohort discovery; hypothesis-driven queries; and exploratory queries. Recipient agrees to use appropriate safeguards to protect the Data from misuse and unauthorized access or disclosure. Recipient will report to UNCHCS any unauthorized access, use, or disclosure of the Data not provided for by the Service of which Recipient becomes aware. Recipient will not attempt to identify the individuals whose information is contained in any Data transferred pursuant to this Service Agreement or attempt to contact those individuals. Recipient agrees not to sell the Data to any third party for any purpose. Recipient agrees not to disclose or publish the Data in any manner that would identify the Data as originating from UNCHCS. Finally, Recipient agrees to reasonably limit the number of queries to the Service per IP address within a given time interval, in order to prevent rapid ‘attacks’ on the Service."
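A rough sketch of how security features 5, 6, 7, and 11 might look in code, for discussion only. This is not the actual ddcr-api implementation. All names here (`MIN_CELL_SIZE`, `InvalidFeaturesError`, `check_cohort`, `mask_cell`, `RateLimiter`) are hypothetical, and the sketch assumes that "returned as such" in feature 7 means the cell is reported as the string "<=10" rather than as an exact count, and that the request limit in feature 11 is applied per IP address.

```python
import time
from collections import defaultdict, deque

MIN_CELL_SIZE = 10         # assumed threshold from features 5 and 7
MAX_REQUESTS_PER_SEC = 10  # assumed limit from feature 11


class InvalidFeaturesError(Exception):
    """Generic error that deliberately carries no reason (feature 6)."""


def check_cohort(patient_count):
    """Reject cohort definitions that identify <=10 patients (feature 5)."""
    if patient_count <= MIN_CELL_SIZE:
        # The message does not say why the input was rejected.
        raise InvalidFeaturesError("Input features invalid")
    return patient_count


def mask_cell(count):
    """Report small cells as '<=10' instead of the exact count (feature 7)."""
    return "<=10" if count <= MIN_CELL_SIZE else count


class RateLimiter:
    """Sliding-window limiter keyed by client IP (feature 11)."""

    def __init__(self, max_requests=MAX_REQUESTS_PER_SEC, window=1.0):
        self.max_requests = max_requests
        self.window = window
        self._hits = defaultdict(deque)  # client IP -> recent request times

    def allow(self, client_ip, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

In this sketch, `check_cohort` would guard functionality 1, `mask_cell` would be applied to each cell of the output for functionalities 3 and 4, and `RateLimiter.allow` would be called once per incoming request before any query is run.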

karafecho commented 6 years ago

API design specs

karafecho commented 6 years ago

James delivered tables to Hao on 05.16.18

Deidentified integrated clinical feature tables can be found on Rockfish here:

/opt/RENCI/output/ClinicalFeatureVectors/1_0_0/PatientLevel/
/opt/RENCI/output/ClinicalFeatureVectors/1_0_0/VisitLevel/

GitHub repo for Smart API can be found here:

http://github.com/xu-hao/ddcr-api

Notes: (1) Reconsider separation of tables by year, but recognize the need to treat CMAQ data for 2010 and 2011 separately (i.e., independent distributions). (2) Reconsider statistical approach for DDCR Service functionalities 3 and 4.

karafecho commented 6 years ago

Tables moved to ebcr0.edc.renci.org on 5/17/18.

karafecho commented 6 years ago

Public access: ddcr.renci.org

xu-hao commented 6 years ago

Dependencies:

Need expert input on:

(1) Reconsider separation of tables by year, but recognize the need to treat CMAQ data for 2010 and 2011 separately (i.e., independent distributions). (2) Reconsider statistical approach for DDCR Service functionalities 3 and 4.

(Copied from @karafecho's post) @stevencox

xu-hao commented 6 years ago

Based on today's meeting with @karafecho