OHDSI / Arachne

Arachne Data Node web application
Apache License 2.0

Metadata format proposal #25

Open konstjar opened 5 months ago

konstjar commented 5 months ago

The idea is to include additional metadata in the analysis archive and in the Strategus JSON file, which will be used to prepopulate the Submission form fields in ARACHNE Data Node:

Dedicated file in .zip archive

File name: analysisMetadata.json

Content:

{
    "analysisName": "Simvastatin",
    "analysisType": "COHORT",
    "runtimeEnvironmentName": "Default Runtime",
    "dockerRuntimeEnvironmentImage": "ohdsi/r-hades:latest",
    "entryPoint": "TestCDMConnector/main.R",
    "studyName": "My study"
}
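
To illustrate how a consumer might pick this file up, here is a minimal Python sketch. It assumes the file sits at the root of the .zip archive and that a missing file simply means "no prepopulation"; the function name `read_analysis_metadata` is hypothetical and is not part of ARACHNE.

```python
import io
import json
import zipfile
from typing import Optional

# File name taken from the proposal; its location at the archive root is an assumption.
METADATA_FILE = "analysisMetadata.json"


def read_analysis_metadata(archive) -> Optional[dict]:
    """Return the parsed metadata, or None if the archive has no metadata file.

    `archive` may be a path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(archive) as zf:
        if METADATA_FILE not in zf.namelist():
            return None
        with zf.open(METADATA_FILE) as fh:
            return json.load(fh)
```

Returning `None` rather than raising keeps archives without the file usable, so the Submission form could fall back to manual input in that case.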

Strategus format extension

Content:

{
  "metadata": {
      "analysisName": "Simvastatin",
      "analysisType": "COHORT",
      "runtimeEnvironmentName": "Default Runtime",
      "dockerRuntimeEnvironmentImage": "ohdsi/r-hades:latest",
      "studyName": "My study"
  }
}
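
As a rough sketch of how the Strategus variant could feed the form, the Python snippet below pulls the known fields out of the `metadata` object. Treating every field as optional and ignoring unknown keys are assumptions, and `form_defaults_from_strategus` is a hypothetical name, not an existing ARACHNE or Strategus function.

```python
import json

# Field names taken from the proposal; treating all of them as optional is an assumption.
METADATA_FIELDS = (
    "analysisName",
    "analysisType",
    "runtimeEnvironmentName",
    "dockerRuntimeEnvironmentImage",
    "studyName",
)


def form_defaults_from_strategus(spec_text: str) -> dict:
    """Extract form defaults from the top-level "metadata" object (hypothetical helper)."""
    spec = json.loads(spec_text)
    metadata = spec.get("metadata") or {}
    # Keep only the fields named in the proposal; anything else is ignored.
    return {key: metadata[key] for key in METADATA_FIELDS if key in metadata}
```

A specification without a `metadata` object yields an empty dict, so existing Strategus files would keep working unchanged.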

ablack3 commented 5 months ago

For Darwin we absolutely need the option to provide this metadata along with the study. We do not want the data partner to change the execution environment in the UI. We essentially want the data partner to upload the study zip file, select the CDM database, and click run. They should not need to provide the entry point file (because how would they know this?) or the execution environment. We want to provide as many details about study execution in the metadata as possible, so the data partner just runs the study and that's it.

This should work just fine for Darwin, I think. All Darwin studies (at least right now) will be of the CUSTOM type.

{
    "analysisName": "Simvastatin",
    "analysisType": "CUSTOM",
    "runtimeEnvironmentName": "Default Runtime",
    "dockerRuntimeEnvironmentImage": "ohdsi/r-hades:latest",
    "entryPoint": "TestCDMConnector/main.R",
    "studyName": "My study"
}

One question: is there a dependency between the two parameters runtimeEnvironmentName and dockerRuntimeEnvironmentImage?

How do these parameters interact with each other?

Also, for Darwin we have both a study name and a study ID. Not sure whether others need both of these as well.

Anyway, it should basically work well, I think.