srdc / tofhir

Mapping toolset to migrate/transform existing datasets to HL7 FHIR
Apache License 2.0

Refactor file format to content type #226

Closed: Okanmercan99 closed this issue 2 months ago

Okanmercan99 commented 2 months ago

[image]

For this update, rename the fileFormat field to contentType. Additionally, add the contentType field to every source that is a FileSystemSource.

Okanmercan99 commented 2 months ago

[image]

Also add the contentType field to every sinkSettings entry that is a FileSystemSinkSettings.
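For anyone updating existing job definitions by hand, the rename requested above could be scripted roughly as follows. This is only a sketch and not part of toFHIR: the `migrate` helper is hypothetical, it assumes the definitions are plain JSON in which sources and sinks are objects carrying a `jsonClass` and a `path` field, and filling a missing contentType from the file extension is an assumption of this example rather than something the issue prescribes.

```python
import json
from pathlib import PurePosixPath

# jsonClass values named in this issue; everything else is left untouched.
FILE_SYSTEM_CLASSES = {"FileSystemSource", "FileSystemSinkSettings"}


def migrate(node):
    """Return a copy of `node` with fileFormat renamed to contentType
    on every file-system source or sink object found in the tree."""
    if isinstance(node, list):
        return [migrate(item) for item in node]
    if not isinstance(node, dict):
        return node

    migrated = {key: migrate(value) for key, value in node.items()}
    if migrated.get("jsonClass") in FILE_SYSTEM_CLASSES:
        if "fileFormat" in migrated:
            # Rename the old field; an existing contentType takes precedence.
            old_value = migrated.pop("fileFormat")
            migrated.setdefault("contentType", old_value)
        elif "contentType" not in migrated:
            # Assumption: derive the value from the extension of `path`.
            # Paths without an extension (e.g. directories) are left as-is.
            suffix = PurePosixPath(migrated.get("path", "")).suffix.lstrip(".")
            if suffix:
                migrated["contentType"] = suffix.lower()
    return migrated


if __name__ == "__main__":
    job = {
        "source": {
            "jsonClass": "FileSystemSource",
            "path": "patients.csv",
            "fileFormat": "csv",
        }
    }
    print(json.dumps(migrate(job), indent=2))
```

The function returns a new structure instead of editing in place, so the original definition can be kept for comparison before anything is written back to disk.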

dogukan10 commented 2 months ago

I believe that contentType should be optional. In most cases, such as the example below, specifying contentType is redundant because the file extension (e.g., .csv) already indicates the content type:

"sourceBinding": {
  "source": {
    "jsonClass": "FileSystemSource",
    "path": "patients.csv",
    "contentType": "csv",
    "sourceRef": "pilot1-source"
  }
}

The use of contentType should be limited to scenarios where a file extension can represent multiple content types. For instance, a .txt file could contain either CSV or NDJSON data, so in that case, contentType would clarify the format (e.g., csv or ndjson) while reading the file.
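To make the proposal concrete, here is a minimal sketch of the fallback I have in mind, written in Python purely for illustration. The `resolve_content_type` function and the `EXTENSION_TO_CONTENT_TYPE` table are hypothetical names, not existing toFHIR code, and the table only lists the two formats mentioned in this thread.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical table of extensions that identify a single content type.
# Only the formats discussed in this thread are listed.
EXTENSION_TO_CONTENT_TYPE = {
    "csv": "csv",
    "ndjson": "ndjson",
}


def resolve_content_type(path: str, content_type: Optional[str] = None) -> str:
    """Prefer an explicit contentType; otherwise infer it from the extension."""
    if content_type:
        # An explicit value always wins, e.g. a .txt file holding NDJSON.
        return content_type.lower()

    extension = PurePosixPath(path).suffix.lstrip(".").lower()
    if extension in EXTENSION_TO_CONTENT_TYPE:
        return EXTENSION_TO_CONTENT_TYPE[extension]

    # Ambiguous or unknown extension: the user has to state the type.
    raise ValueError(
        f"Cannot infer content type from '{path}'; please set contentType explicitly."
    )
```

With this behaviour, `resolve_content_type("patients.csv")` resolves to csv without any extra configuration, `resolve_content_type("patients.txt", "ndjson")` honours the explicit value, and `resolve_content_type("patients.txt")` fails with a message asking for contentType.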

WDYT @YemreGurses, shall we make it optional?