biothings / pending.api

Set of standalone APIs built with the BioThings SDK for the Translator Project
https://biothings.ncats.io
Apache License 2.0

use semmeddb version number in /metadata #47

Open andrewsu opened 2 years ago

andrewsu commented 2 years ago

Current output from https://biothings.ncats.io/semmeddb/metadata looks like this:

{
  "biothing_type": "association",
  "build_date": "2021-08-31T23:23:50.843857+00:00",
  "build_version": "20210831",
  "src": {
    "semmed_parser": {
      "licence": "CC BY 4.0",
      "stats": {
        "semmed_parser": 114383742
      },
      "version": "2.0",
      "license_url": "https://skr3.nlm.nih.gov/SemMedDB/",
      "url": "https://skr3.nlm.nih.gov/SemMedDB/"
    }
  },
  "stats": {
    "total": 114383742
  }
}

The build_version field appears to hold a date (20210831). Instead, we should use the official SemMedDB version number. For example, according to https://lhncbc.nlm.nih.gov/ii/tools/SemRep_SemMedDB_SKR/SemMedDB_download.html, the latest version is currently semmedVER43_R.

andrewsu commented 2 years ago

Note that the official version number also appears not to be unique: the file with the July 2022 update appears to have overwritten the Feb 2022 update at https://data.lhncbc.nlm.nih.gov/umls-restricted/ii/tools/SemRep_SemMedDB_SKR/semmedVER43_2022_R_PREDICATION.csv.gz. So for this example, I think build_version should be set to semmedVER43_2022, and build_date should be the date the file was downloaded.
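
As a rough illustration of the suggestion above, here is a minimal Python sketch that derives build_version from the downloaded file name and records the download time as build_date. The helper names (version_from_filename, build_metadata) and the regular expression are assumptions made for this example; they are not part of the BioThings SDK or the existing semmeddb parser, and the pattern only covers file names shaped like the one linked above.

```python
import re
from datetime import datetime, timezone

# Assumed file-name shape: semmedVER<NN>_<YYYY>_R_<TABLE>.csv.gz
# (based only on the single file name quoted in this issue).
_VERSION_RE = re.compile(r"(semmedVER\d+_\d{4})_R_")


def version_from_filename(filename: str) -> str:
    """Hypothetical helper: extract e.g. 'semmedVER43_2022' from a file name or URL."""
    match = _VERSION_RE.search(filename)
    if match is None:
        raise ValueError(f"no SemMedDB version found in {filename!r}")
    return match.group(1)


def build_metadata(filename: str, downloaded_at: datetime) -> dict:
    """Hypothetical helper: build the two /metadata fields discussed here.

    downloaded_at should be timezone-aware; it is normalized to UTC.
    """
    return {
        "build_version": version_from_filename(filename),
        "build_date": downloaded_at.astimezone(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    name = "semmedVER43_2022_R_PREDICATION.csv.gz"
    print(build_metadata(name, datetime.now(timezone.utc)))
```

Because the version string alone does not distinguish the Feb 2022 file from the July 2022 file, keeping the download timestamp in build_date is what lets two builds from the same official version be told apart.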